A Scala Corrections Library

Post on 08-Sep-2014


A number of examples why Scala is in my rear view mirror, and a few ideas on how to improve the collections.

Transcript of A Scala Corrections Library

A Scala Corrections Library

Paul Phillips

paulp@improving.org

Source: xkcd, of course.

“When I'm working on a problem, I never think about beauty. I think only how to solve the problem.”

“But when I have finished, if the solution is not beautiful, I know it is wrong.”

– R. Buckminster Fuller

(syntax highlighting donated by paulp)

“When I'm working on a problem, I never think about beauty. I think only how to solve the problem.”

“But when I have finished, if the solution is not beautiful, I know it is wrong.”

trait ParSeqViewLike[
  +T,
  +Coll <: Parallel,
  +CollSeq,
  +This <: ParSeqView[T, Coll, CollSeq] with ParSeqViewLike[T, Coll, CollSeq, This, ThisSeq],
  +ThisSeq <: SeqView[T, CollSeq] with SeqViewLike[T, CollSeq, ThisSeq]
] extends GenSeqView[T, Coll]
     with GenSeqViewLike[T, Coll, This]
     with ParIterableView[T, Coll, CollSeq]
     with ParIterableViewLike[T, Coll, CollSeq, This, ThisSeq]
     with ParSeq[T]
     with ParSeqLike[T, This, ThisSeq]

– R. Buckminster Fuller

The Winding Stairway

• Five years on scala

• Rooting for scala/typesafe

• But I quit a dream job...

• ...because I lost faith

Credentials

Credentials, cont.

Should you care?

• I offer my credentials only to bear witness to my credibility

• I suspect I have written more scala code than anyone else, ever.

• What’s visible in compiler/library represents only a small fraction of it

Caveats

• I ran out of time. Slides are rushed. Forgive me.

• Error messages and repl transcripts have been heavily trimmed for clarity on a slide

• This works counter to message when the point involves complexity or incomprehensibility

• So verbosify all compiler messages by a factor of three for a more accurate feel

My axe is dull

• I have been pulling my punches

• This has left some thinking that I quit over technical esoterica: java compatibility, jvm limitations, intractable compiler challenges

• This is not accurate

Subtext, people

• Prevailing programmer culture frowns upon criticism of named individuals

• In this case that doesn’t leave much room for additional specificity

• All the relevant facts are available in the googles

Is Scala too complex?

• I’ll field this one: YES

• Is anyone fooled by specious comparisons of language grammar size? Who cares?

• Half the time when someone hits a bug they can’t tell whether it is a bug in scala or the expected behavior

• That definitely includes me

• A meme is going around that scala is too complex

• Option A: Own it

• Option B: Address it

• Option C: Obscure it

Perceived Problem

Option C

// A fictional idealized version of the genuine method
def map[B](f: (A) ⇒ B): Map[B]

// The laughably labeled "full" signature
def map[B, That](f: ((A, B)) ⇒ B)
    (implicit bf: CanBuildFrom[Map[A, B], B, That]): That

Thus is born the “use case”

neither has any basis in reality!

// markers to distinguish Map's class type parameters
scala> class K ; class V
defined class K, V

scala> val host = typeOf[Map[K, V]]
host: Type = Map[K,V]

scala> val method = host member TermName("map")
method: Symbol = method map

// Correct signature for map has FOUR distinct identifiers
scala> method defStringSeenAs (host memberType method)
res0: String = \
  def map[B, That](f: ((K, V)) => B)
    (implicit bf: CBF[Map[K,V],B,That]): That

the true name of map

• Now you’re thinking “use case thing is a bug, big deal, bugs get fixed.” Do they?

• Surely as soon as it is known the documentation spins these fabrications, it will be addressed? If not fixed, at least it’ll be marked as inaccurate? Something?

• Nope! To this day it’s the same. Your time is worthless.

               map                                   “map”

Signature      def map[B](f: A => B): F[B]           def map[B, That](f: A => B)
                                                       (implicit bf: CanBuildFrom[Repr, B, That]): That

Elegance       Among the purest and most reusable    <-- Not this.
               abstractions known to computing
               science

Advantages     Can reason abstractly about code      Can map a BitSet to a BitSet
                                                     without typing “toBitSet”

Spokespicture

Slightly Caricatured

// Fancy, we get a Bitset back!
scala> BitSet(1, 2, 3) map (_.toString.toInt)
res0: BitSet = BitSet(1, 2, 3)

// Except…
scala> BitSet(1, 2, 3) map (_.toString) map (_.toInt)
res1: SortedSet[Int] = TreeSet(1, 2, 3)

// Um…
scala> (BitSet(1, 2, 3) map identity)(1)
<console>:21: error: type mismatch;
 found   : Int(1)
 required: scala.collection.generic.CanBuildFrom[scala.collection.immutable.BitSet,Int,?]
       (BitSet(1, 2, 3) map identity)(1)
                                      ^

The Bitset Gimmick

scala> def f[T](x: T) = (x, new Object)
f: [T](x: T)(T, Object)

scala> SortedSet(1 to 10: _*)
res0: SortedSet[Int] = TreeSet(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> SortedSet(1 to 10: _*) map (x => f(x)._1)
res1: SortedSet[Int] = TreeSet(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> SortedSet(1 to 10: _*) map f map (_._1)
res2: Set[Int] = Set(5, 10, 1, 6, 9, 2, 7, 3, 8, 4)

similarly

scala> val f: Int => Int = _ % 3
f: Int => Int = <function1>

scala> val g: Int => Int = _ => System.nanoTime % 1000000 toInt
g: Int => Int = <function1>

scala> Set(3, 6, 9) map f map g
res0: Set[Int] = Set(633000)

scala> Set(3, 6, 9) map (f andThen g)
res1: Set[Int] = Set(305000, 307000, 308000)

and in a similar vein
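The transcript above depends on System.nanoTime, so its numbers change on every run. A deterministic sketch of the same effect is to count how often the second function is invoked. The object name, the counter, and the `counted` helper are hypothetical and exist only for this illustration.

```scala
object MapFusionDemo {
  // Hypothetical counter, used only to observe how often g runs.
  var calls = 0

  val f: Int => Int = _ % 3
  val g: Int => Int = x => { calls += 1; x + 1 }

  // Resets the counter, evaluates the expression, and returns (result, number of g calls).
  def counted(body: => Set[Int]): (Set[Int], Int) = {
    calls = 0
    val result = body
    (result, calls)
  }

  // map f first collapses the three elements into one, so g sees a single element.
  val twoPasses = counted(Set(3, 6, 9).map(f).map(g))

  // The composed function is applied to each of the three original elements.
  val onePass = counted(Set(3, 6, 9).map(f andThen g))
}
```

With a pure g both expressions yield the same set, but g runs a different number of times. So `xs map f map g` and `xs map (f andThen g)` stop being interchangeable as soon as g has an observable effect.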

Java Interop: the cruelest joke

• It’s impossible to call scala’s map from java!

• See all the grotesque details at SI-4389

“I played with it until it got too tedious. I think the signatures work fine. What does not work is that the variances of CanBuildFrom cannot be modelled in Java, so types do not match. And it seems Java does not even let me override with a cast. So short answer: You can't call these things from Java because instead of declaration side variance you have only a broken wildcard system.”

– Martin Odersky

WONTFIX

• My time is running out and I can hear you saying…

• “Just give us a laundry list of collections issues”

• Okay, you asked for it (in my mind)

Lightning Round

• Implementation details infest everything

• And every detail is implementation-defined

• Capabilities should be designed around the laws of variance; instead variance checks are suppressed and key method contains is untyped

• Specificity rules render contravariance useless

• Implicit selection and type inference inextricably bound - so type inference is largely frozen because any change will break existing code

• Extreme pollution of base objects - all collections have “size: Int”, all Seqs have “apply”, etc.

• Bundling of concerns (e.g. invariant Set)

• Inheritance of implementation is the hammer for every nail…

• …yet “final” and “private”, critical for a hope of correctness under inheritance, are almost unknown

• Semantics discovered instead of designed

In Set(x) ++ Set(x), which x wins?

Can xs filter (_ => true) return xs?

Are defaults preserved across operations? Which operations? Is sortedness? Will views and Streams retain laziness when zipped?

assume the worst

scala> val m = Map(1 -> 2) withDefaultValue 10
m: Map[Int,Int] = Map(1 -> 2)

scala> m(1000)
res0: Int = 10

scala> (m map identity)(1000)
<console>:9: error: type mismatch;
 found   : Int(1000)
 required: CanBuildFrom[Map[Int,Int],(Int, Int),?]
       (m map identity)(1000)
                        ^

scala> m map identity apply 1000
java.util.NoSuchElementException: key not found: 1000
  at MapLike$class.default(MapLike.scala:228)

xs map identity

% find collection -name '*.scala' |\
    xargs egrep asInstanceOf | wc -l
556

types are for suckers

scala> val xs: Set[Int] = (1 to 3).view.map(x => x)(breakOut)

java.lang.ClassCastException: SeqViewLike$$anon$3 cannot be cast to immutable.Set

How could 556 casts ever go wrong

scala> Map[Int,Int]() withDefaultValue 123
res0: Map[Int,Int] = Map()

scala> res0 contains 55
res1: Boolean = false

scala> res0 get 55
res2: Option[Int] = None

scala> res0 apply 55
res3: Int = 123

get and apply trivially fall into disagreement

// WHY infer this utterly useless type?
scala> List(1, 2) ::: List(3, 4.0)
res0: List[AnyVal] = List(1, 2, 3.0, 4.0)

scala> PspList(1, 2) ::: PspList(3, 4.0)
<console>:23: error: type mismatch;
 found   : PspList[Int]
 required: PspList[Double]

Why is covariance such an object of worship? Types exist so we don’t have to live like this!

Type Inference + Variance

Abstracting over mutability

• An inherited implementation is ALWAYS wrong somewhere!

• Example: how do you write "drop" so it's reusable?

• In a mutable class, drop MUST NOT share, but in an immutable class, drop MUST share

• Half the overrides in collections exist to stave off the incorrectness which looms above. This is nuts.

• Not to mention “Map”, “Set”, etc. in three namespaces
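The sharing rule for drop can be observed directly in the standard library. This is a minimal sketch, assuming the usual behavior of immutable List (drop returns the existing tail cells) and mutable ArrayBuffer (drop returns a fresh buffer). The object and value names are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

object DropSharingDemo {
  // Immutable: sharing is safe, so drop can hand back the existing tail of the list.
  val xs = List(1, 2, 3)
  val sharesStructure: Boolean = xs.drop(1) eq xs.tail

  // Mutable: the result has to be an independent copy,
  // otherwise the later write to buf would show up in it.
  val buf = ArrayBuffer(1, 2, 3)
  val dropped = buf.drop(1)
  buf(1) = 99
  val copyUnaffected: Boolean = dropped == ArrayBuffer(2, 3)
}
```

A single inherited drop cannot satisfy both requirements, which is one reading of why so many overrides exist.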

% ack --no-filename 'def slice\(' src/library/

 1  override def slice(from: Int, until: Int): Iterator[A] =
 2  def slice(from: Int, until: Int): Iterator[A] = {
 3  def slice(from: Int, until: Int): Repr =
 4  def slice(from: Int, until: Int): Repr = {
 5  def slice(from: Int, until: Int): Repr = {
 6  def slice(start: Int): PagedSeq[T] = slice(start, UndeterminedEnd)
 7  def slice(unc_from: Int, unc_until: Int): Repr
 8  override /*IterableLike*/ def slice(from: Int, until: Int): Vector[A] =
 9  override /*TraversableLike*/ def slice(from: Int, until: Int): Repr = {
10  override def slice(_start: Int, _end: Int): PagedSeq[T] = {
11  override def slice(from1: Int, until1: Int): IterableSplitter[T] =
12  override def slice(from1: Int, until1: Int): SeqSplitter[T] =
13  override def slice(from: Int, until: Int) = {
14  override def slice(from: Int, until: Int) = {
15  override def slice(from: Int, until: Int): List[A] = {
16  override def slice(from: Int, until: Int): Repr = self.slice(from, until)
17  override def slice(from: Int, until: Int): Repr = {
18  override def slice(from: Int, until: Int): Stream[A] = {
19  override def slice(from: Int, until: Int): String = {
20  override def slice(from: Int, until: Int): This =
21  override def slice(from: Int, until: Int): This =
22  override def slice(from: Int, until: Int): Traversable[A]
23  override def slice(from: Int, until: Int): WrappedString = {
24  override def slice(unc_from: Int, unc_until: Int): Repr = {

How many ways are there to write ‘slice’ ?

scala.conflation

• Every collection must have size

• Every sequence must have apply

• Every call to map includes a "builder factory"

• Every set must be invariant

• Everything must suffer universal equality

One of these expressions returns 2 and one returns never. Feeling lucky?

scala> (Stream from 1) zip (Stream from 1) map { case (x, y) => x + y } head

scala> (Stream from 1, Stream from 1).zipped map (_ + _) head

predictability

Two complementary ways to define Set[A]. Complementary - and NOT the same thing!

sets

                      Intensional        Extensional
Specification         Membership test    Members
Variance              Set[-A]            Set[+A]
Defining Signature    A => Boolean       Iterable[A]
Size                  Unknowable         Known
Duplicates(*)         Meaningless        Disallowed
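The two columns can be written down as two separate traits, each with the variance the table assigns to it. This is only a sketch: the trait names, `members`, and the example values are hypothetical and do not come from the talk or from psp.

```scala
object TwoKindsOfSet {
  // Intensional: defined by a membership test, so A only appears as an input.
  trait IntensionalSet[-A] { def contains(elem: A): Boolean }

  // Extensional: defined by its members, so A only appears as an output.
  trait ExtensionalSet[+A] { def members: Iterable[A] }

  val everything: IntensionalSet[Any] =
    new IntensionalSet[Any] { def contains(elem: Any): Boolean = true }

  // Contravariance: a test that accepts Any can stand in for a test on Int.
  val everyInt: IntensionalSet[Int] = everything

  val small: ExtensionalSet[Int] =
    new ExtensionalSet[Int] { def members: Iterable[Int] = List(1, 2, 3) }

  // Covariance: a set of Int members is also a set of Any members.
  val smallAsAny: ExtensionalSet[Any] = small
}
```

A single trait that declares both `contains(elem: A)` and an `Iterable[A]` of members uses A as input and output at once, which leaves invariance as the only option.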

scala> class xs[A] extends Set[A]
error: class xs has 4 unimplemented members.

  // Intensional/extensional, conflated.
  // Any possibility of variance eliminated.
  def iterator: Iterator[A]
  def contains(elem: A): Boolean

  // What are these doing in the interface?
  // Why can I define a Seq without them?
  def -(elem: A): Set[A]
  def +(elem: A): Set[A]

What's going on here?

% git grep 'todo: also add' 607cb4250d
SynchronizedMap.scala: // !!! todo: also add all other methods

% git grep 'todo: also add' origin/master
SynchronizedMap.scala: // !!! todo: also add all other methods

commit 607cb4250d
Author: Martin Odersky <odersky@gmail.com>
Date: Mon May 25 15:18:48 2009 (4 years, 8 months ago)

    added SynchronizedMap; changed Set.put to Set.add, implemented LinkedHashMap/Set more efficiently.

todo: also add all other methods

tyranny of the interface

• Mandating "def size: Int" for all collections is the fast track to Glacialville!

• Countless times have I fixed xs.size != 0

• Collections get the worst of both worlds: every performance/termination trap, and no exploiting of size information

• A universal size method must be SAFE and CHEAP
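The cost of `xs.size != 0` can be made visible with a collection that counts how many elements it hands out. This is a minimal sketch; the `Counting` class and the value names are hypothetical, and it relies on the standard library's default `size` traversing the iterator while `nonEmpty` only checks `hasNext`.

```scala
object SizeCostDemo {
  // Hypothetical collection that records how many elements have been traversed.
  final class Counting(n: Int) extends Iterable[Int] {
    var steps = 0
    def iterator: Iterator[Int] = new Iterator[Int] {
      private var i = 0
      def hasNext: Boolean = i < n
      def next(): Int = { steps += 1; i += 1; i }
    }
  }

  // Asking for the size walks every element just to compare the count with zero.
  val viaSize = new Counting(1000)
  val sizeAnswer: Boolean = viaSize.size != 0

  // Asking for non-emptiness only needs to know whether a first element exists.
  val viaNonEmpty = new Counting(1000)
  val nonEmptyAnswer: Boolean = viaNonEmpty.nonEmpty
}
```

Both questions get the same answer, but only one of them pays for a full traversal. On an unbounded collection the first one never returns.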

Psp Collections

• So here is a little of what I would do differently

• I realized since agreeing to this talk that I may have to go cold turkey to escape scala’s orbit. It’s just too frustrating to use.

• Which means this may never go anywhere

• But you can have whatever gets done

trait Collections {
  type CC[+X]        // the overarching container type (in scala: any covariant collection, e.g. List, Vector)
  type Min[+X]       // least type constructor which can be reconstituted to CC[X] (scala: GenTraversableOnce)
  type Opt[+X]       // the container type for optional results (in scala: Option)
  type CCPair[+X]    // some representation of a divided CC[A] (at simplest, (CC[A], CC[A]))
  type ~>[-V1, +V2]  // some means of composing operations (at simplest, Function1)

  type Iso[A]            = CC[A] ~> CC[A]      // e.g. filter, take, drop, reverse, etc.
  type Map[-A, +B]       = CC[A] ~> CC[B]      // e.g. map, collect
  type FlatMap[-A, +B]   = CC[A] ~> Min[B]     // e.g. flatMap
  type Grouped[A, DD[X]] = CC[A] ~> CC[DD[A]]  // e.g. sliding
  type Fold[-A, +R]      = CC[A] ~> R          // e.g. fold, but also subsumes all operations on CC[A]
  type Flatten[A]        = CC[Min[A]] ~> CC[A] // e.g. flatten
  type Build[A]          = Min[A] ~> CC[A]     // for use in e.g. sliding, flatMap
  type Pure[A]           = A ~> CC[A]          // we may not need

  trait Relations[A] {
    type MapTo[+B]  = Map[A, B]       // an alias incorporating the known A
    type FoldTo[+R] = Fold[A, R]      // another one
    type This       = CC[A]           // the CC[A] under consideration
    type Twosome    = CCPair[A]       // a (CC[A], CC[A]) representation
    type Self       = Iso[A]          // a.k.a. CC[A] => CC[A], e.g. tail, filter, reverse
    type Select     = FoldTo[A]       // a.k.a. CC[A] => A, e.g. head, reduce, max
    type Find       = FoldTo[Opt[A]]  // a.k.a. CC[A] => Opt[A], e.g. find
    type Split      = FoldTo[Twosome] // a.k.a. CC[A] => (CC[A], CC[A]), e.g. partition, span
  }
}

Conceptual Integrity

“Do not multiply entities unnecessarily”

• mutable / immutable

• Seq / Set / Map

• parallel / sequential

• view / regular

24 Combinations!

Surface Area Reduced 96%

• A Set is a Seq without duplicates.

• A Map is a Set paired with a function K => V.

• A mutable collection has nothing useful in common with an immutable collection. Write your own mutable collections.

• If we can’t get sequential collections right, we have no hope of parallel collections. Write your own parallel collections.

• “Views” should be how it always works.

scala> def f(xs: Iterable[Int]) = xs.size
f: (xs: Iterable[Int])Int

// O(1)
scala> f(Set(1))
res0: Int = 1

// O(n)
scala> f(List(1))
res1: Int = 1

// O(NOES)
scala> f(Stream continually 1)
<ctrl-C>

predictability: size matters

          SizeInfo
          /      \
      Atomic    Bounded
      /    \
Infinite  Precise

Asking the right question
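The hierarchy above can be sketched as a small algebraic data type. Only the four names in the diagram come from the slide. The fields, the `lo`/`hi` bounds, and the two helper functions are assumptions made for this illustration and are not the psp implementation.

```scala
object SizeInfoSketch {
  sealed trait SizeInfo
  sealed trait Atomic extends SizeInfo
  case object Infinite extends Atomic
  final case class Precise(size: Long) extends Atomic
  // Assumed shape: a known lower bound and a possibly infinite upper bound.
  final case class Bounded(lo: Precise, hi: Atomic) extends SizeInfo

  // Answers "is it certainly non-empty?" without traversing any collection.
  def definitelyNonEmpty(info: SizeInfo): Boolean = info match {
    case Infinite       => true
    case Precise(n)     => n > 0
    case Bounded(lo, _) => lo.size > 0
  }

  // Size information after taking at most n elements (n assumed non-negative).
  def take(info: SizeInfo, n: Long): SizeInfo = info match {
    case Infinite   => Precise(n)
    case Precise(s) => Precise(math.min(s, n))
    case Bounded(lo, hi) =>
      val newLo = Precise(math.min(lo.size, n))
      val newHi = hi match {
        case Infinite   => Precise(n)
        case Precise(s) => Precise(math.min(s, n))
      }
      if (newLo == newHi) newLo else Bounded(newLo, newHi)
  }
}
```

Every question here is answered from the size information alone, so it stays safe and cheap even when the underlying collection is infinite.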

scala> val xs = Foreach from BigInt(1)
xs: psp.core.Foreach[BigInt] = unfold(1)(<function1>)

scala> xs.size
<console>:22: error: value size is not a member of psp.core.Foreach[BigInt]
       xs.size
          ^

scala> xs.sizeInfo
res0: psp.core.SizeInfo = <inf>

Don’t ask unanswerable questions (Unless you enjoy hearing lies)

scala> List(1, 2, 3) contains "1"
res0: Boolean = false

scala> PspList(1, 2, 3) contains "1"
<console>:23: error: type mismatch;
 found   : String("1")
 required: Int
       PspList(1, 2, 3) contains "1"
                                 ^

the joy of the invariant leaf
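The same effect can be reproduced with a tiny invariant wrapper. This is a sketch under stated assumptions: `InvList` is a hypothetical stand-in written for this example, not the PspList shown in the transcript.

```scala
object InvariantLeafDemo {
  // Hypothetical invariant list: A is fixed, so contains can demand an A.
  final class InvList[A] private (elems: List[A]) {
    def contains(x: A): Boolean = elems.contains(x)
  }
  object InvList {
    def apply[A](xs: A*): InvList[A] = new InvList(xs.toList)
  }

  val hit: Boolean  = InvList(1, 2, 3).contains(2)
  val miss: Boolean = InvList(1, 2, 3).contains(4)

  // The next line is rejected by the compiler with a type mismatch
  // (found String, required Int), so it is left commented out:
  // InvList(1, 2, 3).contains("1")
}
```

Because the element type cannot widen, the nonsensical question is caught at compile time instead of quietly answering false.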

scala> "abc" map (_.toInt.toChar)
res1: String = abc

scala> "abc" map (_.toInt) map (_.toChar)
res2: IndexedSeq[Char] = Vector(a, b, c)

// psp to the rescue
scala> "abc".m map (_.toInt) map (_.toChar)
res3: psp.core.View[String,Char] = view of abc

scala> res3.force
res4: String = abc

HEY! MAP NEED NOT BE RUINED!