Map, Flatmap and Reduce are Your New Best Friends: Simpler Collections, Concurrency, and Big Data...

  • Date posted: 27-Aug-2014
  • Category: Software

description

Higher-order functions such as map(), flatmap(), filter() and reduce() have their origins in mathematics and ancient functional programming languages such as Lisp. But today they have entered the mainstream and are available in languages such as JavaScript, Scala and Java 8. They are well on their way to becoming an essential part of every developer’s toolbox. In this talk you will learn how these and other higher-order functions enable you to write simple, expressive and concise code that solves problems in a diverse set of domains. We will describe how you use them to process collections in Java and Scala. You will learn how functional Futures and Rx (Reactive Extensions) Observables simplify concurrent code. We will even talk about how to write big data applications in a functional style using libraries such as Scalding.

Transcript of Map, Flatmap and Reduce are Your New Best Friends: Simpler Collections, Concurrency, and Big Data...

  • Map(), flatMap() and reduce() are your new best friends: simpler collections, concurrency, and big data. Chris Richardson. Author of POJOs in Action. Founder of the original CloudFoundry.com. @crichardson chris@chrisrichardson.net http://plainoldobjects.com
  • Presentation goal: How functional programming simplifies your code. Show that map(), flatMap() and reduce() are remarkably versatile functions.
  • About Chris: Founder of a buzzword-compliant (stealthy, social, mobile, big data, machine learning, ...) startup. Consultant helping organizations improve how they architect and deploy applications using cloud, microservices, polyglot applications, NoSQL, ...
  • Agenda: Why functional programming? Simplifying collection processing. Simplifying concurrency with Futures and Rx Observables. Tackling big data problems with functional programming.
  • Functional programming is a programming paradigm: Functions are the building blocks of the application. Best done in a functional programming language.
  • Functions as first-class citizens: Assign functions to variables. Store functions in fields. Use and write higher-order functions: pass functions as arguments, return functions as values.
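A minimal Java 8 sketch of the points on this slide: a function stored in a field, a function assigned to a local variable, and a higher-order function that both takes and returns a function. The class name `FirstClassFunctions` and the helper `twice` are illustrative names, not taken from the talk.

```java
import java.util.function.Function;

final class FirstClassFunctions {
    // A function stored in a field.
    static final Function<Integer, Integer> SQUARE = x -> x * x;

    // Higher-order function: accepts a function and returns a new function
    // that applies it two times in a row.
    static <T> Function<T, T> twice(Function<T, T> f) {
        return f.andThen(f);
    }

    public static void main(String[] args) {
        // A function assigned to a local variable.
        Function<Integer, Integer> addOne = x -> x + 1;
        System.out.println(twice(addOne).apply(3)); // (3 + 1) + 1
        System.out.println(twice(SQUARE).apply(3)); // (3 * 3) squared
    }
}
```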
  • Avoids mutable state. Use: immutable data structures, single-assignment variables. Some functional languages such as Haskell don't allow side effects.
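One way this can look in plain Java, as a sketch only: a class whose single field is assigned once and whose "modification" method returns a new value instead of changing the existing one. `ImmutableTags` and its methods are hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class ImmutableTags {
    // Assigned exactly once, in the constructor.
    private final Set<String> tags;

    ImmutableTags(Set<String> tags) {
        // Defensive copy wrapped in a read-only view, so later changes
        // to the caller's set cannot leak in.
        this.tags = Collections.unmodifiableSet(new HashSet<>(tags));
    }

    Set<String> tags() {
        return tags;
    }

    // Returns a new instance; the receiver is left unchanged.
    ImmutableTags with(String tag) {
        Set<String> copy = new HashSet<>(tags);
        copy.add(tag);
        return new ImmutableTags(copy);
    }
}
```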
  • Why functional programming? "the highest goal of programming-language design to enable good ideas to be elegantly expressed" http://en.wikipedia.org/wiki/Tony_Hoare
  • Why functional programming? More expressive. More intuitive: declarative code matches problem definition. Functional code is usually much more composable. Immutable state: less error-prone, easy parallelization and concurrency. But be pragmatic.
  • An ancient idea that has recently become popular
  • Mathematical foundation: λ-calculus. Introduced by Alonzo Church in the 1930s.
  • Lisp = an early functional language invented in 1958 http://en.wikipedia.org/wiki/Lisp_(programming_language) (Timeline 1940-2010: garbage collection, dynamic typing, self-hosting compiler, tree data structures) (defun factorial (n) (if ( …
  • … x * x x -> { for (int i = 2; i <= Math.sqrt(x); i = i + 1) { if (x % i == 0) return false; } return true; }; (x, y) -> x * x + y * y An instance of an anonymous inner class that implements a functional interface (kinda)
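The lambda forms on this slide can be tried out roughly as follows. This is a sketch under assumptions: the slide does not say which functional interfaces the lambdas are assigned to, so `IntPredicate` and `IntBinaryOperator` are choices made here, and treating numbers below 2 as non-prime is likewise an assumption. The second predicate is written as an explicit anonymous inner class to show what the lambda stands in for.

```java
import java.util.function.IntBinaryOperator;
import java.util.function.IntPredicate;

final class LambdaForms {
    // Block-bodied lambda: trial division by every i with i * i <= x.
    static final IntPredicate IS_PRIME = x -> {
        if (x < 2) return false; // assumption: 0, 1 and negatives are not prime
        for (int i = 2; (long) i * i <= x; i++) {
            if (x % i == 0) return false;
        }
        return true;
    };

    // The same predicate as an anonymous inner class implementing the
    // functional interface explicitly; it delegates to the lambda above.
    static final IntPredicate IS_PRIME_CLASSIC = new IntPredicate() {
        @Override
        public boolean test(int x) {
            return IS_PRIME.test(x);
        }
    };

    // Expression-bodied lambda with two parameters.
    static final IntBinaryOperator SUM_OF_SQUARES = (x, y) -> x * x + y * y;
}
```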
  • Agenda: Why functional programming? Simplifying collection processing. Simplifying concurrency with Futures and Rx Observables. Tackling big data problems with functional programming.
  • Lots of application code = collection processing: mapping, filtering, and reducing
  • Social network example: public class Person { enum Gender { MALE, FEMALE } private Name name; private LocalDate birthday; private Gender gender; private Hometown hometown; private Set<Friend> friends = new HashSet<>(); .... public class Friend { private Person friend; private LocalDate becameFriends; ... } public class SocialNetwork { private Set<Person> people; ...
  • Typical iterative code - e.g. filtering: public class SocialNetwork { private Set<Person> people; ... public Set<Person> lonelyPeople() { Set<Person> result = new HashSet<>(); for (Person p : people) { if (p.getFriends().isEmpty()) result.add(p); } return result; } Callouts: declare result variable; iterate; modify result; return result.
  • Problems with this style of programming: Low level. Imperative (how to do it), NOT declarative (what to do). Verbose. Mutable variables are potentially error-prone. Difficult to parallelize.
  • Java 8 streams to the rescue: A sequence of elements. Wrapper around a collection (and other types: e.g. JarFile.stream(), Files.lines()). Streams can also be infinite. Provides a functional/lambda-based API for transforming, filtering and aggregating elements. Much simpler, cleaner and declarative code.
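To make the "streams can also be infinite" point concrete, here is a small sketch. `Stream.iterate` produces elements lazily, so the pipeline only computes as many values as `limit` asks for. The class and method names are illustrative.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class InfiniteStreams {
    // Returns the first n even square numbers, drawn from an unbounded stream.
    static List<Integer> firstEvenSquares(int n) {
        return Stream.iterate(1, i -> i + 1)  // 1, 2, 3, ... generated on demand
                .map(i -> i * i)              // transform
                .filter(sq -> sq % 2 == 0)    // filter
                .limit(n)                     // makes the pipeline finite
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3)); // prints [4, 16, 36]
    }
}
```

Without the `limit` call the terminal operation would never finish, which is the main thing to watch for with unbounded sources.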
  • Using Java 8 streams - filtering. Before: public class SocialNetwork { private Set<Person> people; ... public Set<Person> peopleWithNoFriends() { Set<Person> result = new HashSet<>(); for (Person p : people) { if (p.getFriends().isEmpty()) result.add(p); } return result; } After: public class SocialNetwork { private Set<Person> people; ... public Set<Person> lonelyPeople() { return people.stream() .filter(p -> p.getFriends().isEmpty()) .collect(Collectors.toSet()); } Callout: predicate lambda expression.
  • The filter() function: s1 = a b c d e ...; s2 = a c d ...; s2 = s1.filter(f). s2 contains the elements that satisfy predicate f.
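The diagram can be reproduced with a few lines of Java. This sketch uses a list instead of a set so that the preserved order is visible; `keep` is a hypothetical helper name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class FilterSketch {
    // s2 = s1.filter(f): keeps the elements for which f returns true,
    // in their original order.
    static <T> List<T> keep(List<T> s1, Predicate<? super T> f) {
        return s1.stream().filter(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> s1 = Arrays.asList("a", "b", "c", "d", "e");
        // Predicate chosen to match the slide: everything except b and e.
        System.out.println(keep(s1, s -> !s.equals("b") && !s.equals("e"))); // prints [a, c, d]
    }
}
```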
  • Using Java 8 streams - mapping: class Person .. private Set<Friend> friends = ...; public Set<Hometown> hometownsOfFriends() { return friends.stream() .map(f -> f.getPerson().getHometown()) .collect(Collectors.toSet()); }
  • The map() function: s1 = a b c d e ...; s2 = f(a) f(b) f(c) f(d) f(e) ...; s2 = s1.map(f)
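As a small illustration of the diagram, the sketch below maps each string to its length. The result has exactly one output element per input element, in the same order. The names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class MapSketch {
    // s2 = s1.map(f) with f = String::length.
    static List<Integer> lengths(List<String> words) {
        return words.stream().map(String::length).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("map", "flatMap", "reduce"))); // prints [3, 7, 6]
    }
}
```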
  • Using Java 8 streams - friend of friends using flatMap: class Person .. public Set<Person> friendOfFriends() { return friends.stream() .flatMap(friend -> friend.getPerson().friends.stream()) .map(Friend::getPerson) .filter(f -> f != this) .collect(Collectors.toSet()); } Callout: maps and flattens.
  • The flatMap() function: s1 = a b ...; s2 = f(a)0 f(a)1 f(b)0 f(b)1 f(b)2 ...; s2 = s1.flatMap(f)
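A self-contained variant of the friend-of-friends idea, as a sketch: instead of the talk's `Person` and `Friend` classes it assumes the network is given as a map from a person's name to the names of their friends. Each friend is mapped to a stream of their own friends, and `flatMap` flattens those streams into one.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class FlatMapSketch {
    // friendsOf is a hypothetical adjacency map: name -> names of friends.
    static Set<String> friendsOfFriends(Map<String, Set<String>> friendsOf, String person) {
        return friendsOf.getOrDefault(person, Collections.emptySet()).stream()
                // One stream per friend, flattened into a single stream.
                .flatMap(friend -> friendsOf.getOrDefault(friend, Collections.emptySet()).stream())
                // Exclude the starting person, as the slide does with f != this.
                .filter(name -> !name.equals(person))
                .collect(Collectors.toSet());
    }
}
```

With `map` in place of `flatMap` the intermediate result would be a stream of streams, which is why the flattening step is needed.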
  • Using Java 8 streams - reducing: public class SocialNetwork { private Set<Person> people; ... public long averageNumberOfFriends() { return people.stream() .map ( p -> p.getFriends().size() ) .reduce(0, (x, y) -> x + y) / people.size(); } The reduce step is equivalent to: int x = 0; for (int y : inputStream) x = x + y; return x;
  • The reduce() function: s1 = a b c d e ...; x = s1.reduce(initial, f) = f(f(f(f(f(f(initial, a), b), c), d), e), ...)
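The following sketch shows the same fold on a plain list of friend counts. It also includes an alternative for the average: dividing an integer sum by an integer size truncates, whereas `IntStream.average()` returns an `OptionalDouble` and handles the empty case explicitly. The class and method names are illustrative, and returning 0.0 for an empty list is an assumption made here.

```java
import java.util.Arrays;
import java.util.List;

final class ReduceSketch {
    // reduce(initial, f) with initial = 0 and f = addition.
    static int totalFriends(List<Integer> friendCounts) {
        return friendCounts.stream().reduce(0, (x, y) -> x + y);
    }

    // Average without integer truncation; 0.0 for an empty list (assumption).
    static double averageFriends(List<Integer> friendCounts) {
        return friendCounts.stream()
                .mapToInt(Integer::intValue)
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(3, 0, 2, 5);
        System.out.println(totalFriends(counts));   // prints 10
        System.out.println(averageFriends(counts)); // prints 2.5
    }
}
```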
  • Adopting FP with Java 8 is straightforward: Simply start using streams and lambdas. Eclipse can refactor anonymous inner classes to lambdas.
  • Agenda: Why functional programming? Simplifying collection processing. Simplifying concurrency with Futures and Rx Observables. Tackling big data problems with functional programming.
  • Let's imagine that you are writing code to display the products in a user's wish list
  • The need for concurrency: Step #1: web service request to get the user profile including wish list (list of product IDs). Step #2: for each productId, web service request to get product info. But getting products sequentially = terrible response time. Need to fetch productInfo concurrently. Composing sequential + scatter/gather-style operations is very common.
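The two steps above can be sketched with Java 8's `CompletableFuture`. This is a sketch under stated assumptions, not the talk's implementation: `fetchWishList` and `fetchProductInfo` are hypothetical stand-ins for the two web service calls and simply return canned values asynchronously.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class WishListSketch {
    // Hypothetical step #1: returns the product IDs on the user's wish list.
    static CompletableFuture<List<String>> fetchWishList(String userId) {
        return CompletableFuture.supplyAsync(() -> Arrays.asList("p1", "p2", "p3"));
    }

    // Hypothetical step #2: returns the product info for one product ID.
    static CompletableFuture<String> fetchProductInfo(String productId) {
        return CompletableFuture.supplyAsync(() -> "info:" + productId);
    }

    static CompletableFuture<List<String>> wishListProducts(String userId) {
        return fetchWishList(userId).thenCompose(ids -> {
            // Scatter: start one request per product ID without waiting.
            List<CompletableFuture<String>> pending = ids.stream()
                    .map(WishListSketch::fetchProductInfo)
                    .collect(Collectors.toList());
            // Gather: complete once every request has completed, then collect
            // the results in wish-list order. join() does not block here
            // because allOf guarantees the futures are already done.
            return CompletableFuture.allOf(pending.toArray(new CompletableFuture<?>[0]))
                    .thenApply(done -> pending.stream()
                            .map(CompletableFuture::join)
                            .collect(Collectors.toList()));
        });
    }
}
```

The sequential part (wish list first, products second) is expressed with `thenCompose`, and the concurrent part with a list of futures combined by `allOf`.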
  • Futures are a great abstraction for composing concurrent operations http://en.wikipedia.org/wiki/Futures_and_promises
  • Composition with futures (diagram): a client on the main thread initiates asynchronous operations 1 and 2, which run on a worker thread or in event-driven code. Each operation sets its outcome on a future (Future 1, Future 2), and the client gets the outcomes from the futures.
  • But composition with basic futures is difficult: Java 7 future.get([timeout]) is a blocking API, so the client blocks a thread, and it is difficult to compose multiple concurrent operations. Futures with callbacks (e.g. Guava ListenableFutures, Spring 4 ListenableFuture): attach callbacks to all futures and asynchronously consume outcomes. But callback-based code = messy code. See http://techblog.netflix.com/2013/02/rxjava-netflix-api.html We need functional futures!
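For comparison with blocking `get` calls and callbacks, here is a sketch of the functional style using Java 8's `CompletableFuture`, where `thenApply` plays the role of map and `thenCompose` the role of flatMap. The names `asyncPlus` and `asyncSquare` follow the talk's Scala example; their bodies here are assumptions that just compute the value on another thread.

```java
import java.util.concurrent.CompletableFuture;

final class FunctionalFutures {
    static CompletableFuture<Integer> asyncPlus(int x, int y) {
        return CompletableFuture.supplyAsync(() -> x + y);
    }

    static CompletableFuture<Integer> asyncSquare(int x) {
        return CompletableFuture.supplyAsync(() -> x * x);
    }

    // map: transform the eventual result with an ordinary function.
    static CompletableFuture<Integer> plusThenTriple(int x, int y) {
        return asyncPlus(x, y).thenApply(sum -> sum * 3);
    }

    // flatMap: chain a second asynchronous operation onto the first.
    static CompletableFuture<Integer> plusThenSquare(int x, int y) {
        return asyncPlus(x, y).thenCompose(FunctionalFutures::asyncSquare);
    }

    public static void main(String[] args) {
        // join() blocks; it is used here only to print the final results.
        System.out.println(plusThenTriple(4, 5).join()); // prints 27
        System.out.println(plusThenSquare(5, 8).join()); // prints 169
    }
}
```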
  • Functional futures - Scala, Java 8 CompletableFuture def asyncPlus(x : Int, y : Int) : Future[Int] = ... x + y ... val future2 = asyncPlus(4, 5).map{ _ * 3 } assertEquals(27, Await.result(future2, 1 second)) Asynchronously transforms future def asyncSquare(x : Int) : Future[Int] = ... x * x ... val f2 = asyncPlus(5, 8).flatMap { x => asyncSquare(x) } assertEquals(169, Aw