Clojure Reducers / clj-syd Aug 2012

Reducers: A library and model for collection processing in Clojure
Leonardo Borges, @leonardo_borges
http://www.leonardoborges.com
http://www.thoughtworks.com
Thursday, 30 August 12

Talk given at the Sydney Clojure User group, August 2012

Transcript of Clojure Reducers / clj-syd Aug 2012

Page 1: Clojure Reducers / clj-syd Aug 2012

Reducers: A library and model for collection processing in Clojure

Leonardo Borges
@leonardo_borges
http://www.leonardoborges.com
http://www.thoughtworks.com

Thursday, 30 August 12

Page 2: Clojure Reducers / clj-syd Aug 2012

...in 20 mins or less

Page 6: Clojure Reducers / clj-syd Aug 2012

Reducers huh? Here’s the gist

You get parallel versions of reduce, map and filter

Ta-da! I’m done!

and well under my 20 min limit :)

Page 7: Clojure Reducers / clj-syd Aug 2012

Alright, alright, I’m kidding

Page 9: Clojure Reducers / clj-syd Aug 2012

How do reducers make parallelism possible?

• JVM’s Fork/Join framework
• Reduction Transformers

Page 10: Clojure Reducers / clj-syd Aug 2012

Before we start - this is bleeding edge stuff

Java requirements

• Fork/Join framework
  • Java 7 [1] or
  • Java 6 + the JSR166 jar [2]

Clojure requirements

• 1.5.0-* (this is still MASTER on GitHub [3] as of 30/08/2012)

[1] - http://jdk7.java.net/
[2] - http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166.jar
[3] - https://github.com/clojure/clojure

Page 18: Clojure Reducers / clj-syd Aug 2012

The Fork/Join Framework

• Based on divide and conquer
• Work-stealing algorithm
• Uses deques (double-ended queues)
• Progressively divides the workload into tasks, up to a threshold
• Once a worker finishes one task, it pops another one from its deque
• After at least two tasks have finished, results can be combined/joined
• Idle workers can pop tasks from the deques of workers which fall behind
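
The divide, fork and combine steps listed above can be sketched directly against the JVM classes. This is only an illustration, not how the reducers library itself is written: the names fj-sum and parallel-sum and the threshold argument are made up for this example, and it assumes a JVM that ships java.util.concurrent.ForkJoinPool (Java 7 or later).

```clojure
(import '(java.util.concurrent Callable ForkJoinPool ForkJoinTask))

(defn fj-sum
  "Sums the slice [lo, hi) of vector v. Slices larger than `threshold`
   are halved; the right half is forked, the left half is computed in
   the current worker, and the two partial sums are combined.
   Hypothetical helper, meant to be called from inside a ForkJoinPool."
  [v lo hi threshold]
  (if (<= (- hi lo) threshold)
    ;; small enough: plain sequential reduce over the slice
    (reduce + 0 (subvec v lo hi))
    (let [mid (quot (+ lo hi) 2)
          ^Callable right-half (fn [] (fj-sum v mid hi threshold))
          ^ForkJoinTask right (ForkJoinTask/adapt right-half)]
      ;; push the right half onto this worker's deque (it may be stolen)
      (.fork right)
      ;; compute the left half here, then join and combine
      (+ (fj-sum v lo mid threshold)
         (.join right)))))

(defn parallel-sum
  "Runs fj-sum over the whole vector in a fresh pool (hypothetical helper)."
  [v threshold]
  (let [pool (ForkJoinPool.)
        ^Callable job (fn [] (fj-sum v 0 (count v) threshold))]
    (try
      (.invoke pool (ForkJoinTask/adapt job))
      (finally
        (.shutdown pool)))))

(parallel-sum (vec (range 1000)) 64) ;; 499500
```

The threshold plays the role of the "up to a threshold" bullet: below it, splitting further would cost more than it saves.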

Page 19: Clojure Reducers / clj-syd Aug 2012

Text is boring

Pages 20-42: Clojure Reducers / clj-syd Aug 2012

Fork/Join algorithm - simplified view

[Animated diagram spread over slides 20 to 42. Captions, in order: Workload is put in “deques” ...and progressively halved ...up to a configured threshold. Worker 1 and Worker 2 then process their tasks and combine partial results step by step. Idle workers can “steal” items from other workers. The last combine step produces the final result.]

Page 45: Clojure Reducers / clj-syd Aug 2012

Let’s talk about Reducers

Motivations

• Performance
  • via less allocation
  • via parallelism (leverage Fork/Join)

Issues

• Lists and Seqs are sequential
• map / filter implies order

Page 53: Clojure Reducers / clj-syd Aug 2012

A closer look at what map does

;; a naive map implementation
(defn map [f coll]
  (if (seq coll)
    (cons (f (first coll))
          (map f (rest coll)))
    '()))

• Recursion
• Order
• Laziness (not shown)
• Consumes List
• Builds List

Oh, and it also applies the function to each item before putting the result into the new list

This is what mapping means!

Page 57: Clojure Reducers / clj-syd Aug 2012

Reduction Transformers

• Idea is to build map / filter on top of reduce to break from sequentiality
• map / filter then builds nothing and consumes nothing
• It changes what reduce means to the collection by transforming the reducing functions

Page 58: Clojure Reducers / clj-syd Aug 2012

What map is really all about

(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))
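
The same trick works for filter. The sketch below repeats mapping from the slide and adds a filtering counterpart; filtering is not shown in the talk and is written here by analogy, so treat it as an illustration rather than the library's source.

```clojure
;; mapping, as on the slide: transforms a reducing function f1 so that
;; every input is passed through f before f1 sees it
(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

;; filtering, by analogy (assumed, not from the slides): inputs that
;; fail pred are skipped and the accumulated result is returned as is
(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)
        result))))

;; transformers compose: keep the even inputs, increment them, sum them
(reduce ((filtering even?) ((mapping inc) +)) 0 [1 2 3 4]) ;; 8
```

Neither transformer builds or consumes a collection; both only wrap the reducing function that reduce will eventually call.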

Page 59: Clojure Reducers / clj-syd Aug 2012

But wait! If map doesn’t consume the list any longer, who does?

• reduce does!
• Since Clojure 1.4, reduce lets the collection reduce itself (through the CollReduce protocol; fold does the same through CollFold)
• Think of what this means for tree-like structures such as vectors
• This is key to leveraging the Fork/Join framework
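
To make "the collection reduces itself" concrete, here is a small sketch of a value that holds no elements at all and still works with reduce, because it implements clojure.core.protocols/CollReduce. The name counting-source is made up for this example, and the sketch ignores early termination via reduced.

```clojure
(require '[clojure.core.protocols :as p])

(defn counting-source
  "Returns a reducible that yields 0, 1, ..., n-1 without ever building
   a list (hypothetical example)."
  [n]
  (reify p/CollReduce
    ;; no initial value given: ask the reducing function for one, (f)
    (coll-reduce [this f]
      (p/coll-reduce this f (f)))
    ;; the source decides how to walk itself; here, a plain loop
    (coll-reduce [_ f init]
      (loop [acc init
             i   0]
        (if (< i n)
          (recur (f acc i) (inc i))
          acc)))))

(reduce + 0 (counting-source 5))     ;; 10
(reduce conj [] (counting-source 3)) ;; [0 1 2]
```

A vector can apply the same idea to its internal tree, which is what lets fold hand separate subtrees to separate Fork/Join tasks.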

Page 61: Clojure Reducers / clj-syd Aug 2012

Now we can use mapping to create reducing functions

(reduce ((mapping inc) +) 0 [1 2 3 4]) ;; 14

;; ((mapping inc) +) behaves like this reducing function:
(fn [result input] (+ result (inc input)))

Page 64: Clojure Reducers / clj-syd Aug 2012

Now we can use mapping to create reducing functions

(reduce ((mapping inc) conj) [] [1 2 3 4]) ;; [2 3 4 5]

;; ((mapping inc) conj) behaves like this reducing function:
(fn [result input] (conj result (inc input)))

But it feels awkward to use it in this form
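
The reducers library hides that awkwardness: r/map and r/filter take a collection like their clojure.core namesakes, but return something reducible instead of a lazy seq. A short sketch, assuming Clojure 1.5 or later so that clojure.core.reducers is available:

```clojure
(require '[clojure.core.reducers :as r])

;; same results as the hand-rolled ((mapping inc) ...) calls
(reduce + 0 (r/map inc [1 2 3 4]))               ;; 14
(into [] (r/map inc [1 2 3 4]))                  ;; [2 3 4 5]

;; reducers nest like ordinary map / filter calls
(into [] (r/filter even? (r/map inc [1 2 3 4]))) ;; [2 4]
```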

Page 65: Clojure Reducers / clj-syd Aug 2012

What do we have so far?

• Performance has been improved due to fewer allocations
• No intermediary lists need to be built (see Haskell’s Stream Fusion [4])
• However, reduce is still sequential

[4] - http://bit.ly/streamFusion

Page 73: Clojure Reducers / clj-syd Aug 2012

Enter fold

• Takes the sequentiality out of foldl, foldr and reduce
• Potentially parallel (falls back to standard reduce otherwise)
• Reduce/Combine strategy (think Fork/Join Framework)
• Segments the collection
• Runs multiple reduces in parallel
• Uses a combining function to join/reduce results

(defn fold [combinef reducef coll] ...)
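
A small sketch of how the two functions divide the work, assuming Clojure 1.5 or later. The reducing function count-even and the vector numbers are made up for this example; the optional leading number is the segment size below which fold stops splitting (512 when omitted).

```clojure
(require '[clojure.core.reducers :as r])

(def numbers (vec (range 10000)))

;; reducing function: runs sequentially inside each segment
(defn count-even [acc x]
  (if (even? x) (inc acc) acc))

;; combining function: + merges the per-segment counts, and (+) => 0
;; seeds each segment
(r/fold + count-even numbers)      ;; 5000

;; same computation with an explicit segment size of 1000 elements
(r/fold 1000 + count-even numbers) ;; 5000
```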

Pages 75-77: Clojure Reducers / clj-syd Aug 2012

The combining function is a monoid

• An associative binary function with an identity element
• All the following functions are equivalent monoids

;; clojure.core's +
(+ 2 3) ; 5
(+)     ; 0

;; written by hand
(defn my-+
  ([] 0)
  ([a b] (+ a b)))

(my-+ 2 3) ; 5
(my-+)     ; 0

;; built with r/monoid
(require '[clojure.core.reducers :as r])

(def my-+ (r/monoid + (fn [] 0)))

(my-+ 2 3) ; 5
(my-+)     ; 0
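
Monoids are not limited to numbers. As a further sketch (the name cat-vectors is made up for this example), vector concatenation with the empty vector as identity also qualifies, which lets fold build a vector in parallel:

```clojure
(require '[clojure.core.reducers :as r])

;; into joins two vectors; vector called with no arguments gives []
(def cat-vectors (r/monoid into vector))

(cat-vectors)             ;; []
(cat-vectors [1 2] [3 4]) ;; [1 2 3 4]

;; conj fills each segment's vector, cat-vectors joins the segments
(r/fold cat-vectors conj (r/map inc (vec (range 10))))
;; [1 2 3 4 5 6 7 8 9 10]
```

Because the segments of a vector are combined left to right, the result keeps the input order even when the work is split.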

Page 78: Clojure Reducers / clj-syd Aug 2012

fold by examples

;; all examples assume the reducers library is available as r
(ns reducers-playground.core
  (:require [clojure.core.reducers :as r]))

Page 87: Clojure Reducers / clj-syd Aug 2012

fold by examples: increment all even positive integers up to 10 million and add them all up

;; these were taken from Rich’s reducers talk
(def my-vector (into [] (range 10000000)))

(time (reduce + (map inc (filter even? my-vector))))     ;; 500 msecs

(time (reduce + (r/map inc (r/filter even? my-vector)))) ;; 260 msecs

(time (r/fold + (r/map inc (r/filter even? my-vector)))) ;; 130 msecs
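
The timings depend on the machine, but the three pipelines are easy to check for agreement. A quick sketch on a smaller vector (small-vector is a name made up for this example; Clojure 1.5 or later assumed):

```clojure
(require '[clojure.core.reducers :as r])

(def small-vector (into [] (range 1000)))

;; lazy seqs, reducers with reduce, reducers with fold: same answer
(def seq-result    (reduce + (map inc (filter even? small-vector))))
(def reduce-result (reduce + (r/map inc (r/filter even? small-vector))))
(def fold-result   (r/fold + (r/map inc (r/filter even? small-vector))))

(= seq-result reduce-result fold-result) ;; true
```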

Page 89: Clojure Reducers / clj-syd Aug 2012

fold by examples: standard word count

(def wiki-dump (slurp "subset-wiki-dump50")) ; 50 MB

(defn count-words [text]
  (reduce
    (fn [memo word]
      (assoc memo word (inc (get memo word 0))))
    {}
    (map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

(time (count-words wiki-dump)) ;; 45 secs

Pages 90-95: Clojure Reducers / clj-syd Aug 2012

fold by examples: parallel word count

(def wiki-dump (slurp "subset-wiki-dump50")) ; 50 MB

(defn p-count-words [text]
  (r/fold
    (r/monoid (partial merge-with +) hash-map)
    (fn [memo word]
      (assoc memo word (inc (get memo word 0))))
    (r/map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

Combining fn: (r/monoid (partial merge-with +) hash-map)

• (partial merge-with +) will be called at the leaves to merge the partial computations
• hash-map will be called with no arguments to provide a seed value

(time (p-count-words wiki-dump)) ;; 30 secs
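
The wiki dump is not included with the slides, so here is the same function run on an inline string. The only change from the slide is a String type hint on the lower-casing function, added here as an assumption to avoid a reflective call; the input sentence is made up.

```clojure
(require '[clojure.core.reducers :as r])

(defn p-count-words [text]
  (r/fold
    ;; combining fn: merges two partial count maps; (hash-map) => {} seeds a segment
    (r/monoid (partial merge-with +) hash-map)
    ;; reducing fn: bumps the count of one word within a segment
    (fn [memo word]
      (assoc memo word (inc (get memo word 0))))
    (r/map (fn [^String w] (.toLowerCase w))
           (into [] (re-seq #"\w+" text)))))

(p-count-words "The cat and the hat")
;; {"the" 2, "cat" 1, "and" 1, "hat" 1}
```

On an input this small fold never splits the vector, so it behaves like a plain reduce; the parallel path only kicks in once the vector is larger than the segment size.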

Page 96: Clojure Reducers / clj-syd Aug 2012

fold by examples: Load 100k records into PostgreSQL

(def records
  (into [] (line-seq (BufferedReader. (FileReader. "dump.txt")))))

Page 98: Clojure Reducers / clj-syd Aug 2012

fold by examples: Load 100k records into PostgreSQL

(time
  (doseq [record records]
    (let [tokens (clojure.string/split record #"\t")]
      (insert users/users
        (values {:account-id (nth tokens 0)
                 ...})))))

;; 90 secs

Page 100: Clojure Reducers / clj-syd Aug 2012

fold by examples: Load 100k records into PostgreSQL in parallel

(time
  (r/fold
    +
    (r/map
      (fn [record]
        (let [tokens (clojure.string/split record #"\t")]
          (do
            (insert users/users
              (values {:account-id (nth tokens 0)
                       ...}))
            1)))
      records)))

;; 50 secs
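
The shape of that example can be tried without a database. In the sketch below an atom stands in for the table: insert-record!, inserted and the generated records are all hypothetical stand-ins, not part of the original example. The mapped function performs the side effect and returns 1, so folding with + yields the number of rows written.

```clojure
(require '[clojure.core.reducers :as r]
         '[clojure.string :as str])

;; stand-in for the users table (hypothetical)
(def inserted (atom []))

(defn insert-record!
  "Pretends to insert a row and returns 1 so that rows can be counted."
  [row]
  (swap! inserted conj row)
  1)

;; 2000 tab-separated lines, enough for fold to split the vector
(def records
  (vec (for [i (range 2000)]
         (str "acct-" i "\tuser-" i))))

(defn load-records [records]
  (r/fold +
          (r/map (fn [record]
                   (let [tokens (str/split record #"\t")]
                     (insert-record! {:account-id (nth tokens 0)})))
                 records)))

(load-records records) ;; 2000
```

Segments run in parallel, so the order in which rows reach the atom (or a real table) is not guaranteed.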

Page 106: Clojure Reducers / clj-syd Aug 2012

When to use it

• Exploring decision trees
• Image processing
• As a building block for bigger, distributed systems such as Datomic and Cascalog (maybe around parallel aggregators)
• Basically any list-intensive program

But the tools are available to anyone, so be creative!

Page 107: Clojure Reducers / clj-syd Aug 2012

Resources

• The Anatomy of a Reducer - http://bit.ly/anatomyReducers
• Rich’s announcement post on Reducers - http://bit.ly/reducersANN
• Rich Hickey - Reducers - EuroClojure 2012 - http://bit.ly/reducersVideo (this presentation was heavily inspired by this video)
• The Source on GitHub - http://bit.ly/reducersCore

Page 108: Clojure Reducers / clj-syd Aug 2012

Thanks!

Questions?

Leonardo Borges
@leonardo_borges
http://www.leonardoborges.com
http://www.thoughtworks.com