Reducers
                         A library and model for collection processing in Clojure
                         ...in 20 mins or less

                         Leonardo Borges
                         @leonardo_borges
                         http://www.leonardoborges.com
                         http://www.thoughtworks.com
Thursday, 30 August 12
Reducers huh? Here’s the gist




                         You get parallel versions of reduce, map and filter



                                             Ta-da! I’m done!

                                     and well under my 20 min limit :)

Alright, alright I’m kidding




How do reducers make parallelism possible?



                                   • JVM’s Fork/Join framework
                                   • Reduction Transformers




Before we start - this is bleeding edge stuff
                         Java requirements

                         • Fork/Join framework
                          • Java 7 [1] or
                          • Java 6 + the JSR166 jar [2]
                         Clojure requirements

                         • 1.5.0-* (this is still MASTER on github [3] as of 30/08/2012)


                                                                       [1] - http://jdk7.java.net/
                                                                       [2] - http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166.jar
                                                                       [3] - https://github.com/clojure/clojure
The Fork/Join Framework

                         • Based on divide and conquer
                         • Work-stealing algorithm
                         • Uses deques (double-ended queues)
                         • Progressively divides the workload into tasks until a threshold is reached
                         • Once a worker finishes one task, it pops another one from its own deque
                         • After at least two tasks have finished, results can be combined/joined
                         • Idle workers can steal tasks from the deques of workers which fall behind
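
The bullets above can be sketched directly against the JVM classes. This is a minimal illustration, not how the reducers library is implemented: the names `fj-task`, `fj-sum`, `parallel-sum` and the `threshold` parameter are made up for this example, and it assumes Java 7+ (or the JSR166 jar on the classpath).

```clojure
(import '(java.util.concurrent ForkJoinPool ForkJoinTask))

(defn fj-task
  "Wraps a zero-argument function in a ForkJoinTask (hypothetical helper)."
  ^ForkJoinTask [^Callable f]
  (ForkJoinTask/adapt f))

(defn fj-sum
  "Sums vector v by halving it until a chunk holds at most `threshold`
  items. `threshold` must be a positive number."
  [v threshold]
  (if (<= (count v) threshold)
    ;; small enough: plain sequential reduce
    (reduce + 0 v)
    (let [mid   (quot (count v) 2)
          left  (subvec v 0 mid)
          right (subvec v mid)
          ^ForkJoinTask right-task (fj-task #(fj-sum right threshold))]
      ;; fork: push the right half onto this worker's deque,
      ;; where an idle worker may steal it
      (.fork right-task)
      ;; work on the left half ourselves, then join and combine
      (+ (fj-sum left threshold)
         (.join right-task)))))

(def ^ForkJoinPool pool (ForkJoinPool.))

(defn parallel-sum
  "Runs fj-sum inside the pool so that fork/join happen on worker threads."
  [v threshold]
  (.invoke pool (fj-task #(fj-sum v threshold))))

(parallel-sum (vec (range 100)) 10)
;; => 4950
```

Here `+` plays both roles: it reduces each small chunk and combines the results of two halves.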




Text is boring


Fork/Join algorithm - simplified view

                         [Animated diagram with two workers, Worker 1 and Worker 2. Captions, in order:]

                         • Workload is put in “deques”
                         • ...and progressively halved
                         • ...up to a configured threshold
                         • Combine
                         • Idle workers can “steal” items from other workers
                         • Final result
Let’s talk about Reducers

                         Motivations

                         • Performance
                          • via less allocation
                          • via parallelism (leverage Fork/Join)

                         Issues

                         • Lists and Seqs are sequential
                         • map / filter implies order




A closer look at what map does
                              ;; a naive map implementation
                              (defn map [f coll]
                                (if (seq coll)
                                  (cons (f (first coll)) (map f (rest coll)))
                                  '()))

                         • Recursion
                         • Order
                         • Laziness (not shown)
                         • Consumes List
                         • Builds List

                         Oh, and it also applies the function to each item before putting the result
                         into the new list. This is what mapping means!

Reduction Transformers


                         • Idea is to build map / filter on top of reduce to break from sequentiality
                         • map / filter then builds nothing and consumes nothing
                         • It changes what reduce means to the collection by transforming the reducing
                         functions




What map is really all about
                         (defn mapping [f]
                           (fn [f1]
                             (fn [result input]
                               (f1 result (f input)))))
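
The same trick should work for filter. The sketch below repeats `mapping` from the slide and adds a `filtering` counterpart; `filtering` is not on the slides, so treat its exact shape as an assumption, although it follows the same pattern.

```clojure
;; mapping, as on the slide: transforms a reducing function f1
(defn mapping [f]
  (fn [f1]
    (fn [result input]
      (f1 result (f input)))))

;; filtering (assumed counterpart): only passes matching inputs on to f1
(defn filtering [pred]
  (fn [f1]
    (fn [result input]
      (if (pred input)
        (f1 result input)
        result))))

;; the transformed reducing functions can be called directly
(((mapping inc) +) 10 1)      ;; => 12
(((filtering even?) +) 10 4)  ;; => 14
(((filtering even?) +) 10 3)  ;; => 10, the odd input is skipped
```

Neither function touches a collection: each one only describes what to do with a single input.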




But wait!
                         If map doesn’t consume the list any longer, who does?

                             • reduce does!
                             • Since Clojure 1.4 reduce lets the collection reduce itself
                              (through the CollReduce / CollFold protocols)
                              • Think of what this means for tree-like structures such as
                               vectors
                             • This is key to leveraging the Fork/Join framework
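
To make "the collection reduces itself" concrete, here is a small sketch of a custom type that implements `CollReduce`. The `Countdown` type and the `countdown-reduce` helper are invented for this example, and the sketch ignores early termination.

```clojure
(require '[clojure.core.protocols :as p])

(defn countdown-reduce
  "Feeds n, n-1, ..., 1 into the reducing function f, starting from init."
  [n f init]
  (loop [i n, acc init]
    (if (pos? i)
      (recur (dec i) (f acc i))
      acc)))

;; A "collection" that holds no elements at all: it only knows
;; how to reduce itself.
(deftype Countdown [n]
  p/CollReduce
  (coll-reduce [_ f]
    ;; no initial value: seed with the first item, or (f) when empty
    (if (pos? n)
      (countdown-reduce (dec n) f n)
      (f)))
  (coll-reduce [_ f init]
    (countdown-reduce n f init)))

(reduce + 0 (Countdown. 4))     ;; => 10
(reduce conj [] (Countdown. 3)) ;; => [3 2 1]
```

`reduce` never asks for a seq here; it just hands the reducing function to the collection.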




Now we can use mapping to create reducing functions
                               (reduce ((mapping inc) +) 0 [1 2 3 4])
                               ;; 14




                                    ;; ((mapping inc) +) behaves like:
                                    (fn [result input]
                                      (+ result (inc input)))




Now we can use mapping to create reducing functions
                             (reduce ((mapping inc) conj) [] [1 2 3 4])
                             ;; [2 3 4 5]




                                    ;; ((mapping inc) conj) behaves like:
                                    (fn [result input]
                                      (conj result (inc input)))


                                  But it feels awkward to use it in this form
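
For comparison, this is roughly how the same reductions read with the library's own `r/map` and `r/filter`, which take the collection as an argument like their `clojure.core` counterparts. A short sketch, assuming Clojure 1.5+:

```clojure
(require '[clojure.core.reducers :as r])

;; r/map returns something reducible, not a seq
(reduce + 0 (r/map inc [1 2 3 4]))
;; => 14

(into [] (r/map inc [1 2 3 4]))
;; => [2 3 4 5]

;; transformations compose without building an intermediate collection
(reduce + 0 (r/map inc (r/filter even? [1 2 3 4])))
;; => 8
```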

What do we have so far?


                         • Performance has been improved due to fewer allocations
                          • No intermediate lists need to be built (see Haskell’s StreamFusion [4])
                         • However, reduce is still sequential




                                                                                        [4] - http://bit.ly/streamFusion
Enter fold

                         • Takes the sequentiality out of foldl, foldr and reduce
                         • Potentially parallel (falls back to standard reduce otherwise)
                         • Reduce/Combine strategy (think Fork/Join Framework)
                         • Segments the collection
                         • Runs multiple reduces in parallel
                         • Uses a combining function to join/reduce results


                                    (defn fold [combinef reducef coll]
                                      ...)
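
A quick sketch of how `r/fold` can be called. Besides the three-argument form shown above, the library also accepts a shorter form where one function plays both roles, and a longer form with an explicit partition size. The vector `v` and the size `64` are arbitrary choices for this example.

```clojure
(require '[clojure.core.reducers :as r])

(def v (vec (range 1 1001)))

;; one function used as both combining and reducing function
(r/fold + v)
;; => 500500

;; explicit combining fn and reducing fn
(r/fold + + v)
;; => 500500

;; explicit partition size: segments of at most ~64 items are reduced sequentially
(r/fold 64 + + v)
;; => 500500

;; combinef and reducef can differ: count the items
;; (+) supplies the seed 0, the reducing fn ignores each item and increments
(r/fold + (fn [acc _] (inc acc)) v)
;; => 1000
```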


The combining function is a monoid
                         • An associative binary function with an identity element
                         • All the following functions are equivalent monoids

                         ;; 1. the built-in +
                         (+ 2 3) ; 5
                         (+) ; 0

                         ;; 2. a hand-written equivalent
                         (defn my-+
                           ([] 0)
                           ([a b] (+ a b)))

                         (my-+ 2 3) ; 5
                         (my-+) ; 0

                         ;; 3. built with r/monoid
                         (require '[clojure.core.reducers :as r])

                         (def my-+
                           (r/monoid + (fn [] 0)))

                         (my-+ 2 3) ; 5
                         (my-+) ; 0



fold by examples


                         ;; all examples assume the reducers library is available as r
                         (ns reducers-playground.core
                           (:require [clojure.core.reducers :as r]))




fold by examples:
                          increment all even positive integers up to 10 million
                                          and add them all up
                     ;; these were taken from Rich’s reducers talk
                     (def my-vector (into [] (range 10000000)))

                     (time (reduce + (map inc (filter even? my-vector))))
                     ;; 500msecs

                     (time (reduce + (r/map inc (r/filter even? my-vector))))
                     ;; 260msecs

                     (time (r/fold + (r/map inc (r/filter even? my-vector))))
                     ;; 130msecs
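
Timings will vary by machine, so it is worth checking that the three versions agree before comparing them. A small sketch on a ten-element vector (`small-vector` is just a name chosen for this example):

```clojure
(require '[clojure.core.reducers :as r])

(def small-vector (into [] (range 10)))

;; lazy seqs: evens are 0 2 4 6 8, incremented to 1 3 5 7 9
(reduce + (map inc (filter even? small-vector)))
;; => 25

;; reducers, sequential reduce
(reduce + (r/map inc (r/filter even? small-vector)))
;; => 25

;; reducers, potentially parallel fold
(r/fold + (r/map inc (r/filter even? small-vector)))
;; => 25
```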

fold by examples:
                                    standard word count

                (def wiki-dump (slurp "subset-wiki-dump50")) ;50 MB

                (defn count-words [text]
                  (reduce
                   (fn [memo word]
                      (assoc memo word (inc (get memo word 0))))
                   {}
                   (map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

                (time (count-words wiki-dump)) ;; 45 secs


fold by examples:
                                     parallel word count

                (def wiki-dump (slurp "subset-wiki-dump50")) ;50 MB

                (defn p-count-words [text]
                  (r/fold
                   ;; Combining fn:
                   ;;  - (partial merge-with +) will be called to merge the partial computations
                   ;;  - hash-map will be called with no arguments to provide a seed value
                   (r/monoid (partial merge-with +) hash-map)
                   ;; Reducing fn
                   (fn [memo word]
                     (assoc memo word (inc (get memo word 0))))
                   (r/map #(.toLowerCase %) (into [] (re-seq #"\w+" text)))))

                (time (p-count-words wiki-dump)) ;; 30 secs
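
The wiki dump is not needed to try this out. A self-contained sketch on a short string, with the function repeated so the snippet runs on its own (the sample sentence is made up, and the `^String` hint is only there to avoid reflection):

```clojure
(require '[clojure.core.reducers :as r])

(defn p-count-words [text]
  (r/fold
   ;; combining fn: merges two partial word-count maps; () => {}
   (r/monoid (partial merge-with +) hash-map)
   ;; reducing fn: bumps the count of one word
   (fn [memo word]
     (assoc memo word (inc (get memo word 0))))
   (r/map (fn [^String w] (.toLowerCase w))
          (into [] (re-seq #"\w+" text)))))

(p-count-words "The cat and the hat")
;; => {"the" 2, "cat" 1, "and" 1, "hat" 1}
```

An input this small presumably stays within a single partition, so the parallel path only matters for large inputs.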


fold by examples:
                               Load 100k records into PostgreSQL



                  (import '(java.io BufferedReader FileReader))

                  (def records
                    (into [] (line-seq
                               (BufferedReader. (FileReader. "dump.txt")))))




fold by examples:
                                      Load 100k records into PostgreSQL


                         (time (doseq [record records]
                                 (let [tokens (clojure.string/split record #"\t")]
                                   (insert users/users
                                           (values {
                                                     :account-id (nth tokens 0)
                                                     ...
                                                     })))))

                         ;; 90 secs
fold by examples:
                         Load 100k records into PostgreSQL in parallel
(time (r/fold
       +
       (r/map (fn [record]
                (let [tokens (clojure.string/split record #"t" )]
                  (do (insert users/users
                              (values {
                                        :account-id (nth tokens 0)
                                        ...
                                        }))
                      1))) records)))


;; 50 secs
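
The shape of the parallel load above can be tried without a database. What follows is a minimal sketch, not the original benchmark: `fake-insert!` is a hypothetical stand-in for the slide's `insert`/`values` call, and the tab-separated records are made up. It only illustrates why the fold returns the number of records processed.

```clojure
(require '[clojure.core.reducers :as r]
         '[clojure.string :as str])

;; Hypothetical stand-in for (insert users/users (values {...})):
;; it stores the row in an atom so the sketch runs without PostgreSQL.
(def inserted (atom []))

(defn fake-insert! [row]
  (swap! inserted conj row))

;; Made-up tab-separated records, held in a vector because
;; fold only runs in parallel on foldable collections such as vectors.
(def records
  (mapv #(str "acc-" % "\tuser-" %) (range 2000)))

(defn load-record [record]
  (let [tokens (str/split record #"\t")]
    (fake-insert! {:account-id (nth tokens 0)})
    1)) ; each record contributes 1 to the total

;; (+) => 0 seeds every partition, (+ acc 1) counts a record,
;; and (+ a b) joins the partial counts of two partitions.
(def loaded-count
  (r/fold + (r/map load-record records)))
```

Here `loaded-count` is the total number of records, while the order of rows in `inserted` is not guaranteed because partitions run concurrently. `r/fold` also accepts a partition size as its first argument, for example `(r/fold 128 + + (r/map load-record records))`; the default is 512.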
When to use it





                         • Exploring decision trees
                         • Image processing
                         • As a building block for bigger, distributed systems such as Datomic and
                          Cascalog (maybe around parallel aggregators)
                         • Basically any list-intensive program


                                    But the tools are available to anyone so be creative!
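
As one illustration of the image processing case, here is a small sketch that builds an intensity histogram with `r/fold`. It assumes a hypothetical "image" represented as a plain vector of grayscale values (0 to 255); real image data would first need to be loaded into a vector. The combining function follows the same `r/monoid` pattern as the parallel word count.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical image: a vector of grayscale intensities in 0..255.
(def pixels (vec (take 100000 (cycle (range 256)))))

(defn histogram [pixels]
  (r/fold
    ;; combining fn: merges two partial histograms; (hash-map) is the seed
    (r/monoid (partial merge-with +) hash-map)
    ;; reducing fn: counts a single pixel within one partition
    (fn [hist p] (assoc hist p (inc (get hist p 0))))
    pixels))

(def hist (histogram pixels))
```

Each partition builds its own small map, and the partial maps are merged as the partitions are joined, so no shared mutable state is needed.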



Resources

                         • The Anatomy of a Reducer - http://bit.ly/anatomyReducers
                         • Rich’s announcement post on Reducers - http://bit.ly/reducersANN
                         • Rich Hickey - Reducers - EuroClojure 2012 - http://bit.ly/reducersVideo
                          (this presentation was heavily inspired by this video)
                         • The source on GitHub - http://bit.ly/reducersCore



                                                                                      Leonardo Borges
                                                                                      @leonardo_borges
                                                                                      http://www.leonardoborges.com
                                                                                      http://www.thoughtworks.com
Thanks!




                             Questions?



                                 Leonardo Borges
                                @leonardo_borges
                         http://www.leonardoborges.com
                          http://www.thoughtworks.com


Clojure Reducers / clj-syd Aug 2012
