Concurrent data processing with ParallelStreams - David Gomez
ParallelStreams
Concurrent data processing in Java 8
David Gómez G. (@dgomezg)
Do you remember?
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
    System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start,
            Thread.activeCount());
}
4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads
@dgomezg
A Stream is…
A convenient way to iterate over collections declaratively
List<Integer> numbers = new ArrayList<Integer>();
for (int i = 0; i < 100; i++) {
    numbers.add(i);
}
List<Integer> evenNumbers = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(toList());
Anatomy of a Stream
Source
Intermediate operations: filter, map, order, function
Final operation
Together they form the pipeline.
Iterating a Stream
List<Integer> evenNumbers = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(toList());
Internal iteration
- No manual handling of Iterators
- Concise
- Fluent API: chained sequence processing
Elements are computed only when needed
Iterating a Stream
List<Integer> evenNumbers = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .collect(toList());
Easy parallelism
- Concurrency is hard to get right!
- Uses ForkJoin
- Processing steps should be
  - stateless
  - independent
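A minimal sketch of what "stateless" and "independent" steps can look like. The class and method names (StatelessSteps, evenSquares) are made up for illustration and are not from the talk. Each lambda only looks at its own element, and the results are gathered by a collector rather than by writing to a shared list.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class, not part of the original slides.
class StatelessSteps {

    // Every step depends only on the element it receives:
    // no counters, no shared collections, no ordering assumptions.
    static List<Integer> evenSquares(int limit) {
        return IntStream.range(0, limit)
                .parallel()
                .filter(n -> n % 2 == 0)        // stateless predicate
                .map(n -> n * n)                // independent of other elements
                .boxed()
                .collect(Collectors.toList());  // the collector merges partial results
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(10)); // [0, 4, 16, 36, 64]
    }
}
```

The alternative of calling add() on a shared ArrayList from inside forEach() would make the step stateful and is not thread-safe.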
Parallel Streams
use stream()
List<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
    numbers.add((int) Math.round(Math.random() * 100));
}
// This will use just a single thread
Stream<Integer> evenNumbers = numbers.stream();
or parallelStream()
// The number of threads is chosen automatically (the common ForkJoinPool is used)
Stream<Integer> evenNumbers = numbers.parallelStream();
Let’s test it
use stream()
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.stream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
    System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start,
            Thread.activeCount());
}
5001983 elements computed in 828 msecs with 2 threads
5001983 elements computed in 843 msecs with 2 threads
5001983 elements computed in 675 msecs with 2 threads
5001983 elements computed in 795 msecs with 2 threads
Going parallel
use parallelStream()
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
    System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start,
            Thread.activeCount());
}
4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads
Previously on…
http://www.slideshare.net/dgomezg/streams-en-java-8
Fork/Join Framework
Proposed by Doug Lea
"a style of parallel programming in which problems are solved by (recursively) splitting them into subtasks that are solved in parallel."
Available since Java 7
Used by ParallelStreams
The F/J algorithm
Result solve(Problem problem) {
    if (problem is small)
        directly solve problem
    else {
        split problem into independent parts
        fork new subtasks to solve each part
        join all subtasks
        compose result from subresults
    }
}
as proposed by Doug Lea
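The pseudocode above can be sketched with a RecursiveTask that sums an array. This is only an illustration: the class name SumTask, the helper sum() and the threshold of 1,000 elements are assumptions made here, not values from the talk.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: sums data[from, to) following the F/J pattern.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cut-off for "problem is small"

    private final int[] data;
    private final int from;
    private final int to; // exclusive

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: solve directly
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Split into two independent halves
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                     // run the left half asynchronously
        long rightSum = right.compute(); // solve the right half in this thread
        long leftSum = left.join();      // wait for the forked half
        return leftSum + rightSum;       // compose result from subresults
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // 50005000
    }
}
```

Computing one half in the current thread instead of forking both is a common choice, since it saves one task hand-off per split.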
ForkJoinPool
ExecutorService implementation that
• has a defined number of workers (threads)
• executes ForkJoinTasks
• submitted by execute(ForkJoinTask task)
• or by invoke(ForkJoinTask task)
ForkJoinTask
Abstract class that represents a task to be run concurrently
Every ForkJoinTask can be split (if it is not small enough) and solved recursively
Two subclasses to extend
• RecursiveAction, if it does not return a value
• RecursiveTask, if it returns a value
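As a rough illustration of a RecursiveAction, and of the difference between invoke() and execute() on a ForkJoinPool, the sketch below doubles every element of an array in place. The names DoubleInPlace, runBlocking and runAsync and the threshold are invented for this example.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example: a task with no return value.
class DoubleInPlace extends RecursiveAction {
    private static final int THRESHOLD = 1_000; // assumed cut-off

    private final int[] data;
    private final int from;
    private final int to; // exclusive

    DoubleInPlace(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            for (int i = from; i < to; i++) {
                data[i] *= 2;
            }
            return;
        }
        int mid = (from + to) >>> 1;
        // Forks both halves and waits until both are done
        invokeAll(new DoubleInPlace(data, from, mid),
                  new DoubleInPlace(data, mid, to));
    }

    // invoke(): the caller waits until the task has finished
    static void runBlocking(ForkJoinPool pool, int[] data) {
        pool.invoke(new DoubleInPlace(data, 0, data.length));
    }

    // execute(): returns immediately; join() waits for completion later
    static void runAsync(ForkJoinPool pool, int[] data) {
        DoubleInPlace task = new DoubleInPlace(data, 0, data.length);
        pool.execute(task);
        task.join();
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            int[] data = new int[5_000];
            Arrays.fill(data, 1);
            runBlocking(pool, data); // every element is now 2
            runAsync(pool, data);    // every element is now 4
            System.out.println(data[0] + " " + data[data.length - 1]);
        } finally {
            pool.shutdown();
        }
    }
}
```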
ForkJoinWorkerThread
Any of the threads created by the ForkJoinPool
Executes ForkJoinTasks
Each one has its own deque of tasks (which allows task stealing)
ForkJoinWorkerThread
Result solve(Problem problem) {
    if (problem is small)
        directly solve problem
    else {
        split problem into independent parts
        fork new subtasks to solve each part
        join all subtasks
        compose result from subresults
    }
}
The F/J algorithm, plus task stealing.
Fork/Join. When to use?
For computations that can be split into smaller, independent tasks, a.k.a. ‘divide and conquer’ algorithms
Reduction with no contention.
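One way to picture a reduction with no contention: every subtask reduces its own range, and the partial results are only combined when the subtasks are joined, so no lock or shared counter is needed. The sketch below uses a parallel LongStream for brevity; the class and method names are made up for this example.

```java
import java.util.stream.LongStream;

// Hypothetical example class, not part of the original slides.
class ContentionFreeReduction {

    // Each worker sums its own chunk; chunks are merged pairwise at the end.
    // No shared variable is written while the elements are processed.
    static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000)); // 333833500
    }
}
```

By contrast, incrementing a single shared counter from every thread would make all workers compete for the same memory location.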
ParallelStreams
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
    System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start,
            Thread.activeCount());
}
4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads
Thread.activeCount not accurate
Thread.activeCount() returns an estimate of the active threads in the current thread group; it does not show the effective number of threads processing the stream
Better count threads involved
// The JDK has no ConcurrentSet; ConcurrentHashMap.newKeySet() gives a thread-safe Set
Set<String> workerThreadNames = ConcurrentHashMap.newKeySet();
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            // record the name of every thread that processes an element
            .peek(n -> workerThreadNames.add(Thread.currentThread().getName()))
            .sorted()
            .collect(toList());
    System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start,
            workerThreadNames.size());
}
Threads usage
ParallelStreams use the common ForkJoinPool
Number of worker threads configured with -Djava.util.concurrent.ForkJoinPool.common.parallelism=n
Useful to keep CPU parallelism under control…
…but …
Limiting parallelism
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .peek(n -> workerThreadNames.add(Thread.currentThread().getName()))
            .sorted()
            .collect(toList());
    System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start,
            workerThreadNames.size());
}
-Djava.util.concurrent.ForkJoinPool.common.parallelism=4
5001069 elements computed in 269 msecs with 5 threads
WTF
Limiting parallelism
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .peek(n -> workerThreadNames.add(Thread.currentThread().getName()))
            .sorted()
            .collect(toList());
    System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start,
            workerThreadNames.size());
}
System.out.println("credits to threads: " + workerThreadNames);
5001069 elements computed in 269 msecs with 5 threads
credits to threads: ForkJoinPool.commonPool-worker-0, ForkJoinPool.commonPool-worker-1, ForkJoinPool.commonPool-worker-2, ForkJoinPool.commonPool-worker-3, main
WTF
Threads Involved in ParallelStream
ParallelStreams use the common ForkJoinPool
The thread invoking the ParallelStream is also used as a worker
Caveats:
• ParallelStream processing is synchronous for the invoking thread
• Other threads using the common ForkJoinPool could be affected
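A small sketch to observe this on your own machine: it records the name of every thread that touches an element of a parallel stream and prints the parallelism of the common pool. The class and method names (ThreadsInvolved, threadsUsed) are invented for this example, and the set of names you get will vary between runs and machines.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Hypothetical example class, not part of the original slides.
class ThreadsInvolved {

    // Returns the names of all threads that processed at least one element.
    static Set<String> threadsUsed(int elements) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        IntStream.range(0, elements)
                .parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.printf("common pool parallelism: %d%n",
                ForkJoinPool.getCommonPoolParallelism());
        // Typically common-pool workers plus the calling thread (e.g. "main")
        System.out.println("threads used: " + threadsUsed(1_000_000));
    }
}
```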
ParallelStream Hack
A ParallelStream can be forced to use a custom ForkJoinPool
ForkJoinPool forkJoinPool = new ForkJoinPool(4);
long start = System.currentTimeMillis();
// Submit the whole pipeline as a task, so that it runs on the custom pool's workers
ForkJoinTask<List<Integer>> task = forkJoinPool.submit(() -> {
    return numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
});
List<Integer> even = task.get();
Task submitted in 1 msecs
5000805 elements computed in 328 msecs with 4 threads
ParallelStream Hack benefits
A custom ExecutorService
• Does not affect other ParallelStreams
• Does not affect common ForkJoinPool users
• Reduces unpredictable latency due to other common ForkJoinPool load
• Invoking thread not used as worker (async parallel process)
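A self-contained sketch of the hack, under the assumption (observed behaviour, not a guarantee of the Stream API documentation) that a parallel stream started from inside a ForkJoinPool task runs on that pool. The names CustomPoolStream and evensInPool are hypothetical; join() is used instead of get() only to avoid checked exceptions in the example.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example class, not part of the original slides.
class CustomPoolStream {

    // Runs the pipeline inside the given pool and records the threads used.
    static List<Integer> evensInPool(ForkJoinPool pool, int limit, Set<String> threadNames) {
        return pool.submit(() -> {
            return IntStream.range(0, limit)
                    .parallel()
                    .peek(n -> threadNames.add(Thread.currentThread().getName()))
                    .filter(n -> n % 2 == 0)
                    .boxed()
                    .collect(Collectors.toList());
        }).join(); // the caller only waits; it does not process elements
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            Set<String> threadNames = ConcurrentHashMap.newKeySet();
            List<Integer> even = evensInPool(pool, 1_000_000, threadNames);
            System.out.println(even.size() + " elements, threads: " + threadNames);
        } finally {
            pool.shutdown(); // custom pools are not shut down automatically
        }
    }
}
```

Remember to shut the pool down once it is no longer needed; unlike the common pool, nothing does that for you.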
Blocking for IO
If the first URLs get stuck on a connection timeout, overall performance could be affected
Stream<String> urls = Files.lines(Paths.get("urlsToCheck.txt"));
List<String> errors = urls.parallel().filter(url -> {
    // Connect to URL and wait for 200 response or timeout
    return true;
}).collect(toList());
Nested parallelStreams
Outer parallelStream could exhaust ForkJoin workers:
long start = System.currentTimeMillis();
IntStream.range(0, 10_000).parallel()
        .forEach(i -> {
            results[i][0] = (int) Math.round(Math.random() * 100);
            IntStream.range(1, 9_999)
                    .parallel()
                    .forEach((int j) -> results[i][j] = (int) Math.round(Math.random() * 1000));
        });
Process finalized in 22974 msecs
Process finalized in 22575 msecs
Process finalized in 22606 msecs
Nested parallelStreams
Outer parallelStream could exhaust ForkJoin workers; the same loop with a sequential inner stream:
long start = System.currentTimeMillis();
IntStream.range(0, 10_000).parallel()
        .forEach(i -> {
            results[i][0] = (int) Math.round(Math.random() * 100);
            IntStream.range(1, 9_999)
                    .sequential()
                    .forEach((int j) -> results[i][j] = (int) Math.round(Math.random() * 1000));
        });
Process finalized in 12491 msecs
Process finalized in 12589 msecs
Process finalized in 12798 msecs
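A compact, runnable variant of the second version: only the outer stream is parallel, and every row is filled sequentially by whichever worker picked it up. The class name NestedStreams, the method fill and the deterministic cell values are assumptions made here so the result can be checked; the slides use random values.

```java
import java.util.stream.IntStream;

// Hypothetical example class, not part of the original slides.
class NestedStreams {

    static int[][] fill(int rows, int cols) {
        int[][] results = new int[rows][cols];
        IntStream.range(0, rows)
                .parallel()                       // rows are distributed among workers
                .forEach(i ->
                        IntStream.range(0, cols)  // sequential by default
                                .forEach(j -> results[i][j] = i * cols + j));
        return results;
    }

    public static void main(String[] args) {
        int[][] results = fill(1_000, 1_000);
        System.out.println(results[999][999]); // 999999
    }
}
```

Each row is written by exactly one thread, so no synchronisation is needed on the array.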
Too much Auto(un)boxing
Unboxing and boxing of Integers in every filter call
List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
4999464 elements computed in 290 msecs with 8 threads
4999464 elements computed in 276 msecs with 8 threads
4999464 elements computed in 257 msecs with 8 threads
4999464 elements computed in 265 msecs with 8 threads
Less Auto(un)boxing
Unboxing once with mapToInt() and boxing once with boxed(), instead of in every filter call
List<Integer> even = numbers.parallelStream()
        .mapToInt(n -> n)
        .filter(n -> n % 2 == 0)
        .sorted()
        .boxed()
        .collect(toList());
4999460 elements computed in 160 msecs with 8 threads
4999460 elements computed in 243 msecs with 8 threads
4999460 elements computed in 144 msecs with 8 threads
4999460 elements computed in 140 msecs with 8 threads
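Taking the idea one step further, if the data can live in an int[] from the start, the pipeline can stay on primitives end to end. This is a sketch under that assumption; the class name PrimitiveEvens and the method sortedEvens are invented for the example, and no timings are claimed for it.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical example class, not part of the original slides.
class PrimitiveEvens {

    // IntStream all the way: no Integer objects are created at any step.
    static int[] sortedEvens(int[] numbers) {
        return Arrays.stream(numbers)
                .parallel()
                .filter(n -> n % 2 == 0)
                .sorted()
                .toArray();
    }

    public static void main(String[] args) {
        // One million random values in [0, 100]
        int[] numbers = ThreadLocalRandom.current().ints(1_000_000, 0, 101).toArray();
        System.out.println(sortedEvens(numbers).length + " even elements");
    }
}
```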
Conclusions
ParallelStreams ease concurrent processing, but:
• Understand how they work
• Don’t abuse the default common ForkJoinPool
• Don’t use them when blocking on IO
  • or use a custom ForkJoinPool
• Avoid unnecessary autoboxing
• Don’t add contention or synchronisation
• Be careful with nested parallel streams
• Use method references when sorting