[Jfokus] Riding the Jet Streams

Post on 12-Apr-2017

65 views 0 download

Transcript of [Jfokus] Riding the Jet Streams

RidingJet Streams

@gAmUssA @hazelcast #jfokus #hazelcastjet

http://bit.ly/streams_jfokus2017

@gAmUssA @hazelcast #jfokus #hazelcastjet

Solutions Architect @Hazelcast

Developer Advocate @Hazelcast

@gamussa in internetz

Please, follow me on TwitterI’m very interesting ©

> whoami

@gAmUssA @hazelcast #jfokus #hazelcastjet

AgendaQuick refresh on Java 8 StreamsDistribute and Conquer Distributed DataDistributed StreamsHow we did all this

@gAmUssA @hazelcast #jfokus #hazelcastjet

Example: Word Count

Map<Integer, String> where keys are line numbers and values are lines.

Find how many times each word occurs

@gAmUssA @hazelcast #jfokus #hazelcastjet

What needs to be done?

Iterate through all the linesSplit the line into wordsUpdate running total of counts with new word

fillMapWithData("war_and_peace_eng.txt", source);

for (String line : source.values()) {for (String word : PATTERN.split(line)) {if (word.length() >= 5)count.compute(

cleanWord(word).toLowerCase(),(w, c) -> c == null ? 1 : c + 1

);}

}System.out.println(count.get("andrew"));

Iterate through all the lines

fillMapWithData("war_and_peace_eng.txt", source);

for (String line : source.values()) {for (String word : PATTERN.split(line)) {if (word.length() >= 5)count.compute(

cleanWord(word).toLowerCase(),(w, c) -> c == null ? 1 : c + 1

);}

}System.out.println(count.get("andrew"));

Split the line into words

fillMapWithData("war_and_peace_eng.txt", source);

for (String line : source.values()) {for (String word : PATTERN.split(line)) {if (word.length() >= 5)count.compute(

cleanWord(word).toLowerCase(),(w, c) -> c == null ? 1 : c + 1

);}

}System.out.println(count.get("andrew"));

Update running total of counts with new word

fillMapWithData("war_and_peace_eng.txt", source);

for (String line : source.values()) {for (String word : PATTERN.split(line)) {if (word.length() >= 5)count.compute(

cleanWord(word).toLowerCase(),(w, c) -> c == null ? 1 : c + 1

);}

}System.out.println(count.get("andrew"));

Print the result

java.util.stream

@gAmUssA @hazelcast #jfokus #hazelcastjet

Java 8 Streams…An abstraction represents a sequence of elementsIs not a data structure Convey elements from a source through a pipeline of operationsOperation doesn’t modify a source

@gAmUssA @hazelcast #jfokus #hazelcastjet

Why I should care about Stream API?

You’re Java developer

What does regular Java developer think about Scala?

advanced

@gAmUssA @hazelcast #jfokus #hazelcastjet

Why I should care about Stream API?

You’re Java developerMany Java developers know JavaIt’s all about data processing

@gAmUssA @hazelcast #jfokus #hazelcastjet

java.util.stream

map(), flatMap(), filter()

reduce(), collect()

sorted(), distinct()

Intermediate operation

Terminal operation

Stateful Intermediate (Blocking) operation

@gAmUssA @hazelcast #jfokus #hazelcastjet

Why would one need a cluster?

One does not simply fit all Big Data in one machine

@gAmUssA @hazelcast #jfokus #hazelcastjet

Problem

Data doesn’t fit just one machine

@gAmUssA @hazelcast #jfokus #hazelcastjet

Why would one need a cluster?

One does not simply put all Big Datain one machineData is too important to have it only one machine

Replication on Sharding?

http://book.mixu.net/distsys/single-page.html

@gAmUssA @hazelcast #jfokus #hazelcastjet

Another RequirementsEasy to use Simple APIEmbeddableCloud Native

@gAmUssA @hazelcast #jfokus #hazelcastjet

What’s Hazelcast IMDG?

In-memory Data Grid

Apache v2 Licensed

Distributed Caches (IMap, JCache)

Java Collections (IList, ISet, IQueue)

Messaging (Topic, RingBuffer)

Computation (ExecutorService, M-R)

1 900 Stars On GitHub

100%Open Source

134 contributors

@gAmUssA @hazelcast #jfokus #hazelcastjet

GreenPrimary

GreenBackup

GreenShard

@gAmUssA @hazelcast #jfokus #hazelcastjet

What’s the problem?

Use IMap.values().stream() ?Or IMap.entrySet().stream() ?

33

@gAmUssA @hazelcast #jfokus #hazelcastjet

Problem

Data doesn’t fit just one machine

@gAmUssA @hazelcast #jfokus #hazelcastjet

EASY (actually, not)!

Implement serializable version of the interfacesIntroducing DistributedStream

37

38

40

Jet Streams

jet.hazelcast.org

@gAmUssA @hazelcast #jfokus #hazelcastjet

What’s Hazelcast Jet?General purpose distributed data processing frameworkBased on Direct Acyclic Graph to model data flowBuilt on top of Hazelcast IMDGComparable to Apache Spark or Apache Flink

42

@gAmUssA @hazelcast #jfokus #hazelcastjet

DAG

vertex vertex

vertex vertex

SOURCE

SINK

BenchmarksCompared to Spark, Flink, Hadoop doing word count, running on a

cluster of 9 nodes, 40 cores each

@gAmUssA @hazelcast #jfokus #hazelcastjet

Future (It’s bright!)

Processing guarantees for stream processingStreaming features (windowing, triggering)Higher level streaming and batching APIsIntegration with additional Hazelcaststructures (ICache, IQueue ..)

@gAmUssA @hazelcast #jfokus #hazelcastjet

Future (It’s bright!)

Event sourcing / CQRSOff-heap memory supportRxJavaMore connectors to additional sources (JMS, JDBC..)

@gAmUssA @hazelcast #jfokus #hazelcastjet

Grab while it’s hot!

jet.hazelcast.org

hazelcast/hazelcast-jet

http://bit.ly/streams_jfokus2017

documentation

Source on Github

Presentation materials

@gAmUssA @hazelcast #jfokus #hazelcastjet

ConclusionJava Stream API provides very white range of data processing toolsWar And Piece – is a Big (a lot of data) Book!Now we’re pretty sure that Andrew and Pierre are the main characters

SlidesCarnival icons are editable shapes.

This means that you can:● Resize them without losing quality.● Change fill color and opacity.

Isn’t that nice? :)

Examples:

Now you can use any emoji as an icon!And of course it resizes without losing quality and you can change the color.

How? Follow Google instructions https://twitter.com/googledocs/status/730087240156643328

✋👆👉👍👤👦👧👨👩👪💃🏃💑❤😂

😉😋😒😭👶😸🐟🍒🍔💣📌📖🔨🎃🎈

🎨🏈🏰🌏🔌🔑 and many more...

😉

@gAmUssA @hazelcast #jfokus #hazelcastjet

Extra graphics