Artjom Lind ([email protected]), 22.10.2013, https://ds.cs.ut.ee
Hajussüsteemide Uurimislabor (Distributed Systems Research Lab)
Spark Engine: In-Memory Hadoop Cloud Computing (speaker: Artjom Lind)
(slides adapted from http://spark.incubator.apache.org)
Motivation
● Hadoop programming is based on storage-to-storage data flow (acyclic data flow).
● Advantage: the framework automatically decides where to run tasks and recovers from failures.

[Diagram: Input Data → Map tasks → Reduce tasks → Output Data]
Motivation
● … which becomes an I/O overhead for applications that reuse part of the input data:
– Iterative algorithms (machine learning, graph algorithms)
– Interactive data mining tools, such as those based on Matlab, R, or Python
● All of the above do the same thing (from Hadoop's perspective):
– Reload the data from storage on each query
Iterative algorithms
● Using a common data source: each iteration (Iter 1, Iter 2, Iter 3) re-reads the input data from storage.
● Passing results to the next iteration by offloading intermediate results to storage: Iter 1 → storage → Iter 2 → …

[Diagram: Input Data feeding Iter 1–3 and Output Data; intermediate results written back to storage between iterations]
Taking advantage of RAM
● Keep the “currently in process” working set in distributed RAM.
● Pass intermediate results between iterations through distributed RAM instead of storage.

[Diagram: Input Data loaded into DRAM once; Iter 1 → DRAM → Iter 2 → DRAM → Iter 3]
Distributed RAM Features
● It has to be:
– Fault-tolerant
– Efficient in large commodity clusters
● A cluster has lots of nodes, each with local RAM
● How do we aggregate it into a distributed RAM?
● How can the API provide these features?
Distributed RAM Features
● Existing distributed storage abstractions offer an interface based on fine-grained updates
– Reads and writes to cells in a table
– E.g. key-value stores, databases, distributed memory
● This requires replicating data or update logs across nodes for fault tolerance
– Which is expensive for data-intensive applications
RDD: Resilient Distributed Datasets
● Offer an interface based on coarse-grained transformations (e.g. map, group-by, join)
● Allow efficient fault recovery using lineage
– Log one operation to apply to many elements
– Recompute lost partitions of a dataset on failure
– No cost if nothing fails
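Lineage-based recovery can be sketched in a few lines of plain Python (an illustrative simulation, not Spark code; all class and method names here are made up): each dataset records only its parent and the one coarse-grained operation that produced it, so lost in-memory data can be recomputed instead of replicated.

```python
# Minimal sketch of lineage-based recovery (illustrative only, not Spark code).
# Each dataset remembers its parent and the single coarse-grained operation
# that produced it, so lost data can be recomputed rather than replicated.

class LineageRDD:
    def __init__(self, parent=None, op=None, source=None):
        self.parent, self.op, self.source = parent, op, source
        self.cache = None  # simulated in-memory partition

    def map(self, f):
        return LineageRDD(parent=self, op=lambda data: [f(x) for x in data])

    def filter(self, pred):
        return LineageRDD(parent=self, op=lambda data: [x for x in data if pred(x)])

    def compute(self):
        if self.cache is not None:
            return self.cache
        # Walk the lineage: recompute from the parent (or the source).
        data = self.source if self.parent is None else self.op(self.parent.compute())
        self.cache = data
        return data

    def lose_partition(self):
        self.cache = None  # simulate a node failure dropping cached data

base = LineageRDD(source=[1, 2, 3, 4])
evens = base.map(lambda x: x * 2).filter(lambda x: x > 4)
print(evens.compute())   # [6, 8]
evens.lose_partition()
print(evens.compute())   # recomputed from lineage: [6, 8]
```

Note the key property from the slide: the lineage log stores one operation per dataset (small), not per-element update logs, and recomputation costs nothing unless a failure actually occurs.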
Recovery
● In the common-data-source case:
– … we go to storage directly if an RDD is lost
● In the sequential-bypass case:
– … there is no shortcut: we have to recompute everything up to the lost stage

[Diagram: Iter 1 → DRAM → Iter 2 → DRAM → Iter 3]
Recovery
● The failure cases are the same as on the previous slide; however, we win in performance if nothing fails.
Generality of RDDs
● Despite their coarse-grained interface, RDDs can express surprisingly many parallel algorithms
– These naturally apply the same operation to many items
● They unify many current programming models
– Data flow models: MapReduce, Dryad, SQL, …
– Specialized tools for iterative apps: BSP (Pregel), iterative MapReduce (Twister), incremental processing (CBP)
● They also support new apps that these models don't
Spark API
● Language-integrated API in Scala
● Can be used interactively from the Scala interpreter
● Provides:
– Resilient distributed datasets (RDDs)
– Transformations to build RDDs (map, join, filter, …)
– Actions on RDDs (return a result to the program or write data to storage: reduce, count, save, …)
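The split between transformations and actions hinges on laziness: transformations only describe a new dataset, while actions force evaluation. A minimal Python sketch of that idea (illustrative only; `MiniRDD` is made up, not Spark's API):

```python
from functools import reduce as functools_reduce

# Illustrative sketch of lazy transformations vs. eager actions (not Spark code).
class MiniRDD:
    def __init__(self, compute):
        self._compute = compute  # thunk: nothing runs until an action is called

    # Transformations: return a new MiniRDD; no computation happens yet.
    def map(self, f):
        return MiniRDD(lambda: [f(x) for x in self._compute()])

    def filter(self, pred):
        return MiniRDD(lambda: [x for x in self._compute() if pred(x)])

    # Actions: force evaluation and return a plain value to the program.
    def count(self):
        return len(self._compute())

    def reduce(self, f):
        return functools_reduce(f, self._compute())

nums = MiniRDD(lambda: list(range(10)))
evens = nums.filter(lambda x: x % 2 == 0)            # nothing computed yet
print(evens.count())                                  # action triggers evaluation: 5
print(evens.map(lambda x: x * x)
           .reduce(lambda a, b: a + b))               # 0 + 4 + 16 + 36 + 64 = 120
```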
Example: Log Mining
Load error messages from a log into memory, then interactively search for various patterns
lines = spark.textFile("hdfs://...")
errors = lines.filter(_.startsWith("ERROR"))
messages = errors.map(_.split('\t')(2))
messages.persist()

messages.filter(_.contains("foo")).count
messages.filter(_.contains("bar")).count

[Diagram: the Driver ships tasks to Workers; each Worker reads its block (Block 1–3) from the source and caches its partition of messages in DRAM]
Fault tolerance
RDDs preserve lineage information that can be used to reconstruct lost partitions:

errors = lines.filter(_.startsWith("ERROR")).map(_.split('\t')(2))

[Lineage: HDFS File → Filtered RDD (_.startsWith("ERROR")) → Mapped RDD (_.split('\t')(2))]
Example: Logistic Regression
Goal: find best line separating two sets of points
[Plot: two classes of points (+ and –) in the plane; a random initial line converges toward the target separating line]
Code
val data = spark.textFile(...).map(readPoint).persist()

var w = Vector.random(D)

for (i <- 1 to ITERATIONS) {
  val gradient = data.map(p =>
    (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y * p.x
  ).reduce((a, b) => a + b)
  w -= gradient
}

println("Final w: " + w)
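The gradient step above can be checked in a self-contained Python sketch, with plain lists standing in for the RDD (the sample points below are made up for illustration):

```python
import math

# Self-contained check of the logistic-regression gradient step above,
# with plain Python lists in place of an RDD (sample data is made up).
points = [((1.0, 2.0), 1.0), ((2.0, 1.0), -1.0),
          ((0.5, 1.5), 1.0), ((1.5, 0.5), -1.0)]  # (x, y) pairs, y in {+1, -1}
w = [0.0, 0.0]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

for _ in range(100):  # ITERATIONS
    # gradient = sum over points of (1/(1+exp(-y*(w.x))) - 1) * y * x
    gradient = [0.0, 0.0]
    for x, y in points:
        scale = (1.0 / (1.0 + math.exp(-y * dot(w, x))) - 1.0) * y
        gradient = [g + scale * xi for g, xi in zip(gradient, x)]
    w = [wi - gi for wi, gi in zip(w, gradient)]

# w should now separate the two classes: w.x > 0 for y = +1, w.x < 0 for y = -1
print(all((dot(w, x) > 0) == (y > 0) for x, y in points))  # True
```

In Spark, the inner sum over points becomes the map/reduce over the persisted dataset, so each iteration reads the points from distributed RAM rather than from storage.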
RDD in detail
● RDDs additionally provide:
– Control over partitioning (e.g. hash-based), which can be used to optimize data placement across queries
– Control over persistence (e.g. store on disk vs. in RAM)
– Fine-grained reads (treat the RDD as a big table)
RDD vs. Distributed Shared Memory

Concern              | RDDs                                        | Distr. Shared Mem.
---------------------|---------------------------------------------|----------------------------------------------
Reads                | Fine-grained                                | Fine-grained
Writes               | Bulk transformations                        | Fine-grained
Consistency          | Trivial (immutable)                         | Up to app / runtime
Fault recovery       | Fine-grained and low-overhead using lineage | Requires checkpoints and program rollback
Straggler mitigation | Possible using speculative execution        | Difficult
Work placement       | Automatic based on data locality            | Up to app (but runtime aims for transparency)
Spark Applications
● EM algorithm for traffic prediction (Mobile Millennium)
● In-memory OLAP & anomaly detection (Conviva)
● Twitter spam classification (Monarch)
● Alternating least squares matrix factorization
● Pregel on Spark (Bagel)
● SQL on Spark (Shark)
Mobile Millennium App.
Estimate city traffic using position observations from “probe vehicles” (e.g. GPS-carrying taxis)
Implementation
Expectation maximization (EM) algorithm using iterated map and groupByKey on the same data.
3× faster with in-memory RDDs.
Conviva GeoReport
Aggregations on many keys with the same WHERE clause. The 40× gain comes from:
– Not re-reading unused columns or filtered records
– Avoiding repeated decompression
– In-memory storage of de-serialized objects

[Chart: query response time, Hive ≈ 20 s vs. Spark ≈ 0.5 s]
Implementation
Spark runs on the Mesos cluster manager [NSDI '11], letting it share resources with Hadoop and other apps.
It can read from any Hadoop input source (HDFS, S3, …).

[Diagram: Spark, Hadoop, and MPI frameworks running side by side on Mesos across cluster nodes]

~10,000 lines of code, no changes to Scala.
RDD Representation
Simple common interface:
– Set of partitions
– Preferred locations for each partition
– List of parent RDDs
– Function to compute a partition given its parents
– Optional partitioning info

This allows capturing a wide range of transformations; users can easily add new transformations.
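The five-item interface above can be sketched as an abstract base class. This is a hedged Python approximation of what the slide describes (Spark's real interface is internal Scala code; every name below is illustrative):

```python
from abc import ABC, abstractmethod

# Illustrative approximation of the common RDD interface described above
# (names are made up; Spark's actual interface lives in its Scala internals).
class AbstractRDD(ABC):
    @abstractmethod
    def partitions(self):
        """Set of partitions making up the dataset."""

    @abstractmethod
    def compute(self, partition):
        """Compute one partition, given the parents' data."""

    def parents(self):
        """List of parent RDDs (empty for a source RDD)."""
        return []

    def preferred_locations(self, partition):
        """Hosts where this partition is cheap to compute (e.g. HDFS block nodes)."""
        return []

    def partitioner(self):
        """Optional partitioning info (e.g. a hash partitioner), or None."""
        return None

# A source RDD over an in-memory list, split into fixed-size partitions.
class ListRDD(AbstractRDD):
    def __init__(self, data, num_partitions=2):
        step = max(1, len(data) // num_partitions)
        self._parts = [data[i:i + step] for i in range(0, len(data), step)]

    def partitions(self):
        return list(range(len(self._parts)))

    def compute(self, partition):
        return self._parts[partition]

rdd = ListRDD([1, 2, 3, 4], num_partitions=2)
print([rdd.compute(p) for p in rdd.partitions()])  # [[1, 2], [3, 4]]
```

A new transformation only has to say who its parents are and how to compute one partition from them, which is why users can add transformations easily.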
Language integration
● Scala closures are serializable Java objects
● Serialize on the driver, load & run on workers
● Not quite enough:
– Nested closures may reference the entire outer scope
– They may pull in non-serializable variables that are not used inside
● Solution: bytecode analysis + reflection
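The over-capture problem has a close analogue in Python, shown below for illustration (Spark's actual fix analyzes Scala bytecode; this sketch only demonstrates why naive closure capture drags in too much state):

```python
# Conceptual analogue of the Scala closure-capture problem (illustrative only).
class Driver:
    def __init__(self):
        self.threshold = 10
        self.big_unneeded_state = list(range(1_000_000))  # we don't want to ship this

    def make_bad_filter(self):
        # Referencing self.threshold captures *self*, dragging the whole
        # Driver (including big_unneeded_state) into the closure.
        return lambda x: x > self.threshold

    def make_good_filter(self):
        # Copy only the needed value into a local variable first.
        t = self.threshold
        return lambda x: x > t

d = Driver()
bad, good = d.make_bad_filter(), d.make_good_filter()
print(type(bad.__closure__[0].cell_contents).__name__)  # Driver (whole object captured)
print(good.__closure__[0].cell_contents)                # 10 (just the value)
```

Spark automates the "good" version: bytecode analysis finds which outer variables a closure actually uses, and reflection rebuilds the closure with only those.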
Interactive Spark
● Modified the Scala interpreter to allow Spark to be used interactively from the command line
– Altered code generation so that each “line” typed has references to the objects it depends on
– Added a facility to ship generated classes to workers
● Enables in-memory exploration of big data
Spark Debugger
● Debugging general distributed apps is very hard
● Idea: leverage the structure of computations in Spark, MapReduce, Pregel, and other systems
● These systems split jobs into independent, deterministic tasks for fault tolerance; use this for debugging:
– Log lineage for all RDDs created (small)
– Let the user replay any task in jdb, or rebuild any RDD
Conclusion
● Spark’s RDDs offer a simple and efficient programming model for a broad range of apps
● Achieve fault tolerance efficiently by providing coarse-grained operations and tracking lineage
● Can express many current programming models
www.spark-project.org