Post on 31-Aug-2020
Reza Zadeh
Introduction to Distributed Optimization
@Reza_Zadeh | http://reza-zadeh.com
Key Idea: Resilient Distributed Datasets (RDDs)
» Collections of objects across a cluster with user-controlled partitioning & storage (memory, disk, ...)
» Built via parallel transformations (map, filter, …)
» The world only lets you make RDDs such that they can be:
  Automatically rebuilt on failure
Life of a Spark Program
1) Create some input RDDs from external data or parallelize a collection in your driver program.
2) Lazily transform them to define new RDDs using transformations like filter() or map().
3) Ask Spark to cache() any intermediate RDDs that will need to be reused.
4) Launch actions such as count() and collect() to kick off a parallel computation, which is then optimized and executed by Spark.
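The four steps above can be imitated in plain Python, with generators standing in for lazy RDDs. This is only a sketch of the evaluation model, not Spark's API: `parallelize`, `transform_map`, `transform_filter`, and `action_count` are made-up stand-ins, and the cache() step is omitted.

```python
# Plain-Python sketch of the lazy pipeline: transformations build new
# lazy datasets; nothing executes until an action pulls data through.

def parallelize(collection):
    # Step 1: wrap an in-driver collection as a lazy dataset.
    return lambda: iter(collection)

def transform_map(dataset, f):
    # Step 2: a transformation defines a new dataset; nothing runs yet.
    return lambda: (f(x) for x in dataset())

def transform_filter(dataset, pred):
    return lambda: (x for x in dataset() if pred(x))

def action_count(dataset):
    # Step 4: an action finally pulls data through the whole pipeline.
    return sum(1 for _ in dataset())

nums = parallelize(range(10))
evens = transform_filter(nums, lambda x: x % 2 == 0)   # lazy
squares = transform_map(evens, lambda x: x * x)        # still lazy
print(action_count(squares))  # 5 -- computation happens only here
```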
Example Transformations
map() intersection() cartesian()
flatMap() distinct() pipe()
filter() groupByKey() coalesce()
mapPartitions() reduceByKey() repartition()
mapPartitionsWithIndex() sortByKey() partitionBy()
sample() join() ...
union() cogroup() ...
Example Actions
reduce() takeOrdered()
collect() saveAsTextFile()
count() saveAsSequenceFile()
first() saveAsObjectFile()
take() countByKey()
takeSample() foreach()
saveToCassandra() ...
PairRDD
Operations for RDDs of tuples (Scala has nice tuple support).
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions
groupByKey
Avoid using it – use reduceByKey instead.
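The reason to prefer reduceByKey: it combines values within each partition before the shuffle, so only one small partial result per key per partition crosses the network. A plain-Python sketch of the two strategies (lists standing in for partitions, no Spark involved):

```python
# groupByKey-style vs. reduceByKey-style aggregation over two
# simulated partitions of (key, value) pairs.
from collections import defaultdict

partitions = [
    [("a", 1), ("b", 1), ("a", 1)],   # partition 0
    [("a", 1), ("b", 1), ("b", 1)],   # partition 1
]

# groupByKey-style: ship every (key, value) pair, then combine.
shuffled = defaultdict(list)
for part in partitions:
    for k, v in part:
        shuffled[k].append(v)          # all 6 records cross the "network"
group_result = {k: sum(vs) for k, vs in shuffled.items()}

# reduceByKey-style: pre-combine per partition, ship partial sums.
partials = []
for part in partitions:
    local = defaultdict(int)
    for k, v in part:
        local[k] += v
    partials.append(dict(local))       # only 2 records per partition
reduce_result = defaultdict(int)
for partial in partials:
    for k, v in partial.items():
        reduce_result[k] += v

print(group_result, dict(reduce_result))  # both {'a': 3, 'b': 3}
```

Both strategies produce the same answer; the difference is how much data moves between machines, which is exactly the communication cost the later slides care about.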
Guide for RDD operations
https://spark.apache.org/docs/latest/programming-guide.html
Browse through this.
Communication Costs
MLlib: Available algorithms
classification: logistic regression, linear SVM, naïve Bayes, least squares, classification tree
regression: generalized linear models (GLMs), regression tree
collaborative filtering: alternating least squares (ALS), non-negative matrix factorization (NMF)
clustering: k-means||
decomposition: SVD, PCA
optimization: stochastic gradient descent, L-BFGS
Optimization
At least two large classes of optimization problems humans can solve:
» Convex
» Spectral
Optimization Example: Gradient Descent
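A minimal NumPy illustration of gradient descent on a convex objective, least squares f(w) = ||Xw − y||² / (2n). The data here is synthetic and made up purely for illustration:

```python
# Gradient descent on least squares: repeatedly step opposite the
# gradient X^T (Xw - y) / n until w approaches the minimizer.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                       # noiseless targets, so w_true is optimal

w = np.zeros(3)
step = 0.1
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(y)   # gradient of the objective
    w -= step * grad

print(np.round(w, 2))  # close to w_true
```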
ML Objectives
Scaling
1) Data size
2) Model size
3) Number of models
Logistic Regression
data = spark.textFile(...).map(readPoint).cache()
w = numpy.random.rand(D)
for i in range(iterations):
    gradient = data.map(lambda p:
        (1 / (1 + exp(-p.y * w.dot(p.x))) - 1) * p.y * p.x
    ).reduce(lambda a, b: a + b)
    w -= gradient
print("Final w: %s" % w)
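The Spark snippet above elides its input source, so it cannot be run as-is. A self-contained local version of the same update, with the RDD map/reduce replaced by NumPy operations over an in-memory array (the synthetic data, dimensions, and step size are made up for illustration):

```python
# Local NumPy version of the logistic-regression loop: each data point
# contributes the term (1/(1+exp(-y w.x)) - 1) * y * x, and the terms
# are summed -- the "map" then "reduce" of the RDD version.
import numpy as np

rng = np.random.default_rng(42)
D, N = 4, 200
w_true = rng.normal(size=D)
X = rng.normal(size=(N, D))
y = np.where(X @ w_true > 0, 1.0, -1.0)   # labels in {-1, +1}

w = np.zeros(D)
for i in range(100):
    margins = y * (X @ w)
    coeffs = (1.0 / (1.0 + np.exp(-margins)) - 1.0) * y
    gradient = X.T @ coeffs        # sum of per-point gradient terms
    w -= gradient / N              # averaged step, for stability
print("Final w: %s" % w)
accuracy = np.mean(np.sign(X @ w) == y)
```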
Separable Updates
Can be generalized for:
» Unconstrained optimization
» Smooth or non-smooth
» L-BFGS, Conjugate Gradient, Accelerated Gradient methods, …
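What makes these updates separable is that the gradient is a sum of per-point terms, so each partition can compute a partial gradient and the partials combine with a reduce. A plain-Python sketch of that structure (the chunking, data, and `chunk_gradient` helper are illustrative stand-ins, not Spark API):

```python
# Compute the logistic gradient chunk-by-chunk (chunks standing in for
# partitions) and combine the partials with a reduce; the result equals
# a single pass over the full dataset.
from functools import reduce
import numpy as np

def chunk_gradient(w, X_chunk, y_chunk):
    # Partial logistic gradient for one chunk of the data.
    margins = y_chunk * (X_chunk @ w)
    coeffs = (1.0 / (1.0 + np.exp(-margins)) - 1.0) * y_chunk
    return X_chunk.T @ coeffs

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = np.where(X @ np.array([1.0, -1.0, 2.0]) > 0, 1.0, -1.0)
w = np.zeros(3)

chunks = [(X[i:i + 25], y[i:i + 25]) for i in range(0, 100, 25)]
partials = [chunk_gradient(w, Xc, yc) for Xc, yc in chunks]
full = reduce(lambda a, b: a + b, partials)   # the "reduce" step
w_next = w - 0.1 * full                       # one descent step
print(w_next)
```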
Logistic Regression Results
[Figure: running time (s) vs. number of iterations (1–30), Hadoop vs. Spark.
Hadoop: 110 s / iteration. Spark: first iteration 80 s, further iterations 1 s.
100 GB of data on 50 m1.xlarge EC2 machines.]
Behavior with Less RAM
[Figure: iteration time (s) vs. % of working set in memory:
0% → 68.8 s, 25% → 58.1 s, 50% → 40.7 s, 75% → 29.7 s, 100% → 11.5 s.]