
Deep Learning Models at Scale with Apache Spark™

Joseph K. Bradley (speaker), Xiangrui Meng
May 31, 2019, ASA SDSS


About me

Joseph Bradley

• Software Engineer at Databricks
• Apache Spark committer & PMC member

About Databricks

TEAM: Started the Spark research project (now Apache Spark) at UC Berkeley in 2009

PRODUCT: Unified Analytics Platform

MISSION: Making Big Data Simple

Try for free today: databricks.com

Apache Spark + AI

Spark Core Engine (with Runtime and Delta)

Big Data Processing: ETL + SQL + Streaming

Machine Learning: MLlib + SparkR

Apache Spark: The First Unified Analytics Engine


AI is re-shaping the world

Disruptive innovations are affecting enterprises across the planet:
• Healthcare and Genomics
• Fraud Prevention
• Internet of Things
• Digital Personalization
• and many more...


Better AI needs more data


When AI goes distributed ...

Larger datasets → more need for distributed training → more open-source offerings: distributed TensorFlow, Horovod, distributed MXNet

This is where Spark and AI meet.


Challenges

Two user stories

As a data scientist,

● I can build a pipeline that fetches training data from a production data warehouse and fits a DL model at scale.

● I can apply my DL model to a distributed data stream and augment the data stream with predicted labels.


Distributed training

[Diagram: data warehouse → load → fit → model]

Required: Read from Databricks Delta, Parquet, MySQL, Hive, etc.

Answer: Apache Spark

Required: distributed GPU cluster for fast training

Answer: Horovod, Distributed TensorFlow, etc.


Two separate data and AI clusters?

[Diagram: load using a Spark cluster → save data → fit on a GPU cluster → model]

Required: glue code


Streaming model inference

[Diagram: Kafka → load → predict with model]

Required:
● save to stream sink
● GPU for fast inference


A hybrid Spark and AI cluster?

Training: load using a Spark cluster w/ GPUs → fit a model in distributed fashion on the same cluster → model

Inference: load using a Spark cluster w/ GPUs → predict w/ GPUs as a Spark task → model


Unfortunately, it doesn’t work out of the box.

Project Hydrogen

Big data for AI

There are many efforts from the Spark community to integrate Spark with AI/ML frameworks:
● (Yahoo) CaffeOnSpark, TensorFlowOnSpark
● (Intel) BigDL
● (John Snow Labs) Spark-NLP
● (Databricks) TensorFrames, Deep Learning Pipelines, spark-sklearn
● ... 80+ ML/AI packages on spark-packages.org

Project Hydrogen to fill the major gaps

Barrier Execution Mode

Accelerator Aware Scheduling

Optimized Data Exchange

In this talk

Within Apache Spark:
● Current status of Project Hydrogen features
● Development of new features

From Databricks:
● Our use of Project Hydrogen features
● Lessons learned and best practices

Story #1:Distributed training

[Diagram: load using a Spark cluster w/ GPUs → fit a model on the same cluster → model]

Project Hydrogen

Barrier Execution Mode

Accelerator Aware Scheduling

Optimized Data Exchange

Different execution models

Spark (MapReduce): tasks are independent of each other.
● Embarrassingly parallel & massively scalable

Distributed training: complete coordination among tasks.
● Optimized for communication

Barrier execution mode

Gang scheduling on top of the MapReduce execution model → a distributed DL job can run as a Spark job.

● Coordination: start all tasks together
● Context: provides info to coordinate across tasks
● Failure handling: cancel and restart all tasks in case of failure

JIRA: SPARK-24374 (Spark 2.4)
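A minimal sketch of a barrier-mode job in PySpark 2.4 (the actual training launch is elided; the partition count is illustrative):

from pyspark import BarrierTaskContext
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def train_partition(iterator):
    context = BarrierTaskContext.get()
    context.barrier()  # block until every task in the stage has started
    # Addresses of all tasks in the barrier stage, usable for rendezvous:
    addresses = [info.address for info in context.getTaskInfos()]
    # ... launch the distributed training process here ...
    yield addresses

# All 4 tasks are gang-scheduled: they start together or not at all.
spark.sparkContext.parallelize(range(4), 4) \
    .barrier() \
    .mapPartitions(train_partition) \
    .collect()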

Barrier mode integration

Horovod (an LF AI hosted project)

Horovod is a distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.
● Little modification to single-node code
● High-performance I/O via MPI and NCCL

Developed at Uber, now an LF AI hosted project at the Linux Foundation.

Hydrogen integration with Horovod

HorovodRunner: in Databricks Runtime 5.0 ML
● Runs Horovod under barrier execution mode.
● Hides cluster setup, scripts, and the MPI command line from users.

from sparkdl import HorovodRunner   # Databricks Runtime ML
import horovod.tensorflow as hvd    # or horovod.torch, etc.

def train_hvd():
    hvd.init()
    ...  # train using Horovod

HorovodRunner(np=2).run(train_hvd)

Implementation of HorovodRunner

● Pickle and broadcast the train() function.
● Launch a Spark job in barrier execution mode.
● In the first executor, use the worker addresses to launch the Horovod MPI job.
● Terminate Horovod if the Spark job gets cancelled.
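A simplified sketch of that flow (not the actual HorovodRunner source; train_entry.py and the mpirun arguments are illustrative placeholders):

import subprocess
from pyspark import BarrierTaskContext

def run_horovod_job(sc, np):
    def task(_):
        context = BarrierTaskContext.get()
        context.barrier()  # wait until all np tasks have started
        addresses = [info.address for info in context.getTaskInfos()]
        if context.partitionId() == 0:
            # The first task launches the MPI job across all workers;
            # train_entry.py would unpickle and run the broadcast function.
            hosts = ",".join(a.split(":")[0] for a in addresses)
            subprocess.run(["mpirun", "-np", str(np), "--host", hosts,
                            "python", "train_entry.py"], check=True)
        context.barrier()  # remaining tasks block until rank 0 finishes
        yield 0

    sc.parallelize(range(np), np).barrier().mapPartitions(task).collect()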

Collaboration on Horovod + Spark

Engineers at Uber and Databricks are improving this integration:
● Merge design and code development into horovod.spark.
● HorovodRunner uses the horovod.spark implementation with extra Databricks-specific features.
● Support barrier execution mode and GPU-aware scheduling.

Stay tuned for future announcements!

Project Hydrogen

Barrier Execution Mode

Optimized Data Exchange

Accelerator Aware Scheduling

Accelerator-aware scheduling

Accelerators (GPUs, FPGAs) are widely used for accelerating specialized workloads like deep learning and signal processing.

Spark is currently unaware of GPUs & other accelerators.

JIRA: SPARK-24615 (ETA: Spark 3.0)

Why does Spark need GPU awareness?

Consider a simple case where one task needs one GPU:

[Diagram: Executor 0 holds GPU:0 and GPU:1 and runs Task 0 and Task 1; Executor 1 holds GPU:0 and GPU:1 and runs Task 2 and Task 3; an incoming Task 4 has no way to tell which GPUs are free.]

Workarounds (a.k.a. hacks)

● Limit Spark task slots per node to 1 (see the sketch after this list).
  ○ The running task can safely claim all GPUs on the node.
  ○ It might lead to resource waste if the workload doesn't need all GPUs.
  ○ The user also needs to write multithreading code to maximize data I/O.

● Let running tasks collaboratively decide which GPUs to use, e.g., via shared locks.
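For the first workaround, a minimal sketch (core and GPU counts are illustrative): setting spark.task.cpus equal to the executor cores leaves exactly one task slot per node.

from pyspark.sql import SparkSession

# Assume each worker has 16 cores and 2 GPUs (illustrative numbers).
# With spark.task.cpus == spark.executor.cores, only one task runs per
# executor, so that task can safely claim every GPU on the node.
spark = SparkSession.builder \
    .config("spark.executor.cores", "16") \
    .config("spark.task.cpus", "16") \
    .getOrCreate()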

User-facing API

Users can retrieve the assigned GPUs from the task context (#24374):

context = TaskContext.get()
assigned_gpu = context.getResources()["gpu"][0]

with tf.device(assigned_gpu):
    ...  # training code
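An equivalent way to pin a task to its assigned GPUs, assuming the proposed getResources() API returns GPU address strings, is to narrow CUDA_VISIBLE_DEVICES before the framework initializes:

import os
from pyspark import TaskContext

context = TaskContext.get()
# Hide all other GPUs from whatever framework runs inside this task.
os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(context.getResources()["gpu"])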

Future features in discussion

● FPGAs and other accelerators
● Resource requests at the task level
● Fine-grained scheduling within one GPU
● Affinity and anti-affinity
● ...

Story #2:Streaming model inference

[Diagram: load using a Spark cluster w/ GPUs → predict w/ GPUs as a Spark task → model]

Project Hydrogen

Barrier Execution Mode

Optimized Data Exchange

Accelerator Aware Scheduling

Optimized data exchange

None of the integrations are possible without exchanging data between Spark and AI frameworks. And performance matters.

JIRA: SPARK-24579

Pandas User-Defined Functions (UDFs)

Pandas UDFs were introduced in Spark 2.3:
• Pandas for vectorized computation
• Apache Arrow for data exchange

Pandas UDF for distributed inference

Pandas UDF makes it simple to apply a model to a data stream.

from pyspark.sql.functions import col, pandas_udf

@pandas_udf(...)
def predict(features):
    ...

spark.readStream.load(...) \
    .withColumn('prediction', predict(col('features')))

Support for complex return types

We improved the scalar Pandas UDF to return StructTypes, e.g., predicted labels and raw scores together.

JIRA: SPARK-23836 (Spark 3.0)

import pandas as pd
from pyspark.sql.functions import pandas_udf

@pandas_udf(...)
def predict(features):
    # ...
    return pd.DataFrame({'labels': labels, 'scores': scores})

Data pipelining

Without pipelining:

  t1: CPU fetches batch #1
  t2: GPU processes batch #1
  t3: CPU fetches batch #2
  t4: GPU processes batch #2
  t5: CPU fetches batch #3
  t6: GPU processes batch #3

With pipelining:

  t1: CPU fetches batch #1
  t2: CPU fetches batch #2, GPU processes batch #1
  t3: CPU fetches batch #3, GPU processes batch #2
  t4: GPU processes batch #3

Pandas UDF prefetch

Goal: improve throughput by prefetching the next Arrow record batch while executing the Pandas UDF on the current batch.
● Up to 2x speedup for workloads with balanced I/O and compute
● Observed 1.5x in real workloads

Enabled by default in Databricks Runtime 5.2.
JIRA: SPARK-27569 (ETA: Spark 3.0)

Per-batch initialization overhead

Loading a model per batch introduces overhead.

New Pandas UDF interface: Load model once. Then iterate over batches.

JIRA: SPARK-26412 (WIP)

@pandas_udf(...)
def predict(batches):
    model = ...  # load the model once
    for batch in batches:
        yield model.predict(batch)
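A minimal end-to-end sketch of this interface, assuming the scalar-iterator UDF type (PandasUDFType.SCALAR_ITER) and a hypothetical load_model() helper:

import pandas as pd
from pyspark.sql.functions import pandas_udf, PandasUDFType

@pandas_udf("double", PandasUDFType.SCALAR_ITER)
def predict(batches):
    model = load_model()      # hypothetical loader; runs once per task
    for features in batches:  # each batch is a pandas.Series
        yield pd.Series(model.predict(features))

df = df.withColumn("prediction", predict(df["features"]))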

Standardize on the Arrow format

Many accelerated computing libraries now support Arrow.
Proposal: expose the Arrow format in a public interface.
● Simplify data exchange.
● Reduce data copy/conversion overhead.
● Allow pluggable vectorization code.

JIRA: SPARK-27396 (pending vote)
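Arrow already underlies Spark's pandas conversion path; a minimal sketch of turning it on in Spark 2.3/2.4 (the proposal above would expose Arrow directly instead of only behind such conversions):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Use Arrow for Spark <-> pandas data exchange (Spark 2.3/2.4 config name).
spark.conf.set("spark.sql.execution.arrow.enabled", "true")

pdf = spark.range(1 << 20).toPandas()  # columnar transfer via Arrow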

Project Hydrogen

Barrier Execution Mode

Accelerator Aware Scheduling

Optimized Data Exchange

Getting started

Deep Learning on Apache Spark
• Doc sources: docs.databricks.com, github.com/horovod/horovod
• Topics: HorovodRunner, horovod.spark, Pandas UDFs

Project Hydrogen
• Apache Spark JIRA & dev mailing list
• spark.apache.org


Acknowledgements

● Many ideas in Project Hydrogen are based on previous community work: TensorFrames, BigDL, Apache Arrow, Pandas UDF, Spark GPU support, MPI, etc.

● We would like to thank the many Spark committers and contributors who helped with the project proposal, design, and implementation.

Acknowledgements

● Xingbo Jiang
● Thomas Graves
● Andy Feng
● Alex Sergeev
● Shane Knapp
● Xiao Li
● Li Jin
● Bryan Cutler
● Takuya Ueshin
● Wenchen Fan
● Jason Lowe
● Hyukjin Kwon
● Madhukar Korupolu
● Robert Evans
● Yinan Li
● Felix Cheung
● Imran Rashid
● Saisai Shao
● Mark Hamstra
● Sean Owen
● Yu Jiang
● ... and many more!

Thank You! Questions?