Dynamic Parameter Allocation in Parameter Servers
Alexander Renz-Wieland¹, Rainer Gemulla², Steffen Zeuch¹,³, Volker Markl¹,³
¹TU Berlin, ²University of Mannheim, ³DFKI
VLDB 2020
Takeaways
- Key challenge in distributed Machine Learning (ML): communication overhead
- Parameter Servers (PSs)
  - Intuitive
  - Limited support for common techniques to reduce overhead
- How to improve support? Dynamic parameter allocation
- Is this support beneficial? Up to two orders of magnitude faster
Background: Distributed Machine Learning
- Distributed training is a necessity for large-scale ML tasks
- Parameter management is a key concern
- Parameter servers (PSs) are widely used
[Figure: Logical view: workers access the parameter server via push() and pull(). Physical view: parameters are partitioned across nodes, each node hosting a server shard and several workers.]
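As a rough illustration of the classic PS interface sketched above, here is a minimal, hypothetical sketch in Python. The class and method names (`ParameterServer`, `pull`, `push`, `worker_step`) and the key/value layout are assumptions for illustration, not the actual PS-Lite or Lapse API.

```python
import numpy as np

# Minimal, hypothetical sketch of the classic parameter-server interface:
# workers pull() current parameter values by key and push() updates back.

class ParameterServer:
    def __init__(self, dim):
        self.store = {}          # key -> parameter vector
        self.dim = dim

    def pull(self, keys):
        # return current values; initialize lazily with zeros
        return [self.store.setdefault(k, np.zeros(self.dim)) for k in keys]

    def push(self, keys, updates):
        # apply additive updates (e.g., negative gradients)
        for k, u in zip(keys, updates):
            self.store[k] = self.store.get(k, np.zeros(self.dim)) + u


def worker_step(ps, keys, grad_fn, lr=0.1):
    values = ps.pull(keys)                   # may require network round trips
    grads = grad_fn(keys, values)            # local computation on a data shard
    ps.push(keys, [-lr * g for g in grads])  # send updates back
```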
Problem: Communication Overhead
- Communication overhead can limit scalability
- Performance can fall behind a single node

Training knowledge graph embeddings (RESCAL, dimension 100):

[Figure: Epoch run time in minutes vs. parallelism (nodes x threads, 1x4 to 8x4). Classic PS (PS-Lite): 4.5h, 4h, 2.4h, 1.5h. Classic PS with fast local access: 1.2h on a single node. Dynamic Allocation PS (Lapse), incl. fast local access: 0.6h, 0.4h, 0.2h on 2x4, 4x4, 8x4.]
How to reduce communication overhead?I Common techniques to reduce overhead:
DATA
PARAMETERS
Data clustering
DATA
PARAMETERS
Parameter blocking
DATA
PARAMETERS
Latency hiding
I Key is to avoid remote accessesI Do PSs support these techniques?
I Techniques require local access at different nodes over timeI But PSs allocate parameters statically
5 / 11
worker 1 worker 2
parameter access
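To make one of these techniques, latency hiding, concrete, here is a small hypothetical sketch that builds on the illustrative ParameterServer above: the worker issues the pull for the next batch's parameters before processing the current one, so communication overlaps with computation. All function and variable names are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_epoch_with_latency_hiding(ps, batches, keys_of, grad_fn, lr=0.1):
    """Overlap parameter pulls with computation (latency hiding).

    `batches` is a list of data batches; `keys_of(batch)` returns the
    parameter keys that batch touches. Names are illustrative.
    """
    if not batches:
        return
    with ThreadPoolExecutor(max_workers=1) as io:
        # prefetch parameters for the first batch
        pending = io.submit(ps.pull, keys_of(batches[0]))
        for i, batch in enumerate(batches):
            values = pending.result()                  # wait for prefetched values
            if i + 1 < len(batches):                   # start the next pull early
                pending = io.submit(ps.pull, keys_of(batches[i + 1]))
            grads = grad_fn(keys_of(batch), values)    # compute while next pull runs
            ps.push(keys_of(batch), [-lr * g for g in grads])
```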
How to reduce communication overhead?I Common techniques to reduce overhead:
DATA
PARAMETERS
Data clustering
DATA
PARAMETERS
Parameter blocking
DATA
PARAMETERS
Latency hiding
I Key is to avoid remote accessesI Do PSs support these techniques?
I Techniques require local access at different nodes over timeI But PSs allocate parameters statically
5 / 11
worker 1 worker 2
parameter access
Dynamic Parameter Allocation
- What if the PS could allocate parameters dynamically?
  - Localize(parameters)
- Would provide support for
  - Data clustering ✓
  - Parameter blocking ✓
  - Latency hiding ✓
- We call this dynamic parameter allocation
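A hypothetical sketch of how a worker might use such a Localize primitive, extending the illustrative ParameterServer above with a notion of parameter location (the actual Lapse API and relocation mechanism differ): relocate the parameters it is about to work on, then access them locally.

```python
# Hypothetical extension of the illustrative ParameterServer: each key has an
# owner node, and localize() relocates keys to the calling node so that
# subsequent pull()/push() calls on those keys are local. Illustration only.

class DynamicAllocationPS(ParameterServer):
    def __init__(self, dim, node_id):
        super().__init__(dim)
        self.node_id = node_id
        self.owner = {}                      # key -> node that currently holds it

    def localize(self, keys):
        for k in keys:
            # in a real system this triggers a relocation protocol;
            # here we only record the new owner
            self.owner[k] = self.node_id

    def is_local(self, key):
        return self.owner.get(key) == self.node_id


def process_partition(ps, partition, grad_fn, lr=0.1):
    """`partition` is a list of (keys, data) pairs assigned to this worker."""
    all_keys = sorted({k for keys, _ in partition for k in keys})
    ps.localize(all_keys)                    # move parameters to this node first
    for keys, data in partition:             # then all accesses are local
        values = ps.pull(keys)
        grads = grad_fn(keys, values, data)
        ps.push(keys, [-lr * g for g in grads])
```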
The Lapse Parameter Server
- Features
  - Dynamic allocation
  - Location transparency
  - Retains sequential consistency
- PS per-key consistency guarantees (for synchronous operations):

                    Classic   Lapse   Stale
  Eventual            ✓         ✓       ✓
  PRAM                ✓         ✓       ✓
  Causal              ✓         ✓       ✗
  Sequential          ✓         ✓       ✗
  Serializability     ✗         ✗       ✗

- Many system challenges (see paper)
  - Manage parameter locations
  - Route parameter accesses to current location
  - Relocate parameters
  - Handle reads and writes during relocations
  - All while maintaining sequential consistency
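The challenges above revolve around tracking where a parameter currently lives. Here is a very rough, hypothetical sketch of the idea (not Lapse's actual protocol): each key has a fixed home node that records the current owner, accesses are routed via that record, and a relocation updates it. Names and the message flow are assumptions for illustration.

```python
# Hypothetical sketch of location management for dynamic parameter allocation.
# Each key has a fixed "home" node (e.g., by hashing) that records which node
# currently owns the parameter; accesses are routed to the current owner, and
# a relocation updates the home node's record. Not Lapse's actual protocol.

NUM_NODES = 4

def home_node(key):
    return hash(key) % NUM_NODES             # fixed, computable by every node

class LocationManager:
    def __init__(self):
        self.current_owner = {}               # maintained by the key's home node

    def lookup(self, key):
        # if the key was never relocated, it still lives at its home node
        return self.current_owner.get(key, home_node(key))

    def relocate(self, key, new_owner):
        # in a real system, reads and writes that arrive during the relocation
        # must be forwarded or queued so that sequential consistency holds
        self.current_owner[key] = new_owner
```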
Experimental study
Tasks: matrix factorization, knowledge graph embeddings, word vectors
Cluster: 1–8 nodes, each with 4 worker threads, 10 Gbit Ethernet

1. Performance of classic PSs: 2–8 nodes barely outperformed 1 node in all tested tasks
2. Effect of dynamic parameter allocation: 4–203x faster than a classic PS, up to linear speed-ups
3. Comparison to bounded staleness PSs: 2–28x faster and more scalable
4. Comparison to manual management: competitive with a specialized low-level implementation
5. Ablation study: combining fast local access and dynamic allocation is key
Comparison to Bounded Staleness PS
- Matrix factorization (matrix with 1b entries, rank 100)
- Parameter blocking

[Figure: Epoch run time in minutes (0–40) vs. parallelism (nodes x threads, 1x4 to 8x4) for the bounded staleness PS (Petuum) with client sync., the bounded staleness PS (Petuum) with server sync., and the Dynamic Allocation PS (Lapse). Annotations: single-node overhead and non-linear scaling for Petuum; relative speedups of 0.6x, 2.9x, and 8.4x.]
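For context on the "parameter blocking" setup used in this comparison, here is a hypothetical sketch of the technique for matrix factorization: item (column) parameters are split into as many blocks as there are nodes, each subepoch assigns a different block to each node in rotation, and with dynamic allocation a node can localize its block before using it. The function names (`parameter_blocking_epoch`, `train_block`, the `ps.localize` call from the earlier sketch) are illustrative, not the actual Lapse or Petuum API.

```python
# Hypothetical sketch of parameter blocking for matrix factorization with
# dynamic parameter allocation. Illustration only.

def parameter_blocking_epoch(ps, node_id, num_nodes, item_blocks, train_block):
    """`item_blocks[b]` is the list of item-parameter keys in block b."""
    for subepoch in range(num_nodes):
        block = (node_id + subepoch) % num_nodes   # rotate blocks across nodes
        ps.localize(item_blocks[block])            # relocate the block before use
        train_block(block)                         # process data touching this block
        # (a barrier between subepochs would go here in a real implementation)
```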
Dynamic Parameter Allocation in Parameter Servers
- Key challenge in distributed Machine Learning (ML): communication overhead
- Parameter Servers (PSs)
  - Intuitive
  - Limited support for common techniques to reduce overhead
- How to improve support? Dynamic parameter allocation
- Is this support beneficial? Up to two orders of magnitude faster
- Lapse is open source: https://github.com/alexrenz/lapse-ps