Data-centric cloudification of scientific applications with many-task computing and map-reduce

Silvina Caíno Lores
Computer Architecture, Communication and Systems Area
Department of Computer Science, University Carlos III of Madrid
February 11, 2016


1 Introduction

2 Data-Centric Transformation Methodology

3 Enabling Large-Scale Parallelism

4 Evaluation
    Application Analysis and Adaptation
    Execution Environments
    Assessing the Cloudified Application
    Scalability Study of the Many-Task Deployment

5 Conclusions

6 The HGS Case Study


Introduction

Context: Supercomputers

High-performance computing (HPC) targets complex computational problems and large amounts of data by aggregating computing resources and developing parallel processing techniques.

Limited by power consumption and hardware architecture.

Tianhe-2 supercomputer


Context: Cloud Computing

Cloud computing relies on resource sharing and virtualization to provide on-demand elastic resources.

Highly scalable alternative to grid/cluster infrastructures.

Abstraction layers as service models.
    Infrastructure-as-a-Service (IaaS): raw computing resources.
    Platform-as-a-Service (PaaS): computing frameworks.
    Software-as-a-Service (SaaS): production-ready applications.
    Anything-as-a-Service (XaaS): databases, networks, security, simulations...

A popular provider example: Amazon Elastic Compute Cloud (EC2).
    Reddit, Twitch, IMDb, NASA, Pinterest...


Motivation

Scientific simulations are widely used to model real-world phenomena.
    Resource intensive.
    I/O and intermediate data volumes keep increasing (Lang et al., 2009).
    Traditionally run on HPC infrastructures (limited by underlying resources).

One simulation is not sufficient.
    Expert systems.
    Several variables and domains.

Cloud device aggregation can make up for lack of HPC scalability.


Objectives

1 Migrate scientific simulators to the Cloud while retaining performance.

2 Minimise impact on the original code.

Benefits:

Increase performance and throughput.

Address larger problems.

Reduce economic and environmental costs.


Trends in Cloudification Techniques

Several migration options:

VM bundling (Srirama et al., 2013; Yu et al., 2011; D’Angelo, 2011).
    Middleware overhead.

Code redesign (Srirama et al., 2012; Ibrahim et al., 2010; Zhang et al., 2015).
    Expensive development.

Our proposal:

Data-centric generalist wrap deployed as a many-task framework (Caíno-Lores et al., 2015; Carretero et al., 2015).


Proposal Overview

1 Rely on map-reduce to induce parallelism.
    Minimise code manipulation.
    Immediate data-awareness.
    Simulation partitioning and distribution (Caíno-Lores et al., 2014; Caíno-Lores et al., 2014).

2 Follow an MTC deployment to overlap experiments.
    Increased granularity and task overlapping.
    Better utilisation and balance (Zhang et al., 2011).
    Suitable for distributed scientific computing (Manuali et al., 2012; Ogasawara et al., 2009).
    Fits parameter-based simulations nicely (Abramson et al., 2011; Dias et al., 2010).

3 Help the user to estimate the cluster size.
    Minimise deadline, cost, or a trade-off.
    Maximise resource usage.


Data-Centric Transformation Methodology

Map-Reduce in a Nutshell

[Figure: map-reduce data flow. Persistent storage, input reading, map, shuffle, reduce, output generation, persistent storage.]

map and reduce run independently and autonomously.
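The phases named on this slide can be illustrated with a minimal single-process sketch in Python. This is not Hadoop code: the names `run_map_reduce`, `mapper` and `reducer`, and the sample readings, are assumptions made for illustration only. Each map call sees a single record and each reduce call sees a single key with its grouped values, which is why a platform is free to run them independently.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Toy map-reduce driver (illustrative, runs in one process)."""
    # Map phase: every record is processed in isolation.
    intermediate = []
    for record in records:
        intermediate.extend(mapper(record))

    # Shuffle phase: group the intermediate values by key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce phase: every key is processed in isolation.
    return {key: reducer(key, values) for key, values in groups.items()}


def mapper(record):
    # Hypothetical record layout: "instant,power".
    instant, power = record.split(",")
    return [(int(instant), float(power))]


def reducer(key, values):
    # Aggregate all values that share the same key.
    return sum(values)


readings = ["0,1.5", "1,2.0", "0,0.5"]
totals = run_map_reduce(readings, mapper, reducer)
print(totals)
```

Because neither `mapper` nor `reducer` touches shared state, the two loops could be distributed across workers without changing the functions themselves.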


Transformation Procedure

[Figure: transformation procedure. Inputs: database, files, simulation parameters. Select the independent variable, T_x. Adaptation job: read input data (map), format input data (reduce), producing intermediate data indexed by T_x. Simulation job: simulation kernel for T_i (map), format output data (reduce), following the defined output format. Outputs: database, files.]

Partition the application: run the same simulation kernel on a fragment of the domain.
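The two-job structure on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `adaptation_job`, `simulation_job` and `toy_kernel` are hypothetical names, and `toy_kernel` merely stands in for the real simulation kernel, which is treated as a black box.

```python
from collections import defaultdict


def adaptation_job(raw_records):
    """Re-index the raw input by the independent variable T_x."""
    indexed = defaultdict(list)
    # Map: emit each payload under its value of the independent variable.
    for t_x, payload in raw_records:
        indexed[t_x].append(payload)
    # Reduce: format every group as one self-contained kernel input.
    return {t_x: sorted(payloads) for t_x, payloads in indexed.items()}


def simulation_job(adapted, kernel):
    """Run the unchanged kernel once per fragment of the domain."""
    # Map: one independent kernel invocation per key T_i.
    results = {t_i: kernel(t_i, inputs) for t_i, inputs in adapted.items()}
    # Reduce: format the output, here simply ordered by key.
    return sorted(results.items())


def toy_kernel(t_i, inputs):
    # Placeholder for the original simulation code (assumption).
    return sum(inputs)


raw = [(1, 4.0), (0, 1.0), (1, 6.0), (0, 2.0)]
output = simulation_job(adaptation_job(raw), toy_kernel)
print(output)
```

The kernel signature is the only contact point with the original code, which reflects the stated goal of minimising changes to it.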


Enabling Large-Scale Parallelism

Many-Task Approach

Better utilisation due to granularity, but depends on platform tuning.


Deployment Scheme

[Figure: deployment scheme. Elements: user, experiment pool (E1 ... Ee), coordinator, experiment partitions (P1 ... Pp), clients, experiment subset, inner jobs (J1 ... Jj), master, slaves, job tasks (T1 ... Tt). Edge labels: provides, distributes, submit, manages, execute. Groupings: management infrastructure, execution infrastructure.]

Exploit map-reduce's multi-tenancy to maximise resource usage.
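A rough sketch of how an experiment pool might be split among clients that submit their jobs concurrently is given below. The round-robin policy, the function names and the string results are all assumptions for illustration; the slide does not specify how the coordinator distributes experiments.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_experiments(experiments, num_clients):
    """Split the pool into one partition per client (round-robin, assumed)."""
    partitions = [[] for _ in range(num_clients)]
    for index, experiment in enumerate(experiments):
        partitions[index % num_clients].append(experiment)
    return partitions


def run_client(partition):
    # Stand-in for a client submitting one inner job per experiment
    # to a shared map-reduce cluster (hypothetical behaviour).
    return [f"{experiment}:done" for experiment in partition]


pool = [f"E{i}" for i in range(1, 8)]
partitions = partition_experiments(pool, 3)

# Clients work at the same time, so jobs from different experiments overlap.
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(run_client, partitions))

print(partitions)
```

Overlapping jobs from several clients is what lets a multi-tenant platform fill slots that a single experiment would leave idle.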


Dimensioning Model

Select the client and slave types to meet the following objectives:

1 Balance both master-worker schemes.

Minimise the difference between the runnable tasks and schedulable tasks.

2 Optimise performance.

Maximise the number of tasks that can be run concurrently.

3 Minimise the virtual cluster’s operational costs.
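To make the three objectives concrete, here is a brute-force sketch that ranks candidate cluster shapes. It is not the authors' dimensioning model: the `NodeType` fields, the capacities, the prices and the lexicographic ordering (balance first, then concurrency, then cost) are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class NodeType:
    name: str
    capacity: int       # tasks a client can schedule, or a slave can run, at once
    hourly_cost: float  # hypothetical price, not a real provider quote


def dimension(client_types, slave_types, max_nodes):
    """Return the best (score, client, n_clients, slave, n_slaves) found."""
    best = None
    counts = range(1, max_nodes + 1)
    for client, slave in product(client_types, slave_types):
        for n_clients, n_slaves in product(counts, repeat=2):
            schedulable = n_clients * client.capacity
            runnable = n_slaves * slave.capacity
            imbalance = abs(runnable - schedulable)   # objective 1
            concurrent = min(runnable, schedulable)   # objective 2
            cost = (n_clients * client.hourly_cost
                    + n_slaves * slave.hourly_cost)   # objective 3
            # Lexicographic ranking: an assumption, not the published model.
            score = (imbalance, -concurrent, cost)
            candidate = (score, client.name, n_clients, slave.name, n_slaves)
            if best is None or candidate < best:
                best = candidate
    return best


clients = [NodeType("small-client", 4, 0.10)]
slaves = [NodeType("big-slave", 2, 0.25), NodeType("huge-slave", 8, 0.90)]
best = dimension(clients, slaves, max_nodes=4)
print(best)
```

With a lexicographic order, cost only breaks ties; a weighted or Pareto-based formulation would trade the objectives off differently.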


Evaluation

Target Application

[Figure: structure of the target application. Inputs: experiment DB, simulation parameters, train movement files. Ontology module: read scenario, read consumers, read parameters. Thread scheduler: workloads W0 ... Wn, one workload per thread. Algorithm module: simulation kernel for instant Tn,i (allocate consumers, solve circuit as an iterative process, write results), starting at Tn,i = Tn,0 and incrementing Tn,i until Tn,i > Tn,f. Data module: merge files. Output: simulation results.]

Memory-bound railway electric power consumption simulator.

Relies on:

1 Description of the railway infrastructure.

2 Instantaneous train position and power demand.


Adaptation

[Figure: adaptation into two map-reduce jobs. MR Job 1 (input adaptation): the input is the infrastructure file plus train files 1 ... I, each holding records of the form "Instant | Parameters..."; the adapted input is input files 1 ... J, holding records of the form "Instant | Parameter List...". MR Job 2 (simulation execution): takes the adapted input and writes output files 1 ... K.]

The temporal variable becomes the independent variable, T_x.

One simulation per instant.
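The input adaptation can be pictured as merging per-train records into one record per instant. The sketch below assumes a hypothetical record layout, `instant | parameters`, and hypothetical train identifiers; the real file formats are not given on the slide.

```python
from collections import defaultdict


def adapt_inputs(train_files):
    """Merge per-train records into one parameter list per instant."""
    by_instant = defaultdict(list)
    for train_id, lines in train_files.items():
        for line in lines:
            # Split "instant | parameters" at the first separator only.
            instant, params = (field.strip() for field in line.split("|", 1))
            by_instant[int(instant)].append(f"{train_id}:{params}")
    # One adapted record per instant, ready for one kernel invocation.
    return [
        f"{instant} | {'; '.join(sorted(entries))}"
        for instant, entries in sorted(by_instant.items())
    ]


train_files = {
    "train1": ["0 | pos=0.0 power=1.2", "1 | pos=0.5 power=1.4"],
    "train2": ["0 | pos=9.0 power=0.8"],
}
adapted = adapt_inputs(train_files)
for record in adapted:
    print(record)
```

Each adapted record carries everything needed for its instant, so the simulation job can process the records in any order.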


Execution Environments

Configuration   Platform       Infrastructure
1               Multi-thread   Cluster node (1)
2               Hadoop 2.2.0   Cluster node
3               Hadoop 2.2.0   EC2

Type        Role        Virtual CPUs   Memory (GB)   Local storage (GB)
m1.medium   master      1              3.75          410
m2.xlarge   slave (2)   2              17.1          420

(1) 48 Xeon E7 cores and 110 GB of RAM.
(2) Five slaves used to match the RAM of the cluster node.


Performance Evaluation

[Figure: Total time. Time (m) for experiments I to IV, for three series: MR, MR/EC2 and Original.]

Includes input data upload (data replication and balancing).

MR on the local node and in the cloud performs remarkably better than the original (68% and 85% less time, respectively).

Platform overhead is significant with small experiments.


Scalability

[Figure: Speed-up over one node. Speed-up versus number of experiments (1, 4, 16, 64), for three series: 4 nodes, 16 nodes and 64 nodes.]

Performance does not scale up linearly with the number of nodes.

Resources become underutilised.

The infrastructure must fit the experiment pool.


Efficiency

[Figure: Efficiency over one node. Efficiency over one node versus number of experiments (1, 4, 16, 64), for three series: 4 nodes, 16 nodes and 64 nodes.]

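Per-node efficiency: normalised speed-up with relation to the number of slaves (Gunarathne et al., 2011).

$$e = \frac{t_0}{n\,t} \tag{1}$$

Here t_0 is taken to be the execution time on one node and t the execution time on n nodes.

System becomes underutilised as nodes are added with the same experiments.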

Better efficiency with more experiments (even superlinear).

We can scale to thousands of experiments!
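As a worked example of the per-node efficiency metric, the sketch below assumes that t0 is the execution time on one node and t the execution time on n nodes. The timings are invented purely to show the arithmetic; they are not measurements from the evaluation.

```python
def speed_up(t0, t):
    """Speed-up over one node: how many times faster the run is."""
    return t0 / t


def efficiency(t0, t, n):
    """Per-node efficiency: speed-up normalised by the number of nodes."""
    return t0 / (n * t)


# Hypothetical timings in minutes (illustrative only).
sublinear = efficiency(t0=120.0, t=40.0, n=4)    # speed-up 3 on 4 nodes
superlinear = efficiency(t0=120.0, t=25.0, n=4)  # speed-up 4.8 on 4 nodes

print(speed_up(120.0, 40.0), sublinear, superlinear)
```

An efficiency above 1 corresponds to the superlinear case mentioned on the slide: the run is more than n times faster on n nodes.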


Conclusions

Highlights

Scientific applications require an increasing amount of computing resources.

Migrating applications to the Cloud could overcome these limitations.

We proposed a methodology able to:
    1 Cloudify production-ready simulators.
    2 Maximise code reuse.
    3 Improve scalability.
    4 Support multidimensional/parametric studies efficiently.
    5 Minimise execution costs.

According to our results with a real application:
    Better performance and scalability.
    High efficiency and resource utilisation under heavy workload.
    Scalability issues remain, depending on the cluster configuration.


Future Work

Improve the adaptation model.

    Multi-key mechanisms (several independent variables).
    More complex base functions (partition, group-by-key...).

Study MTC in federated infrastructures (private and public, hybrid).

Expand the dimensioning model.

    Stage-based.
    Support for spot pricing.
    Mix with brokering systems.

Adopt the in-memory computing perspective.

Other applications (CPU bound, network intensive...).


The HGS Case Study

The HGS Case Study: Background

Compute-intensive MPI scientific application from the hydrogeology domain.

Kernel contained in Ensemble Kalman Filter.

    Iterative in nature, but pleasingly parallel within each step.
    By requirement, kernel is a black box.
    Data can only be modified in files by an external library.


The HGS Case Study: Approach

Current approach:

Distribute realizations, take post-processing as barrier (i.e. iterative MR).

Towards in-memory computing: Spark (advanced MR) + Tachyon (faster I/O).

Potential issues:

Fault-tolerance.

Platform and streaming overhead.

Loss of flexible data-locality.

[Figure: HGS workflow. Initial data; t = T0; pre-processing; input data; realizations 0, 1, ..., r-1; output data; post-processing; test t == Tf (yes: finish; no: iterate again).]
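The "post-processing as barrier" idea can be sketched as an iterative loop in which realizations run in parallel and the next step starts only after all of them have finished. Everything here is a toy assumption: `run_realization` stands in for the black-box kernel, and `post_process` is a simple averaging step, not an Ensemble Kalman Filter update.

```python
from concurrent.futures import ThreadPoolExecutor


def run_realization(state):
    # Stand-in for one black-box realization (assumption).
    return state + 1.0


def post_process(outputs):
    # Barrier step: needs every realization's output before it can run.
    mean = sum(outputs) / len(outputs)
    return [(value + mean) / 2 for value in outputs]


def iterate(initial_states, t0, tf):
    """Run realizations in parallel within each step, steps in sequence."""
    states = list(initial_states)
    with ThreadPoolExecutor() as executor:
        for _ in range(t0, tf):
            # list() waits for all realizations, acting as the barrier.
            outputs = list(executor.map(run_realization, states))
            states = post_process(outputs)
    return states


final_states = iterate([0.0, 2.0, 4.0], t0=0, tf=2)
print(final_states)
```

The same shape maps onto iterative map-reduce: one parallel map over realizations per step, followed by a reduce that cannot start early.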


References I

Abramson, D., Bethwaite, B., Enticott, C., Garic, S., and Peachey, T. (2011). Parameter exploration in science and engineering using many-task computing. Parallel and Distributed Systems, IEEE Transactions on, 22(6):960–973.

Caíno-Lores, S., García, A., García-Carballeira, F., and Carretero, J. (2014). A cloudification methodology for numerical simulations. In Euro-Par 2014: Parallel Processing Workshops - Euro-Par 2014 International Workshops, Porto, Portugal, August 25-26, 2014, Revised Selected Papers, Part II, pages 375–386.

Carretero, J., Caíno-Lores, S., García-Carballeira, F., and García, A. (2015). A multi-objective simulator for optimal power dimensioning on electric railways using cloud computing. In Proceedings of the 5th International Conference on Simulation and Modeling Methodologies, Technologies and Applications, pages 428–438.

Caíno-Lores, S., Fernández, A. G., García-Carballeira, F., and Pérez, J. C. (2015). A cloudification methodology for multidimensional analysis: Implementation and application to a railway power simulator. Simulation Modelling Practice and Theory, 55:46–62.

Caíno-Lores, S., García, A., García-Carballeira, F., and Carretero, J. (2014). Breaking data dependencies in numerical simulations using mapreduce. In XXV Jornadas de Paralelismo.


References II

D’Angelo, G. (2011). Parallel and distributed simulation from many cores to the public cloud. In High Performance Computing and Simulation (HPCS), 2011 International Conference on, pages 14–23.

Dias, J., Ogasawara, E., de Oliveira, D., Pacitti, E., and Mattoso, M. (2010). Improving many-task computing in scientific workflows using p2p techniques. In Many-Task Computing on Grids and Supercomputers (MTAGS), 2010 IEEE Workshop on, pages 1–10.

Gunarathne, T., Wu, T.-L., Choi, J. Y., Bae, S.-H., and Qiu, J. (2011). Cloud computing paradigms for pleasingly parallel biomedical applications. Concurrency and Computation: Practice and Experience, 23(17):2338–2354.

Ibrahim, S., Jin, H., Lu, L., Wu, S., He, B., and Qi, L. (2010). Leen: Locality/fairness-aware key partitioning for mapreduce in the cloud. In Cloud Computing Technology and Science (CloudCom), 2010 IEEE Second International Conference on, pages 17–24.

Lang, S., Carns, P., Latham, R., Ross, R., Harms, K., and Allcock, W. (2009). I/O performance challenges at leadership scale. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC ’09, pages 40:1–40:12.


References III

Manuali, C., Costantini, A., Laganà, A., Cecchi, M., Ghiselli, A., Carpene, M., and Rossi, E. (2012). Efficient workload distribution bridging htc and hpc in scientific computing. In Murgante, B., Gervasi, O., Misra, S., Nedjah, N., Rocha, A., Taniar, D., and Apduhan, B., editors, Computational Science and Its Applications – ICCSA 2012, volume 7333 of Lecture Notes in Computer Science, pages 345–357. Springer Berlin Heidelberg.

Ogasawara, E., de Oliveira, D., Chirigati, F., Barbosa, C. E., Elias, R., Braganholo, V., Coutinho, A., and Mattoso, M. (2009). Exploring many task computing in scientific workflows. In Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers, MTAGS ’09, pages 2:1–2:10, New York, NY, USA. ACM.

Srirama, S., Ivanistsev, V., Jakovits, P., and Willmore, C. (2013). Direct migration of scientific computing experiments to the cloud. In High Performance Computing and Simulation (HPCS), 2013 International Conference on, pages 27–34.

Srirama, S. N., Jakovits, P., and Vainikko, E. (2012). Adapting scientific computing problems to clouds using mapreduce. Future Generation Computer Systems, 28(1):184–192.

Yu, D., Wang, J., Hu, B., Liu, J., Zhang, X., He, K., and Zhang, L.-J. (2011). A practical architecture of cloudification of legacy applications. In Services (SERVICES), 2011 IEEE World Congress on, pages 17–24.


References IV

Zhang, Z., Barbary, K., Nothaft, F. A., Sparks, E., Zahn, O., Franklin, M. J., Patterson, D. A., and Perlmutter, S. (2015). Scientific computing meets big data technology: An astronomy use case. arXiv preprint arXiv:1507.03325.

Zhang, Z., Katz, D. S., Ripeanu, M., Wilde, M., and Foster, I. T. (2011). Ame: An anyscale many-task computing engine. In Proceedings of the 6th Workshop on Workflows in Support of Large-scale Science, WORKS ’11, pages 137–146, New York, NY, USA. ACM.