Environments for eScience on Distributed Infrastructures
Marian Bubak
Department of Computer Science and Cyfronet, AGH University of Science and Technology, Krakow, Poland
http://dice.cyfronet.pl
Informatics Institute, System and Network Engineering, University of Amsterdam
www.science.uva.nl/~gvlam/wsvlam/
Slide 2
Coauthors
Bartosz Balis, Tomasz Bartynski, Eryk Ciepiela, Wlodek Funika, Tomasz Gubala, Daniel Harezlak, Marek Kasztelnik, Maciej Malawski, Jan Meizner, Piotr Nowakowski, Katarzyna Rycerz, Bartosz Wilk, Adam Belloum, Mikolaj Baranowski, Reggie Cushing, Spiros Koulouzis, Michael Gerhards, Jakub Moscicki
dice.cyfronet.pl
www.science.uva.nl/~gvlam/wsvlam
Slide 3
Motivation and main goal
Recent trends:
Enhanced scientific discovery is becoming collaborative and analysis-focused; in-silico experiments are more and more complex
Available compute and data resources are distributed and heterogeneous
Main goal:
Optimal usage of distributed resources (e-infrastructures, ubiquitous) for complex collaborative scientific applications
Slide 4
Collaborative eScience experiments
(1) Problem investigation: Look for relevant problems; Browse available tools; Define the goal; Decompose into steps
(2) Experiment Prototyping: Design experiment workflows; Develop necessary components
(3) Experiment Execution: Execute experiment processes; Control the execution; Collect and analyse data
(4) Results Publication: Annotate data; Publish data; Shared repositories
A. Belloum, M.A. Inda, D. Vasunin, V. Korkhov, Z. Zhao, H. Rauwerda, T. M. Breit, M. Bubak, L.O. Hertzberger: Collaborative e-Science Experiments and Scientific Workflows, Internet Computing, July/August 2011 (Vol. 15, No. 4), pp. 39-47
Slide 5
System under research
www.science.uva.nl/~gvlam/wsvlam/
Cloud
Applications: Stream-oriented applications; Data-parallel applications; Parameter sweep applications
Infrastructure: Desktops; Clusters; Grids; Clouds
Storage: Federated Cloud Storage; HBase
Scaling: Automatic; Task farming for grid jobs and web services; MapReduce
Provenance: Open Provenance Model; XML history; Tracing; Provenance
Workflow
Repository
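Parameter sweep applications and task farming, both named on the slide above, can be illustrated with a minimal sketch. It assumes a local thread pool as a stand-in for grid jobs or web services; the function names `simulate` and `parameter_sweep` and the toy formula are hypothetical and are not part of the system described in the slides.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(alpha, beta):
    """Hypothetical stand-in for a single experiment run."""
    return alpha * alpha + beta


def parameter_sweep(alphas, betas, workers=4):
    """Farm out one task per parameter combination and collect the results."""
    grid = list(product(alphas, betas))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the grid
        results = list(pool.map(lambda params: simulate(*params), grid))
    return dict(zip(grid, results))


sweep = parameter_sweep([1, 2, 3], [0, 10])
best = min(sweep, key=sweep.get)  # parameter pair with the smallest result
```

On a real distributed infrastructure each task would be submitted as a separate job or service call; the pool here only shows the farming pattern, in which independent tasks are handed to whichever worker is free.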
Slide 6
Research objectives
Investigating applicability of distributed computing infrastructures (DCI; clusters, grids, clouds) for complex scientific applications
Optimization of resource allocation for applications on DCI
Resource management for services on heterogeneous resources
Urgent computing scenarios on distributed infrastructures
Billing and accounting models
Procedural and technical aspects of ensuring efficient yet secure data storage, transfer and processing
Methods for component dependency management, composition and deployment
Information representation model for DCI federation platforms, their components and operating procedures