Past, present, and future of HPC in life sciences



Erich Birngruber, Ümit Seren

Gregor Mendel Institute for Molecular Plant Biology (GMI)

AHPC17

Who we are

- Basic research institute in plant sciences

- 9 independent research groups

- Employees 100 + 20 (scientific + admin)

- HPC Operations Team: 2 + 1 (engineer + lead)

Past: Beginnings as traditional HPC

Scientific computing at GMI

- Started in 2010

- SGI ICE-X since 2013 (MENDEL) (72 nodes, 144 today)

- SGI UV2000

- Rich software environment (EasyBuild, Lmod)

- Keeping up with current developments

Machine specs

3 generations of nodes:

- 72x 16c E5-2609, 192 GB mem

- 18x 20c E5-2680, 256 GB mem

- 54x 24c E5-2650, 256 GB mem, 230 GB SSD

UV2000: 96c E5-4617, 2 TB mem

IB FDR interconnect (1 fabric)

Storage: Lustre 300 TB, NetApp >1 PB
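As a rough cross-check of the node list above, the aggregate size of the cluster can be tallied. This is only a sketch: it reads "16c" as cores per node and the memory figure as GB per node, which is an assumption about the slide's shorthand, and it leaves out the UV2000. The names `GENERATIONS` and `cluster_totals` are made up for illustration.

```python
# Node generations as listed on the slide:
# (node count, cores per node, memory per node in GB).
# Reading "16c" as cores per node is an assumption about the shorthand.
GENERATIONS = [
    (72, 16, 192),  # E5-2609
    (18, 20, 256),  # E5-2680
    (54, 24, 256),  # E5-2650
]


def cluster_totals(generations):
    """Return (nodes, cores, memory_gb) summed over all node generations."""
    nodes = sum(n for n, _, _ in generations)
    cores = sum(n * c for n, c, _ in generations)
    memory_gb = sum(n * m for n, _, m in generations)
    return nodes, cores, memory_gb


if __name__ == "__main__":
    nodes, cores, memory_gb = cluster_totals(GENERATIONS)
    print(f"{nodes} nodes, {cores} cores, {memory_gb} GB memory")
```

Under that reading the three generations add up to 144 nodes, which matches the "144 today" figure for MENDEL.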

Past: System architecture

Present: GMI site specifics

- Services: customers are biologists

- On campus initial training

- Consulting and support (w/ ticket system, intranet wiki)

- Software installations

- Provided as modules: different versions, repeatability

- This is getting harder with the demand for more complex software

- Monitoring software usage

Present: Monitoring software usage

- Software in env modules

- 460 software packages in 1297 versions

- Monitoring module usage (load, unload)

- Reporting by user, job, project
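The reporting step could be sketched roughly as follows. This is not the GMI implementation; the event record, its field names, and the sample data are all hypothetical, and the sketch only assumes that each module load or unload is logged together with the user, job, and project it belongs to.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ModuleEvent(NamedTuple):
    """One logged module action. All field names are hypothetical."""
    user: str
    job_id: str
    project: str
    module: str  # "name/version", e.g. "BWA/0.7.15"
    action: str  # "load" or "unload"


REPORT_KEYS = ("user", "job_id", "project")


def loads_by(events: Iterable[ModuleEvent], key: str) -> Counter:
    """Count module loads per (key value, module) pair.

    key selects the reporting dimension: 'user', 'job_id' or 'project'.
    Unload events are ignored for the count.
    """
    if key not in REPORT_KEYS:
        raise ValueError(f"key must be one of {REPORT_KEYS}")
    counts = Counter()
    for ev in events:
        if ev.action == "load":
            counts[(getattr(ev, key), ev.module)] += 1
    return counts


# Made-up sample events, for illustration only.
EVENTS = [
    ModuleEvent("alice", "1001", "projA", "BWA/0.7.15", "load"),
    ModuleEvent("alice", "1001", "projA", "BWA/0.7.15", "unload"),
    ModuleEvent("bob", "1002", "projA", "BWA/0.7.15", "load"),
    ModuleEvent("bob", "1003", "projB", "SAMtools/1.4", "load"),
]

if __name__ == "__main__":
    for (project, module), n in sorted(loads_by(EVENTS, "project").items()):
        print(project, module, n)
```

Counting on the full "name/version" string keeps the per-version detail, which is what makes it possible to see which of the installed versions are still in use.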

Present: Monitoring system activity

Monitoring and metrics

The foundation for all future decisions

- Resource consumption

- Capacity planning

- Software, technology usage

- Auditing

Alerting

Node status

Job resources

Present: Applications & Appliances

Own developments:

Phenobox (in development)

- Web-interface, API

- MySQL (DB)

- DSLR, RaspberryPi

- HPC (computer vision, storage)

GWA-Portal (https://gwas.gmi.oeaw.ac.at)

- Web-interface, API

- Elasticsearch (fulltext search)

- PostgreSQL (DB)

- Docker (Python microservices)

- HPC (analysis, storage)

3rd party software:

Galaxy (https://galaxyproject.org/)

- Web-interface, API

- MySQL (DB)

- Visualization

- HPC (analysis, storage)

PacBio SMRT Link (https://github.com/PacificBiosciences/SMRT-Link)

- Web-interface, API

- MySQL (DB)

- HPC (analysis, storage)

Present: new developments

Deployment of OpenStack (IaaS):

- Cross-vendor open source project

- On-premises cloud

- Provision VMs and containers

- Deploy classic application services

- Enables self-service for customers

Consequences:

- More heterogeneous use-cases

- Customer base is increasing

- Non-human “customers” of HPC

- Services are more complex and distributed over subsystems

Past: System architecture

Present: MENDEL, OpenStack

Present: Problem 1: maintenance

- VMs are difficult to maintain

- Wrong abstraction for the use-case

- What is the next step?

- Containers?

- Container Orchestration Engines?

- Provide Software as a Service (SaaS)?

Fact is: the field is evolving

Present: MENDEL, OpenStack

Future: Problem 2: integration

Applications sit on different islands:

HPC vs. Cloud

Drawbacks:

- Hard to maintain (infra)

- Hard to debug (app)

Vision: converged compute platform.

Unified infrastructure to schedule all types of tasks

New challenges:

- Networking

- Storage

- IDM

- Accounting

What do others do?

Container Orchestration Engine (Google Kubernetes, Docker Swarm, Apache Mesos)

First steps:

- Containers for HPC

- Biocontainers http://biocontainers.pro

- Singularity http://singularity.lbl.gov

- Current status: test deployment
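To give an idea of what running a containerized tool inside an HPC job might look like, here is a minimal sketch that assembles a `singularity exec` command line with a bind mount. The helper name, the image path, the bind paths, and the tool are all hypothetical; the sketch only prints the command, so it does not need Singularity installed and says nothing about how the test deployment is actually set up.

```python
import shlex
from typing import Dict, List, Optional


def singularity_exec_argv(image: str, command: List[str],
                          binds: Optional[Dict[str, str]] = None) -> List[str]:
    """Build argv for: singularity exec [--bind src:dest ...] IMAGE COMMAND...

    binds maps host paths to the paths they should appear under
    inside the container.
    """
    argv = ["singularity", "exec"]
    for src, dest in (binds or {}).items():
        argv += ["--bind", f"{src}:{dest}"]
    argv.append(image)
    argv += command
    return argv


if __name__ == "__main__":
    argv = singularity_exec_argv(
        "/path/to/tool.img",                 # hypothetical image file
        ["samtools", "--version"],           # hypothetical tool in the image
        binds={"/lustre/scratch": "/data"},  # hypothetical host:container paths
    )
    # Print rather than execute, so the sketch runs on any machine.
    print(" ".join(shlex.quote(a) for a in argv))
```

In a batch job, the resulting list could be handed to `subprocess.run(argv, check=True)` in place of a direct call to a module-provided binary.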

Contact / References:

Erich Birngruber <[email protected]>, @ebirn

Ümit Seren <[email protected]>, @timeu_s

GMI on Github:

https://github.com/Gregor-Mendel-Institute

Total recall: holistic metrics for broad systems performance and user experience visibility in a data-intensive computing environment

https://dl.acm.org/citation.cfm?id=2835001

Acknowledgements

Gregor Mendel Institute of Molecular Plant Biology

Dr. Bohr-Gasse 3, 1030 Vienna, Austria
