Page 1: UPPMAX and UPPNEX: Enabling high performance bioinformatics Ola Spjuth, UPPMAX ola.spjuth@farmbio.uu.se.

UPPMAX and UPPNEX: Enabling high performance bioinformatics

Ola Spjuth, UPPMAX

ola.spjuth@farmbio.uu.se

High-performance bioinformatics

• Trivial/embarrassingly parallelizable
  – Mass of individual tasks (or divide up problems), run in parallel
  – E.g. analyze several sequences
• Non-trivial parallelism
  – Single task on many processors (data partitioning)
  – Example: Molecular dynamics
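As a rough illustration of the trivially parallel case, the sketch below scores several sequences independently and in parallel. The GC-content measure, the function names `gc_content` and `analyze_all`, and the `make_executor` hook are assumptions chosen for illustration; none of them come from the slides.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in one sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def analyze_all(sequences, make_executor=ProcessPoolExecutor):
    """Run gc_content on every sequence; no task depends on another.

    make_executor is an illustrative hook: ProcessPoolExecutor uses
    separate processes, ThreadPoolExecutor is a lightweight stand-in.
    """
    with make_executor() as pool:
        # map() preserves input order even though tasks run concurrently.
        return list(pool.map(gc_content, sequences))
```

Calling `analyze_all(["GGCC", "ATAT", "ATGC"], ThreadPoolExecutor)` returns one score per sequence, in input order. Non-trivial parallelism such as molecular dynamics does not decompose this way, because the partitions of a single task must exchange data while they run.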

Resources for high-performance computing (HPC)

• Supercomputers
  – “a computer at the frontline of current processing capacity, particularly speed of calculation”
• Clusters
  – Processors in close proximity
• GRID computing
  – Distributed systems (joined clusters)

UPPMAX

• Uppsala University’s resource for high performance computing (HPC) and related know-how
  – Computational clusters
    • 6000 cores
  – Storage
    • 1.4 PB parallel storage

UPPNEX

• A project at UPPMAX
• 13,152 MSEK from KAW/SNIC (2008-12-30)
• ~1 M CPU core hours/month on a shared cluster (kalkyl)
• ~1 PB cluster-attached parallel storage (bubo)
• Long term storage on SweStore (>1 PB)
• SMP machine, 64 cores, 2 TB RAM (halvan)
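To give the monthly allocation a more tangible scale, the sketch below converts core hours per month into the average number of cores that would be busy around the clock. The 30-day month and the function name `cores_for_allocation` are assumptions made for this example.

```python
HOURS_PER_DAY = 24


def cores_for_allocation(core_hours_per_month: float,
                         days_in_month: int = 30) -> float:
    """Average number of cores kept busy continuously by a monthly allocation."""
    return core_hours_per_month / (HOURS_PER_DAY * days_in_month)


# ~1 M core hours per month, the figure quoted for the share of kalkyl.
print(round(cores_for_allocation(1_000_000)))  # prints 1389
```

Under these assumptions, roughly 1 M core hours per month corresponds to about 1,400 cores running continuously.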

The cluster kalkyl

• 348 nodes with 8 cores each
  – 324 nodes with 24 GB
  – 16 nodes with 48 GB
  – 16 nodes with 72 GB
  – Total: 2784 cores

• SLURM queuing system
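Work on a SLURM-managed cluster is typically described in a batch script and handed to the queue with `sbatch`. The sketch below builds such a script as text. The helper name `make_batch_script`, the project ID, job name and command are hypothetical; the slides do not show the actual settings used on kalkyl, so only generic `sbatch` options are used.

```python
def make_batch_script(account: str, job_name: str, command: str,
                      ntasks: int = 8, hours: int = 1) -> str:
    """Build the text of a SLURM batch script for one job.

    ntasks defaults to 8, matching the 8 cores per kalkyl node.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --account={account}",       # project to charge (hypothetical ID)
        f"#SBATCH --job-name={job_name}",     # label shown in the queue
        f"#SBATCH --ntasks={ntasks}",         # number of tasks requested
        f"#SBATCH --time={hours:02d}:00:00",  # wall-clock limit, HH:MM:SS
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical project ID and command, for illustration only.
    print(make_batch_script("myproject", "align", "./run_alignment.sh", hours=2))
```

The resulting text can be saved to a file and submitted with `sbatch`; site-specific options such as the partition are left out because they are not given here.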

UPPNEX data flow

Knowledge Base / Community website

www.uppnex.uu.se

UPPNEX Application Experts

• Assist with NGS Analysis

• Available via mailing list or by direct contact

Project growth

UPPNEX storage usage

Used CPU core h / month

1 week maintenance stop for move to new computer hall

A typical day at UPPMAX

UPPNEX software used

Conclusions: Community needs (storage)

• Access to high-availability storage

• Access to long term storage

• Sustainable file infrastructure

Conclusions: UPPNEX main challenges

• Support new types of HPC users and usage

• Keep up with the bioinformatics software flood

• Managing data growth (previously only computations)