Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste –...

19
Sebastian Oeste ([email protected]) Holger Brunst ([email protected]) Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH) Practical I/O-Analysis EIUG-Workshop 26.9.2017 Hamburg

Transcript of Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste –...

Page 1: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Sebastian Oeste ([email protected])

Holger Brunst ([email protected])

Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)

Practical I/O-Analysis

EIUG-Workshop 26.9.2017 Hamburg

Page 2: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Agenda

1 Motivation

2 Linux I/O

3 Score-P / Vampir I/O-Analysis

Sebastian Oeste – EIUG-Workshop 2017 2

Page 3: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

HPC I/O State of the Art

Image: www.nics.tennessee.edu

3Sebastian Oeste – EIUG-Workshop 2017

Page 4: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

HPC systems today HPC systems of the future

CPU

Memory NVRAM

Spinning storage disk

Register

Cache

Memory & Storage Latency Gaps

Storage tape

1x

100,000x

10x

10x

10,000x

DRAM

Storage SSD

CPU

Register

Cache

1x

10x

10x

DRAM

Spinning storage disk

Storage disk - MAID

Storage tape

10x

100x

100x

1,000x

10x

socket socket

socket

socket

socket

socket

DIMM

DIMM

DIMM

IO

IO

backup

IO

backupbackup

New memory technology

Changes the memory hierarchy we have

Impact on applications e.g. simulations?

I/O performance is one of thecritical components forscaling applications

New members in the memory hierarchy

4Sebastian Oeste – EIUG-Workshop 2017

Page 5: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

The Linux I/O Stack

Sebastian Oeste – EIUG-Workshop 2017 5

Image: https://www.thomas-krenn.com/en/wiki/Linux_Storage_Stack_Diagram

Page 6: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Tools

Sebastian Oeste – EIUG-Workshop 2017 6

Image: http://www.brendangregg.com/Perf/linux_perf_tools_full.png

Page 7: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Application

System Libraries

System Call Interface

VFS

Block Layer

File System (ext4, btrfs, ...)

userspace

kernelspace

System LibrariesMPI/HDF5/ADIOS

Application & System Libraries: gprof – GNU Profiler ltrace – trace library calls uprobes – dynamic userspace tracepoints

System Call Interface: strace – trace syscalls w. ptrace() sysdig – needs kernel module perf – use Kernel trace events

VFS: lsof – list open files pcstat

File System: perf – as swiss army knife Fs specific tools

Block Layer: iostat iotop blktrace

Sebastian Oeste – EIUG-Workshop 2017 7

Diferent layer, diferent tool!

Page 8: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Score-P & Vampir

8Sebastian Oeste – EIUG-Workshop 2017

Page 9: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Tapping I/O Layers

• I/O layers– Lustre File System

• Client side• Server side

– Kernel – POSIX– MPI-I/O– HDF5– NetCDF– PnetCDF– ADIOS

9Sebastian Oeste – EIUG-Workshop 2017

Page 10: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Individual I/O Operatin

I/O Runtme Cintributin

I/O operations over time

10Sebastian Oeste – EIUG-Workshop 2017

Page 11: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

I/O Data Rate if single thread

I/O data rates over time

11Sebastian Oeste – EIUG-Workshop 2017

Page 12: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

I/O summaries with totals

Other Metrics:• IOPS• I/O Time• I/O Size• I/O Bandwidth

12Sebastian Oeste – EIUG-Workshop 2017

Page 13: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

I/O summaries per fle

Aggregated data fir specifc resiurce

13Sebastian Oeste – EIUG-Workshop 2017

Page 14: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

I/O operations per fle

Ficus in specifc resiurce

Shiw all resiurces

14Sebastian Oeste – EIUG-Workshop 2017

Page 15: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

• Score-P does not record I/O data by default– Score-P wrapper

• see option --io=help • has variants

– Score-P installation• default if I/O libraries are detected correctly

Tell Score-P to record I/O data

15Sebastian Oeste – EIUG-Workshop 2017

Page 16: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

Select I/O layer of interest

• scorep --io=netcdf --io=posix

• --io=– mpi– none– posix– netcdf– netcdf_par– hdf5

16Sebastian Oeste – EIUG-Workshop 2017

Page 17: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

• scorep --io=runtime:netcdf --io=linktime:posix

• runtime:– I/O calls are instrumented during binary loading– reveals even internal I/O in libraries,

• e.g. NetCDF doing POSIX– requires --dynamic link option in scorep

• linktime: (default)– I/O calls are instrumented when linking– reveals direct calls to I/O only

• e.g. your code doing MPI-IO but not the I/O underneath

Optionally set library wrapping method

17Sebastian Oeste – EIUG-Workshop 2017

Page 18: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

• --static– symbols are resolved during compile and link time– user calls to I/O libraries are recorded– internal I/O in libraries not recorded

• if library is not compiled with scorep

• --dynamic– symbols are resolved loading binary into memory– needed for --io=runtime:posix

I/O data recording and static linking

18Sebastian Oeste – EIUG-Workshop 2017

Page 19: Practical I/O-Analysis · 2 Linux I/O 3 Score-P / Vampir I/O-Analysis Sebastian Oeste – EIUG-Workshop 2017 2. HPC I/O State of the Art Image: Sebastian Oeste – EIUG-Workshop 2017

PREP = scorep --dynamic --io=runtime:netcdf --io=runtime:posixCC = $(PREP) gccCFLAGS = -Wall -Wextra

instrumented: foo.c$(PREP) $(CC) $(CFLAGS) -o foo foo.c

#! /bin/bash#SBATCH -nodes=256#SBATCH -ntasks=256#SBATCH ...

export SCOREP_ENABLE_TRACING=trueexport SCOREP_ENABLE_PROFILING=falseexport SCOREP_TOTAL_MEMORY=256MBexport SCOREP_METRIC_RUSAGE=ru_stime,ru_inblock,ru_oublock

srun -n 256 ./your-app

In your batch file:

In your makefile:

How to use Score-P for your application?

19Sebastian Oeste – EIUG-Workshop 2017