PORTABLE CONTAINERS ORCHESTRATION AT SCALE

WITH NEXTFLOW

Paolo Di Tommaso, Seqera Labs

ISC-HPC 2019 - Frankfurt

ENABLING TECHNOLOGY

• code

• orchestration

• dependencies

• deployment

• sharing & reproducibility (Git, GitHub)

WHAT DO YOU MEAN?

# satellite sequences reported by RepeatMasker.
zcat rmsk.txt.gz \
  | grep Satellite \
  | cut -f6,7,8 \
  | sed s,^chr,, \
  | perl -pe 's/^[^\s_]+_([^\s_]+)_random/$1.1/' \
  | tr "gl" "GL" \
  | sort -k1,1N -k2,2n \
  | bgzip > hs37d5.satellite.bed.gz

Credit: Heng Li, https://goo.gl/2nF5NC

process filtering {
  input:
  file 'rmsk.txt.gz' from sequences_ch

  output:
  file 'hs37d5.satellite.bed.gz' into results_ch

  '''
  '''
}

THE NEXTFLOW WAY

Channel.fromPath('data/rmsk.txt.gz') | filtering | publishTo { '/path' }

process filtering {
  input:
  file 'rmsk.txt.gz' from sequences_ch

  output:
  file 'hs37d5.satellite.bed.gz' into results_ch

  '''
  # satellite sequences reported by RepeatMasker.
  zcat rmsk.txt.gz \
    | grep Satellite \
    | cut -f6,7,8 \
    | sed s,^chr,, \
    | perl -pe 's/^[^\s_]+_([^\s_]+)_random/$1.1/' \
    | tr "gl" "GL" \
    | sort -k1,1N -k2,2n \
    | bgzip > hs37d5.satellite.bed.gz
  '''
}

THE NEXTFLOW WAY

Channel.fromPath('data/*.txt.fq') | filtering | publishTo { '/path' }

CONTAINERISATION

• Nextflow envisioned the use of software containers to fix computational reproducibility

• Mar 2014 (ver 0.7), support for Docker

• Dec 2016 (ver 0.23), support for Singularity

[Diagram: Nextflow launching jobs (job, job, job)]
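As a minimal sketch of how container support is switched on, assuming a `nextflow.config` file placed next to the pipeline script and a placeholder image name `user/image` (neither is specified on the slide), the configuration might look like this:

```groovy
// nextflow.config (sketch; 'user/image' is a placeholder image name)

// Container image used for every process in the pipeline
process.container = 'user/image'

// Run each job inside a Docker container
docker.enabled = true

// Where Docker is not available (common on HPC clusters),
// Singularity can be enabled instead of Docker:
// singularity.enabled = true
```

With this in place the pipeline script itself stays unchanged; only the configuration decides which container engine wraps each job.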

PORTABILITY

nextflow run your-script.nf

nextflow run your-script.nf -with-docker your/image

process {
  executor = 'slurm'
  queue = 'my-queue'
  memory = '8 GB'
  cpus = 4
  container = 'user/image'
}

PORTABILITY

process {
  executor = 'awsbatch'
  queue = 'my-queue'
  memory = '8 GB'
  cpus = 4
  container = 'user/image'
}
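The two configurations differ only in the executor, so one way to keep both in a single file is to use configuration profiles. The sketch below assumes illustrative profile names `cluster` and `cloud`, which do not appear in the slides; the queue, memory, CPU, and container values are the ones shown above.

```groovy
// nextflow.config (sketch; profile names 'cluster' and 'cloud' are illustrative)

// Settings shared by every profile
process {
    memory    = '8 GB'
    cpus      = 4
    container = 'user/image'
}

profiles {
    // Submit jobs to a Slurm cluster
    cluster {
        process.executor = 'slurm'
        process.queue    = 'my-queue'
    }
    // Submit jobs to AWS Batch
    cloud {
        process.executor = 'awsbatch'
        process.queue    = 'my-queue'
    }
}
```

The target platform is then picked at launch time, for example with `nextflow run your-script.nf -profile cluster`. A real AWS Batch setup typically needs further settings, such as an S3 work directory, which are left out here.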

PORTABILITY

WHO IS USING NEXTFLOW?

38 members

12+ institutions

20 pipelines

THANK YOU

http://nextflow.io

http://seqera.io