Bioinformatics: an introduction (BMR Genomics) - Lesson of 25 July 2014

Posted on 27-Jun-2015

A review of the shell, running some bioinformatics programs (BWA, samtools), viewing files with IGV, and first steps in Perl.

Bioinformatics Training: Introduction to Perl Programming

Andrea Telatin

Andrea Telatin: Becoming a Bioinformatician

We started with…

1. Most bioinformatics file formats are text files!

2. There are quite a few robust programs

3. Our goal is often to create a pipeline

4. For most pipelines we need some glue

Re-loading…

BASH COMMANDS

BIO TOOLS

PERL

Playing with the BASH

• Example 1:

• Download 3 to 5 PNG images from the web

• Install the program “ImageMagick” from the repository (apt-get…)

Playing with the BASH

• Move to the directory containing the downloaded images

• Type “convert -resize 50% image1.png small1.png”

• How can we automate this to create a smaller version of every image?
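One possible answer is a for loop over the PNG files. The sketch below wraps the convert command from the slide in a loop. The function names resize_all and small_name, and the small_ prefix used for the output files, are choices made here for illustration and are not part of the original exercise.

```shell
# Build the output name for a reduced copy: image1.png -> small_image1.png
small_name() {
    printf 'small_%s\n' "$(basename "$1")"
}

# Resize every PNG in a directory (default: the current one) to 50%.
resize_all() {
    local dir img
    dir=${1:-.}
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue        # the glob matched nothing
        case "$(basename "$img")" in
            small_*) continue ;;         # skip copies made by an earlier run
        esac
        convert -resize 50% "$img" "$dir/$(small_name "$img")"
    done
}

# Usage, from the directory holding the images:
#   resize_all .
```

Skipping files that already start with small_ means the function can be run twice without shrinking the reduced copies again.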

Playing with the BASH

• Remember “man command” to read the manual page of a command

• Never forget about Google

Playing with the BASH

• Now you should create a directory for today’s tasks

• Then download into it (using wget):

• http://www.telatin.com/reads.tar.gz

• http://www.telatin.com/amplicon.tar.gz

• The human chromosome 2 (hg19)
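The directory and download steps can be sketched as a small function. The name fetch_into and the directory name in the usage comment are assumptions made here. The slide does not say where to get chromosome 2 from, so only the two archive URLs it lists appear below.

```shell
# Create a working directory and download each URL into it with wget.
fetch_into() {
    local dir url
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    for url in "$@"; do
        # -P sets the directory where wget saves the file
        wget -P "$dir" "$url" || return 1
    done
}

# Usage (the directory name is an arbitrary choice):
#   fetch_into ~/lesson_today \
#       http://www.telatin.com/reads.tar.gz \
#       http://www.telatin.com/amplicon.tar.gz
```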

Read alignments

• Extract the .tar.gz archives using tar. Check on Google how to do this (tar is a strange program)

• Now we have to install bwa to align reads. We can use the repository again.
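For reference, the usual tar options for a .tar.gz archive are shown below in two small helper functions. The names extract_archive and list_archive are made up for this sketch.

```shell
# Extract a gzip-compressed tar archive into a target directory.
#   -x  extract
#   -z  decompress with gzip
#   -f  read the archive from this file
#   -C  change to this directory before extracting
extract_archive() {
    local archive dest
    archive=$1
    dest=${2:-.}
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# List the contents without extracting anything (-t instead of -x).
list_archive() {
    tar -tzf "$1"
}

# Usage:
#   list_archive reads.tar.gz
#   extract_archive reads.tar.gz .
```

Listing an archive before extracting it is a cheap way to see whether it unpacks into its own directory or straight into the current one.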

Read alignments

• Create an index: bwa index genome.fa

• Align reads: bwa mem genome.fa reads.fastq > output.sam
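The two bwa commands can be combined into one step that checks its input first. This is only a sketch: the function name align_reads is an assumption, and the test for an existing genome.fa.bwt file relies on bwa index writing its index files next to the reference.

```shell
# Index the reference (only if needed), then align the reads against it.
align_reads() {
    local genome reads out
    genome=$1
    reads=$2
    out=$3
    [ -r "$genome" ] || { echo "cannot read reference: $genome" >&2; return 1; }
    [ -r "$reads" ]  || { echo "cannot read reads: $reads" >&2; return 1; }
    # bwa index is slow on a whole chromosome, so skip it when the
    # .bwt index file is already there
    [ -e "$genome.bwt" ] || bwa index "$genome" || return 1
    bwa mem "$genome" "$reads" > "$out"
}

# Usage:
#   align_reads genome.fa reads.fastq output.sam
```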

SAMtools

• Install it via the repository

• SAM to BAM pipeline:

• samtools view -bS file.sam > file.bam

• samtools sort file.bam sorted_file

• samtools index sorted_file.bam
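The three samtools steps above can be chained in one function. Note that the slide uses the older sort syntax, which takes an output prefix. The sketch below assumes samtools 1.3 or later, where the output file is given with -o. The function name sam_to_sorted_bam and the .sorted.bam naming are choices made here.

```shell
# Convert a SAM file into a sorted, indexed BAM.
sam_to_sorted_bam() {
    local sam prefix
    sam=$1
    prefix=${sam%.sam}                        # file.sam -> file
    samtools view -b "$sam" > "$prefix.bam" || return 1
    samtools sort -o "$prefix.sorted.bam" "$prefix.bam" || return 1
    samtools index "$prefix.sorted.bam"       # writes file.sorted.bam.bai
}

# Usage:
#   sam_to_sorted_bam output.sam
```

Each step stops the function if it fails, so a broken SAM file does not leave behind an index for a BAM that was never written correctly.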

IGV

• DON’T download it via the repository. Download it from the internet!

• Unzip it into a directory (e.g. IGV in your home directory)

• Launch it from the terminal: “sh igv.sh”

IGV

• Load the human chromosome 2 as the genome

• Load both the BED and the BAM files as tracks

BED/GFF

VCF

BAM

Programming: an introduction

INPUT (files, parameters)

ELABORATION (steps to transform the input)

OUTPUT (files, text…)

Try thinking about “grep” or “head”
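To make the input, elaboration and output scheme concrete, here is a rough imitation of “head” as a shell function. The name first_lines is invented for this sketch, and the real head has many more options.

```shell
# first_lines N: copy the first N lines of standard input to standard output.
#   INPUT:       the parameter N and the text arriving on standard input
#   ELABORATION: read line by line, counting, and stop after N lines
#   OUTPUT:      the selected lines, printed unchanged
first_lines() {
    local n line count
    n=$1
    count=0
    # the "|| [ -n ... ]" part keeps a last line that lacks a final newline
    while [ "$count" -lt "$n" ] && { IFS= read -r line || [ -n "$line" ]; }; do
        printf '%s\n' "$line"
        count=$((count + 1))
    done
}

# Usage:
#   first_lines 4 < reads.fastq
```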

Imitating the shell programs is a good way to write good programs of our own. The shell programs:

• Have a guide (documentation)

• Behave in standardized ways (once you know these standards, new programs are quick to learn)

• Are robust (they check their input and give errors that help us run them correctly)
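A small example of these three habits, written as a shell function that counts the reads in a FASTQ file. The name count_reads is hypothetical, and the code assumes the common FASTQ layout of exactly four lines per read.

```shell
# count_reads FILE: print the number of reads in a FASTQ file.
count_reads() {
    local lines
    # Documentation: a usage message on request
    if [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
        echo "Usage: count_reads FILE.fastq"
        return 0
    fi
    # Standard behaviour: errors go to standard error, with a non-zero status
    if [ "$#" -ne 1 ]; then
        echo "Usage: count_reads FILE.fastq" >&2
        return 2
    fi
    # Robustness: check the input before working on it
    if [ ! -r "$1" ]; then
        echo "count_reads: cannot read '$1'" >&2
        return 1
    fi
    lines=$(wc -l < "$1")
    if [ $((lines % 4)) -ne 0 ]; then
        echo "count_reads: '$1' does not have a multiple of 4 lines" >&2
        return 1
    fi
    echo $((lines / 4))
}

# Usage:
#   count_reads reads.fastq
```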

Programming means being able to break our goal down into steps that a computer can carry out.

Programming: a live introduction

http://www.codepad.org/

For small “experiments” we can use an online system that interprets Perl code