Short read alignment BNFO 601
Date posted: 22-Dec-2015

Short read alignment

• Input:
  – Reads: short DNA sequences, usually up to 100 base pairs (bp), produced by a sequencing machine
    • Reads are fragments of a longer DNA sequence present in the sample given as input to the machine
    • Reads usually number in the millions
  – Genome sequence: a reference DNA sequence much longer than the read length

Short read alignment

• Applications:
  – Genome assembly
  – RNA splicing studies
  – Gene expression studies
  – Discovery of new genes
  – Discovery of cancer-causing mutations

Short read alignment

• Two approaches:
  – Hashing-based algorithms
    • BFAST
    • SHRIMP
    • MAQ
    • STAMPY (statistical alignment)
  – Burrows-Wheeler transform
    • Bowtie
    • BWA
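To make the second approach concrete, here is a minimal sketch of exact read matching with a Burrows-Wheeler transform and backward search. It is a toy illustration only: the function names `bwt_index` and `find_exact` are made up for this example, the suffix array is built by naive sorting, and real tools such as Bowtie and BWA use compressed indexes and also handle mismatches and gaps, none of which is shown here.

```python
def bwt_index(text):
    """Build a naive BWT index for `text` (hypothetical helper, toy scale only)."""
    text += "$"  # terminator; sorts before A, C, G, T
    # Suffix array by direct sorting: fine for a slide-sized example,
    # far too slow and memory-hungry for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = character preceding each sorted suffix (wraps around to '$').
    bwt = "".join(text[i - 1] for i in sa)
    # first[c] = number of characters in text that are strictly smaller than c.
    first, total = {}, 0
    for c in sorted(set(text)):
        first[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in first}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, first, occ


def find_exact(read, index):
    """Return sorted start positions where `read` occurs exactly in the text."""
    sa, first, occ = index
    lo, hi = 0, len(sa)  # current suffix-array interval [lo, hi)
    for c in reversed(read):  # backward search: extend the match leftwards
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = bwt_index("ACGTACGTTACG")
print(find_exact("ACG", index))
```

Each character of the read narrows the suffix-array interval, so the search cost depends on the read length rather than on the genome length.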

BFAST overview

PLoS ONE 4(11): e7767.

BFAST algorithm

PLoS ONE 4(11): e7767.

BFAST masked keys
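The slide figure is not in this transcript, so the following is only a sketch of the general idea of masked (spaced) keys in a hashing-based aligner. It assumes a mask is a string of 1s and 0s in which 1 keeps a base and 0 ignores it. The mask `"11011"`, the function names, and the toy genome are all hypothetical, and this is not BFAST's actual index layout.

```python
from collections import defaultdict


def masked_key(window, mask):
    """Keep only the bases at positions where the mask has a '1'."""
    return "".join(base for base, m in zip(window, mask) if m == "1")


def build_index(genome, mask):
    """Hash every genome window of len(mask) by its masked key."""
    index = defaultdict(list)
    for pos in range(len(genome) - len(mask) + 1):
        index[masked_key(genome[pos:pos + len(mask)], mask)].append(pos)
    return index


def candidate_positions(read, index, mask):
    """Look up each read window and return candidate read start positions."""
    hits = set()
    for offset in range(len(read) - len(mask) + 1):
        key = masked_key(read[offset:offset + len(mask)], mask)
        for pos in index.get(key, []):
            if pos - offset >= 0:  # drop candidates hanging off the genome start
                hits.add(pos - offset)
    return sorted(hits)


genome = "ACGTTGCAAGTCCA"
read = "TTACA"  # genome[3:8] is "TTGCA"; the middle base is mutated

spaced = build_index(genome, "11011")
print(candidate_positions(read, spaced, "11011"))

contiguous = build_index(genome, "11111")
print(candidate_positions(read, contiguous, "11111"))
```

Because the mutated base falls on a masked-out position, the spaced key still finds the true location, while the contiguous key of the same width does not. Candidates found this way would still need a full alignment step to be confirmed.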

Short read alignment

Empirical performance:
• Simulated data:
  – Extract random substrings of fixed length with random mutations and gaps
  – Realign back to the reference genome
• Real data:
  – Paired reads: two ends of the same molecule
  – Count the number of paired reads within 500 to 10000 bases of each other
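The two evaluation schemes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark code behind the slides: `simulate_read` applies substitutions only (gaps are left out to keep it short), `count_consistent_pairs` assumes each aligned end is given as a `(chromosome, position)` tuple, and both names are invented for this example.

```python
import random


def simulate_read(genome, length, sub_rate, rng):
    """Sample a fixed-length substring and mutate it; returns (true_start, read)."""
    start = rng.randrange(len(genome) - length + 1)
    bases = list(genome[start:start + length])
    for i, base in enumerate(bases):
        if rng.random() < sub_rate:
            # Substitute with a different base; insertions/deletions are omitted.
            bases[i] = rng.choice([b for b in "ACGT" if b != base])
    return start, "".join(bases)


def count_consistent_pairs(pairs, min_dist=500, max_dist=10000):
    """Count pairs whose two ends map to the same chromosome within the range."""
    count = 0
    for (chrom1, pos1), (chrom2, pos2) in pairs:
        if chrom1 == chrom2 and min_dist <= abs(pos1 - pos2) <= max_dist:
            count += 1
    return count


rng = random.Random(0)  # fixed seed so the simulation is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(1000))
true_start, read = simulate_read(genome, length=50, sub_rate=0.02, rng=rng)
# An aligner is scored on whether it maps `read` back to `true_start`.

aligned_pairs = [
    (("chr1", 100), ("chr1", 900)),    # 800 bases apart: counted
    (("chr1", 100), ("chr1", 200)),    # too close: not counted
    (("chr1", 100), ("chr2", 900)),    # different chromosomes: not counted
    (("chr1", 0), ("chr1", 20000)),    # too far apart: not counted
]
print(count_consistent_pairs(aligned_pairs))
```

With simulated data the true origin of every read is known, so accuracy can be measured directly. With real paired reads there is no ground truth, so the fraction of pairs landing at a plausible distance serves as a proxy.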

Short read alignment

Courtesy of Genome Res. June 2011 21: 936-939.

Short read alignment

Courtesy of Genome Res. June 2011 21: 936-939.

Short read alignment