Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie...

Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection. Christine Lee. Dr. Cecilie Boysen, Ph.D., Paracel, Applied High Performance Computing. Southern California Bioinformatics Institute, Summer 2004. Funded by the National Science Foundation and National...

Transcript of Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie...

Page 1: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection

Christine Lee

Dr. Cecilie Boysen, Ph.D.
Paracel, Applied High Performance Computing

Southern California Bioinformatics Institute

Summer 2004
Funded by the National Science Foundation and National Institutes of Health

Page 2: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Outline

• History of RNAi
• Small interfering RNA (siRNA) Mechanism
• siRNA design and selection
• Blast vs Smith-Waterman
• Project Objectives and Results
• Conclusions & Future Work

Page 3: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

History of RNAi

• Discovered in 1998 by Andrew Fire, Craig Mello, and colleagues
• RNAi: silencing of gene expression by dsRNA molecules
• Organism used: Caenorhabditis elegans

Page 4: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Short interfering RNA (siRNA) Mechanism

http://www.bioteach.ubc.ca/MolecularBiology/AntisenseRNA/siRNA.gif

Page 5: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

siRNA Selection & Design: Avoiding Cross-Hybridization

• Important to guard against strong cross-hybridization to other genes
• Cross-hybridization with non-specific targets results in wasted lab time and materials, as well as inaccurate conclusions
• Preliminary sequence analysis allows verification of candidate oligos to protect against cross-hybridization

Page 6: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

siRNA Selection & Design

Hybridization concerns:

• siRNA mismatch tolerance
• Insertion/deletion vs mismatch

Query: 1    GAACTTATCTTCCTTCTTC 19
            |||||||||||||||||||
Sbjct: 3783 GAACTTATCTTCCTTCTTC 3801

Query: 19  GAAGAAGGAAGATAAGTTC 1
           ||||||||| || ||||||
Sbjct: 778 GAAGAAGGATGAGAAGTTC 796

Page 7: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Blast vs Smith-Waterman

• Blast may potentially miss relevant alignments
• Using word size seven, nearly 6% of all possible alignments with three mismatches between 21-mers will be missed
• Increasing the word size or allowing more mismatches contributes to a higher rate of missed hits
• Smith-Waterman is said to have higher sensitivity, so why not use it?
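The "nearly 6%" figure can be checked by direct counting. The following is a minimal sketch, assuming that Blast reports a hit only when at least one exact seed of `word_size` consecutive matching bases exists, that the alignment is ungapped, and that every placement of the mismatches is equally likely. The function name `count_missed` is illustrative and not part of any Blast tool.

```python
from itertools import combinations


def count_missed(oligo_len=21, mismatches=3, word_size=7):
    """Count mismatch placements that leave no exact run of word_size bases.

    Returns (missed, total), where total is the number of ways to place
    the mismatches in an ungapped alignment of length oligo_len.
    """
    missed = 0
    total = 0
    for positions in combinations(range(oligo_len), mismatches):
        total += 1
        longest_run = 0
        previous = -1
        # Walk the mismatch positions plus a sentinel just past the end,
        # measuring the run of matching bases before each one.
        for pos in positions + (oligo_len,):
            longest_run = max(longest_run, pos - previous - 1)
            previous = pos
        # Assumption: without a run of word_size matches, no seed is found.
        if longest_run < word_size:
            missed += 1
    return missed, total


if __name__ == "__main__":
    missed, total = count_missed()
    print(f"{missed}/{total} = {missed / total:.1%}")  # 84/1330 = 6.3%
```

Under the same assumptions, calling `count_missed(word_size=8)` or `count_missed(mismatches=4)` gives a larger missed fraction, which agrees with the point that a bigger word size or more mismatches raises the rate of missed hits.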

Page 8: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.
Page 9: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Project Objectives

• Test set: 10,000 19-mer oligos/siRNAs
• Test database: RefSeq
• Comparison study between Blast and Smith-Waterman
• 15/19 -> Percent Identity threshold set to 78% … e-value adjustment from default of 10. E-value 500 used
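The 15/19 cutoff translates into the 78% threshold by simple arithmetic. A small sketch follows, with hypothetical helper names, assuming percent identity is computed as identical positions divided by aligned length:

```python
def percent_identity(matches, aligned_length):
    """Percent of aligned positions that are identical."""
    return 100.0 * matches / aligned_length


def passes_threshold(matches, aligned_length, threshold=78.0):
    """True if the hit meets the percent identity threshold."""
    return percent_identity(matches, aligned_length) >= threshold


if __name__ == "__main__":
    # 15 of 19 identical bases is just above the 78% cutoff.
    print(round(percent_identity(15, 19), 1), passes_threshold(15, 19))
    # 14 of 19 falls below it.
    print(round(percent_identity(14, 19), 1), passes_threshold(14, 19))
```

With these definitions a 19-mer hit needs at least 15 identical positions to be kept, since 14/19 is about 73.7%.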

Page 10: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

A Closer Look at Smith-Waterman & Blast Parameters

Columns: Algorithm, Alignment, Score/(ID), Param, Match, Mismatch, GO/GE, Gap Total

Smith-Waterman
Query: 19   TCACCGTAGATGCTCTTTC 1
            || |||| |||||||||||
Sbjct: 2376 TC-CCGTGGATGCTCTTTC 2393
Score/(ID): 29, 17/19 (89%)
Param: default
Match: +2; Mismatch: -2; Gap Total: -3

Blast
Query: 1    gaaagagcatctacgg 16
            ||||||||||| ||||
Sbjct: 2393 gaaagagcatccacgg 2378
Score/(ID): 12, 15/16 (93%)
Param: W 7, e 500, Default
Match: +1; Mismatch: -3; GO/GE: G -5, E -2; Gap Total: -7

Blast
Query: 1    gaaagagcatctacggtga 19
            ||||||||||| |||| ||
Sbjct: 2393 gaaagagcatccacgg-ga 2376
Score/(ID): 29, 17/19 (89%)
Param: W7, e 500, G 1, q 2 r 2
Match: +2; Mismatch: -2; GO/GE: G -1, E -2; Gap Total: -3
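The scores in this comparison follow directly from the match, mismatch, and gap values. The sketch below first recomputes each score from the counts in the alignment, then runs a bare-bones Smith-Waterman scoring pass on the sequences as displayed. It assumes a linear gap cost of -3 per gapped position (the slide gives only the gap total), and both function names are illustrative; this is not the SWN implementation used in the study.

```python
def score_from_counts(matches, mismatches, gap_total, match=2, mismatch=-2):
    """Alignment score from match/mismatch counts plus the total gap cost."""
    return matches * match + mismatches * mismatch + gap_total


def smith_waterman_score(a, b, match=2, mismatch=-2, gap=-3):
    """Best local alignment score with a linear per-position gap penalty."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                       # start a new local alignment
                previous[j - 1] + step,  # align a[i-1] with b[j-1]
                previous[j] + gap,       # gap in b
                current[j - 1] + gap,    # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # Smith-Waterman row: 17 matches, 1 mismatch, gap total -3.
    print(score_from_counts(17, 1, -3))                       # 29
    # Blast default row: 15 matches, 1 mismatch, no gap, +1/-3 scoring.
    print(score_from_counts(15, 1, 0, match=1, mismatch=-3))  # 12
    # Same pair scored by dynamic programming (subject written without its gap).
    print(smith_waterman_score("TCACCGTAGATGCTCTTTC", "TCCCGTGGATGCTCTTTC"))  # 29
```

The default Blast scoring stops at the 16-base ungapped stretch because opening a gap costs -7, more than the two extra matches are worth at +1 each. With the adjusted parameters a one-base gap costs -3 and each match is worth +2, so extending through the gap pays off and the alignment matches the Smith-Waterman result.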

Page 11: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Smith-Waterman vs. Blast Results

Percent Identity: 89%, GATA3 gene

>gi|4503928|ref|NM_002051.1| Homo sapiens GATA binding protein 3

(GATA3), mRNA

Length = 2365

Score = 31.7 bits (38), Expect = 0.041

Identities = 19/19 (100%)

Strand = Plus / Plus

Query: 1   ctttttaacatcgacggtc 19
           |||||||||||||||||||
Sbjct: 299 ctttttaacatcgacggtc 317

SWN hit-4 bin Blast hit-1 bin W7 G1 r2 q2 e500 E2

Original Query Sequence: CTTTTTAACATCGACGGTC

Page 12: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Smith-Waterman vs. Blast Results

>gi|4503928|ref|NM_002051.1| Homo sapiens GATA binding protein 3

(GATA3), mRNA

Length = 2365

Score = 31.7 bits (38), Expect = 0.041

Identities = 19/19 (100%)

Strand = Plus / Plus

Query: 1   ctttttaacatcgacggtc 19
           |||||||||||||||||||
Sbjct: 299 ctttttaacatcgacggtc 317

>gi|4557424|ref|NM_001248.1| Homo sapiens ectonucleoside triphosphate

diphosphohydrolase 3 (ENTPD3), mRNA

Length = 2797

Score = 24.6 bits (29), Expect = 5.7

Identities = 17/19 (89%), Gaps = 1/19 (5%)

Strand = Plus / Minus

Query: 1    aa-aatactgagagaggga 18
            || ||||||||| ||||||
Sbjct: 2044 aagaatactgagggaggga 2026

>gi|4503928|ref|NM_002051.1| Homo sapiens GATA binding protein 3

(GATA3), mRNA

Length = 2365

Score = 31.7 bits (38), Expect = 0.041

Identities = 19/19 (100%)

Strand = Plus / Plus

Query: 1   ctttttaacatcgacggtc 19
           |||||||||||||||||||
Sbjct: 299 ctttttaacatcgacggtc 317

SWN hit-1 bin Blast hit-4 bin

Original Query Sequence: AAAATACTGAGAGAGGGAG

Page 13: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Conclusions and Future Work

• Produce more conclusive statistics for occurrences of more accurate Smith-Waterman results
• No consensus exists as to which hits are considered dangerous or significant for cross-hybridization
• Creation of a position-specific matrix
  • Mutation tolerance on the 5’ end
  • Low tolerance on the 3’ end
  • GU wobble

Page 14: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

References

• Novina, C and Sharp, P. The RNAi revolution. Nature. 2004 Jul 8;430(6996):161-4.
• Dorsett, Y and Tuschl, T. siRNAs: applications in functional genomics and potential as therapeutics. Nat Rev Drug Discov. 2004 Apr;3(4):318-29.
• Snove, O Jr. and Holen, T. Many commonly used siRNAs risk off-target activity. Biochem Biophys Res Commun. 2004 Jun 18;319(1):256-63.
• Paroo, Z and Corey, DR. Challenges for RNAi in vivo. Trends Biotechnol. 2004 Aug;22(8):390-4.
• Amarzguioui, M. et al. Tolerance for mutations and chemical modifications in siRNA. Nucl Acids Research. 2003;31(2):589-595.

Page 15: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Acknowledgements

• Dr. Cecilie Boysen (advisor), Paracel Scientific Staff
• David Meyer, Paracel Software Engineer
• Stephanie Pao, Paracel Technical Sales Engineer
• Frances Tong, Paracel Intern
• William White, Paracel Technical Writer
• Southern California Bioinformatics Institute 2004 Faculty and Staff: Dr. Jamil Momand, Dr. Nancy Warter-Perez, Dr. Sandra Sharp, Dr. Wendie Johnston, and Jackie Leung
• Fellow interns
• NIH & NSF

Page 16: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Short interfering RNA Mechanism

Post-transcriptional gene silencing.

Novina, C and Sharp, P. The RNAi revolution. Nature Vol 430. July 8, 2004.

Page 17: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Dorsett, Y and Tuschl, T. siRNAs: applications in functional genomics and potential as therapeutics. Nat Rev Drug Discov. 2004 Apr;3(4):318-29.

•Reverse genetic approaches – expensive and time consuming

•siRNA may be chemically synthesized or expressed from DNA vectors

Page 18: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

MicroRNAs

• Short RNAs 19-25 nucleotides
• Abundant, single stranded RNAs encoded in genomes of most multicellular organisms: from few thousand to 40,000 molecules per cell
• Some evolutionarily conserved and developmentally regulated

Translational silencing.

Picture from: Novina, C and Sharp, P. The RNAi revolution. Nature Vol 430. July 8, 2004.

Page 19: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Differences between siRNA and miRNA

siRNA

• Promote the cleavage or degradation of mRNAs
• Sense strand has “exactly the same sequence as the target strand”
• Target genes or genetic elements from which they originated

miRNA

• Regulate the expression of mRNAs; transcription is not impeded and mRNAs not destroyed
• Imperfect base-pairing between mRNA targets and miRNA
• Regulate separate genes

Page 20: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Interchangeability of siRNAs and miRNAs

miRNA may act like siRNA

* perfect or near-perfect complementarity to cellular mRNAs

Could siRNA also work like miRNA?

* synthetic siRNA partially complementary to ‘reporter’ gene inhibited its expression

Distinction between single site with almost exact complementarity and numerous partially complementary binding sites

Page 21: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Laboratory and Clinical Applications of siRNA

• In C. elegans, simple experiment: inject dsRNA, soak in dsRNA solution, or feed with bacteria expressing dsRNA
• In worms, screening for obesity and ageing
• In fruitflies, purified long dsRNA used to identify roles of genes in cholesterol metabolism and heart formation
• Therapeutic potential of siRNAs for humans

Page 22: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

File                        Type   Bases  Sequences  # of Oligos
BRCA1                       fasta   3243          1         3255
GATA3                       fasta   3070          1         3070
HLA-molecule                fasta   2918          1         2918
Insulin-like-growth-factor  fasta   4989          1         4971
Interleukin-receptor        fasta   1451          1         1433
NFKB1                       fasta   4104          1         4186
Serine kinase               fasta   3506          1         3488
Serotonin-receptor          fasta   1927          1         1909
TNF2                        fasta   1669          1         1651
Vinculin                    fasta   5647          1         5629
Total                              32554         10
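Six rows of this table (for example 4989 bases giving 4971 oligos) are consistent with tiling every overlapping 19-mer along the sequence, which yields length - 19 + 1 oligos. The sketch below shows that tiling; it is an assumption that the test oligos were generated this way, and the function names are hypothetical.

```python
def count_oligos(sequence_length, oligo_length=19):
    """Number of overlapping windows of oligo_length in a sequence."""
    return max(sequence_length - oligo_length + 1, 0)


def tile_oligos(sequence, oligo_length=19):
    """Return every overlapping oligo of the given length, in order."""
    return [
        sequence[start:start + oligo_length]
        for start in range(len(sequence) - oligo_length + 1)
    ]


if __name__ == "__main__":
    print(count_oligos(4989))  # 4971, as in the Insulin-like-growth-factor row
    print(count_oligos(5647))  # 5629, as in the Vinculin row
    # A made-up 25-base sequence yields 7 overlapping 19-mers.
    print(len(tile_oligos("ACGTACGTACGTACGTACGTACGTA")))
```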

Page 23: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Paroo, Z and Corey, DR. Challenges for RNAi in vivo. Trends Biotechnol. 2004 Aug;22(8):390-4.

Page 24: Smith-Waterman vs Blast in siRNA Oligonucleotide Design and Selection Christine Lee Dr. Cecilie Boysen, Ph.D. Paracel, Applied High Performance Computing.

Blast vs Smith-Waterman Speed Test Results

[Figure: bar chart of run time in minutes (axis 0 to 350) for SWN and Blast under the Default and e500 settings; data labels 11.35, 205.69, 46.7, 346.08]