Real-Time Primer Design for DNA Chips

Real-Time Primer Design for DNA Chips Annie Hui CMSC 838 Presentation


Page 1: Real-Time Primer Design for DNA Chips

Annie Hui
CMSC 838 Presentation

Page 2: Real-Time Primer Design for DNA Chips

CMSC 838T – Presentation

Use of primers in PCR and Microarrays

PCR (polymerase chain reaction): to amplify a particular DNA fragment.
Use: to test for the presence of nucleotide sequences.

Test of PCR products (gel figure):
• Ladder: a mixture of fragments of known length.
• Lane 1: the PCR fragment is ~1850 bases long.
• Lanes 2 and 4: the fragments are ~800 bases long.
• Lane 3: no product is formed, so the PCR failed.
• Lane 5: multiple bands are formed because one of the primers fits at different places.

Page 3: Real-Time Primer Design for DNA Chips

Use of primers in PCR and Microarrays

DNA chips (microarrays): to analyse a large number of genes in parallel.

Primers:
• 20 to 100 bases long
• Synthetically manufactured

Automated design of primers:
• A computational approach
• Objective: to find primers that bind well without self-hybridizing
• Critique: how accurate?

(Figure labels: fixed on chip; fluorescence; bound to primer)

Page 4: Real-Time Primer Design for DNA Chips


Motivation:

This group uses the automated NucliSens extraction system (bioMerieux) to develop their primers here.

Page 5: Real-Time Primer Design for DNA Chips

Technique: The computational model

1. Select primers from the target sequence:
• two primers, P (forward) and Q (reverse), for PCR
• one primer for a DNA chip (microarray)

Using window size W, the number of possible primers with length between m and n within one window is:

$$S = \sum_{l=m}^{n} (W - l + 1)$$
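The count above can be checked with a minimal Python sketch. It assumes that candidate primers are the contiguous substrings of the window; the function names `count_primers` and `enumerate_primers` are illustrative and do not come from the presentation.

```python
def count_primers(window_size, min_len, max_len):
    """Number of candidate primers: sum over l = m..n of (W - l + 1)."""
    return sum(
        window_size - length + 1
        for length in range(min_len, max_len + 1)
        if length <= window_size  # lengths longer than the window contribute nothing
    )


def enumerate_primers(window, min_len, max_len):
    """List every contiguous substring of `window` with length in [min_len, max_len]."""
    primers = []
    for length in range(min_len, min(max_len, len(window)) + 1):
        for start in range(len(window) - length + 1):
            primers.append(window[start:start + length])
    return primers


if __name__ == "__main__":
    window = "AGCGATATA"  # example window that appears on a later slide
    print(count_primers(len(window), 5, 6))      # value from the formula
    print(len(enumerate_primers(window, 5, 6)))  # brute-force cross-check
```

Both calls should agree for any window, which is a quick sanity check on the closed-form count.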

Page 6: Real-Time Primer Design for DNA Chips

Technique: The computational model

2. For each primer pair, or single primer, quantify 4 hybridization conditions:
a. Primer length
b. Melting temperature
c. GC content
d. Secondary structure
   i. Self annealing
   ii. Self end annealing
   iii. Pair annealing
   iv. Pair end annealing

We are starting here

Page 7: Real-Time Primer Design for DNA Chips

Technique: quantifying hybridization conditions

a. Primer length, len(P)
Affects melting temperature and hybridization.

b. Melting temperature, Tm(P)
Temperature at which the bonds between the primer and the gene sequence break.

$$T_m(p) = \frac{\Delta H(p)}{\Delta S(p) + R \ln(\gamma / 4)} - T_0 + t$$

where
• $p$ = primer
• $R = 1.987$ cal/(°C·mol)
• $\gamma = 50 \times 10^{-9}$
• $T_0 = 273.15$ °C
• $t = -21.6$ °C
• $\Delta H(p)$ = enthalpy
• $\Delta S(p)$ = entropy

$$\Delta H(p) = \sum_{i=1}^{n-1} \Delta H(p_i, p_{i+1}), \qquad \Delta S(p) = \sum_{i=1}^{n-1} \Delta S(p_i, p_{i+1})$$

What is this measure good for?

c. GC content, GC(P)
G-C pairs are more stable than A-T pairs (because of more H-bonds).

$$GC(p) = \frac{\#G \text{ in } p + \#C \text{ in } p}{len(p)} \cdot 100$$
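A minimal Python sketch of these quantities follows. Several points are assumptions rather than statements from the slides: the temperature offset is taken to be the Kelvin-to-Celsius constant 273.15, the correction term is taken to be negative, enthalpy is expected in cal/mol so that it matches the units of R, and the nearest-neighbor table is supplied by the caller because the slides do not list its values. All function names are illustrative.

```python
import math

R = 1.987       # molar gas constant, cal/(degC*mol), as given on the slide
GAMMA = 50e-9   # primer concentration term, as given on the slide
T0 = 273.15     # assumed: Kelvin-to-Celsius offset
T_CORR = -21.6  # assumed sign: empirical correction in degC


def gc_content(primer):
    """Percentage of G and C bases in the primer."""
    if not primer:
        raise ValueError("primer must not be empty")
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def nearest_neighbor_sum(primer, table):
    """Sum table[p_i p_(i+1)] over all adjacent base pairs of the primer.

    `table` maps dinucleotides such as "GC" to a value; it is a
    caller-supplied placeholder, not data from the presentation.
    """
    primer = primer.upper()
    return sum(table[primer[i:i + 2]] for i in range(len(primer) - 1))


def melting_temperature(delta_h, delta_s):
    """Tm in degC from total enthalpy (cal/mol) and entropy (cal/(degC*mol))."""
    return delta_h / (delta_s + R * math.log(GAMMA / 4.0)) - T0 + T_CORR
```

In use, `delta_h` and `delta_s` would come from `nearest_neighbor_sum` with an enthalpy table and an entropy table respectively.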

Page 8: Real-Time Primer Design for DNA Chips

Technique: quantifying hybridization conditions

d. Secondary structure
Study how likely a primer is to entangle with itself or with another primer.

P = {p1, p2, …, pn}, Q = {q1, q2, …, qm}

Scoring function:
S(pi, qj) = 2 if {pi, qj} = {A, T}
          = 4 if {pi, qj} = {C, G}
          = 0 otherwise

Example (figure label: position i of primer P):

P: ...AGCTTTAGCCATAG
Q: TCTTAGGATCGC...

score S(pi, q1) = 2+4+2+2+4 = 14
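The scoring function can be sketched in a few lines of Python. The helper `alignment_score` is an illustrative assumption about how one alignment is scored: Q is laid against P starting at a given offset and the pairwise scores are added. The exact alignment used in the slide's example is not recoverable from the transcript, so the sketch does not try to reproduce the value 14.

```python
def pair_score(a, b):
    """Score one base pair: 2 for A-T, 4 for C-G, 0 otherwise."""
    pair = {a, b}
    if pair == {"A", "T"}:
        return 2
    if pair == {"C", "G"}:
        return 4
    return 0


def alignment_score(p, q, offset):
    """Score Q laid against P so that q[0] faces p[offset].

    Bases of Q that have no partner in P contribute 0.
    """
    total = 0
    for j, base_q in enumerate(q):
        i = offset + j
        if 0 <= i < len(p):
            total += pair_score(p[i], base_q)
    return total
```

For instance, `alignment_score("AGC", "TCG", 0)` adds one A-T pair and two C-G pairs.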

Page 9: Real-Time Primer Design for DNA Chips

Technique: quantifying hybridization conditions

Four measures of secondary structure:

i. Self annealing, SA(P, P’)
• P’ = reverse of P
• (Figure: P’ shifted along P, one alignment per offset k)

$$SA(p, p') = \max_{k = -m+1, \ldots, m-1} \sum_{i=1}^{m} s(p_i, p'_{i+k})$$

where s is the pairwise scoring function S(pi, qj) of the previous slide.

ii. Self end annealing, SEA(P, P’)
• Like self annealing
• k >= 0
• Only count the longest continuous overlaps
• (Figure: P’ shifted along P for k >= 0)

iii. Pair annealing, PA(P, Q)
• P and Q are the forward and reverse primers

iv. Pair end annealing, PEA(P, Q)
• Similar to self end annealing
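The annealing measures can be sketched in Python as a maximum over all relative shifts of two sequences. Two points are assumptions: positions that fall outside the overlap contribute 0, and "only count longest continuous overlaps" is read as taking the best-scoring unbroken run of matching pairs for shifts k >= 0. The slides do not spell out either detail, so `end_annealing` in particular is one possible reading, and all names are illustrative.

```python
def pair_score(a, b):
    """2 for an A-T pair, 4 for a C-G pair, 0 otherwise."""
    pair = {a, b}
    if pair == {"A", "T"}:
        return 2
    if pair == {"C", "G"}:
        return 4
    return 0


def annealing(p, q):
    """Maximum over all shifts k of the summed pair scores of p[i] and q[i + k]."""
    best = 0
    for k in range(-(len(p) - 1), len(q)):
        total = 0
        for i in range(len(p)):
            j = i + k
            if 0 <= j < len(q):  # assumed: positions outside the overlap score 0
                total += pair_score(p[i], q[j])
        best = max(best, total)
    return best


def end_annealing(p, q):
    """Assumed reading: best unbroken run of matching pairs, shifts k >= 0 only."""
    best = 0
    for k in range(len(q)):
        run = 0
        for i in range(len(p)):
            j = i + k
            if j >= len(q):
                break
            score = pair_score(p[i], q[j])
            run = run + score if score else 0  # a mismatch ends the current run
            best = max(best, run)
    return best


def self_annealing(p):
    """SA(P, P') with P' the reverse of P."""
    return annealing(p, p[::-1])
```

Under this reading, pair annealing PA(P, Q) is `annealing(p, q)` and pair end annealing PEA(P, Q) is `end_annealing(p, q)`.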

Page 10: Real-Time Primer Design for DNA Chips

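Technique: How to apply the model

For PCR:
• P is the forward primer, Q is the reverse primer.
• Ideally there is no annealing, and the length, GC content and melting temperature of P equal those of Q.

The score vector of a primer pair, its ideal value and the weights are:

$$SCPCR(p, q) = [\, len(p),\ GC(p),\ T_m(p),\ SA(p),\ SEA(p),\ len(q),\ GC(q),\ T_m(q),\ SA(q),\ SEA(q),\ PA(p, q),\ PEA(p, q) \,]$$

$$SCPCR_{ideal}(p) = [\, len_p,\ GC_p,\ T_{m,p},\ 0,\ 0,\ len_p,\ GC_p,\ T_{m,p},\ 0,\ 0,\ 0,\ 0 \,]$$

$$w = [\, 0.5,\ 1,\ 1,\ 0.1,\ 0.2,\ 0.5,\ 1,\ 1,\ 0.1,\ 0.2,\ 0.1,\ 0.2 \,]$$

The optimization is:

$$\min_{(p, q)} l_{PCR}(p, q), \quad \text{where } l_{PCR}(p, q) = \left| SCPCR(p, q) - SCPCR_{ideal}(p) \right|^{T} w$$

For DNA chips (microarrays): Q doesn’t exist, so there is no pair annealing to study. Only 5 terms are left.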
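A minimal Python sketch of the weighted distance follows. It assumes that the deviation from the ideal vector is taken component by component in absolute value before weighting, and that the ideal values are passed in by the caller. The name `weighted_distance` and the example numbers are hypothetical.

```python
# Weights in the order: len(p), GC(p), Tm(p), SA(p), SEA(p),
#                       len(q), GC(q), Tm(q), SA(q), SEA(q), PA, PEA
WEIGHTS = (0.5, 1, 1, 0.1, 0.2, 0.5, 1, 1, 0.1, 0.2, 0.1, 0.2)


def weighted_distance(score, ideal, weights=WEIGHTS):
    """Weighted sum of absolute deviations between a score vector and its ideal."""
    if not (len(score) == len(ideal) == len(weights)):
        raise ValueError("score, ideal and weights must have the same length")
    return sum(w * abs(s - t) for s, t, w in zip(score, ideal, weights))


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    ideal = (20, 50, 60, 0, 0, 20, 50, 60, 0, 0, 0, 0)
    score = (22, 50, 60, 10, 0, 20, 48, 60, 0, 0, 0, 5)
    print(weighted_distance(score, ideal))
```

For the microarray case, the same function can be called with five-element vectors and `WEIGHTS[:5]`, since only the terms of the single primer remain.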

Page 11: Real-Time Primer Design for DNA Chips


Technique: parallelize SCPCR(p,q) calculation

Calculate Len, GC, Temp, SA and SEA in parallel

Compute PA and PEA in parallel

Page 12: Real-Time Primer Design for DNA Chips

Technique: details

Melting temperature and GC content:
• Simple adder + divider
• Use pipelining
• First one: O(m)
• Subsequent cost: O(1)

Example of the incremental update:
• Whole window: AGCGATATA
• i-th P primer: GCGATA
• (i+1)-th P primer: CGATAT
• GC(Pi+1) = GC(Pi) - 1
• H(Pi+1) = H(Pi) - H(GC) + H(AT)
• Similar for S

Annealing matrix (figure: pairwise terms ad, bd, cd, ae, be, ce, af, bf, cf between elements a, b, c and d, e, f)
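The incremental update can be sketched in software as a sliding window: the first primer costs O(m), and each later primer is obtained by dropping the contribution that leaves on the left and adding the one that enters on the right. This is a sequential Python illustration of the idea, not the hardware pipeline itself; the function names and the caller-supplied nearest-neighbor `table` are assumptions.

```python
def sliding_gc_counts(window, m):
    """G+C count of every length-m primer in the window, updated in O(1) per step."""
    if m < 1 or m > len(window):
        raise ValueError("primer length must be between 1 and the window length")
    is_gc = [base in "GC" for base in window]
    count = sum(is_gc[:m])  # first primer: O(m)
    counts = [count]
    for start in range(1, len(window) - m + 1):
        # Drop the base that left on the left, add the base that entered on the right.
        count += is_gc[start + m - 1] - is_gc[start - 1]
        counts.append(count)
    return counts


def sliding_nn_sums(window, m, table):
    """Nearest-neighbor sum (for example H or S) of every length-m primer.

    `table` maps dinucleotides to values and is supplied by the caller.
    """
    if m < 2 or m > len(window):
        raise ValueError("primer length must be between 2 and the window length")
    steps = [table[window[i:i + 2]] for i in range(len(window) - 1)]
    total = sum(steps[:m - 1])  # first primer: O(m)
    sums = [total]
    for start in range(1, len(window) - m + 1):
        # Same update as H(Pi+1) = H(Pi) - H(leading pair) + H(trailing pair).
        total += steps[start + m - 2] - steps[start - 1]
        sums.append(total)
    return sums
```

With the window AGCGATATA and m = 6, the second and third counts correspond to the primers GCGATA and CGATAT from the slide.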

Page 13: Real-Time Primer Design for DNA Chips

Complexity

Complexity for the sequential algorithm, for PCR:

• Number of choices of P (window size = Wp):

$$S = \sum_{l_p = m}^{n} (W_p - l_p + 1)$$

• Number of choices of Q (window size = Wq):

$$T = \sum_{l_q = m}^{n} (W_q - l_q + 1)$$

• Each distance SCPCR(P, Q):

$$O(l_p^2 + l_q^2 + l_p l_q)$$

• Total:

$$O\left(S \cdot T \cdot (W_p^2 + W_q^2 + W_p W_q)\right)$$

Complexity for the parallel algorithm, for PCR:

• Distance measure SCPCR(P, Q) = O(1)
• Total: O(S*T)
• O(S*S*T*T) is a typo in the paper

Similar but simpler for Microarray.
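The S*T factor can be made concrete with a small Python sketch of the sequential search: every candidate P is paired with every candidate Q and one distance is evaluated per pair. The cost function is supplied by the caller, and the names `candidate_primers` and `exhaustive_search` are illustrative assumptions.

```python
def candidate_primers(window, min_len, max_len):
    """Yield every contiguous substring with length in [min_len, max_len]."""
    for length in range(min_len, min(max_len, len(window)) + 1):
        for start in range(len(window) - length + 1):
            yield window[start:start + length]


def exhaustive_search(window_p, window_q, min_len, max_len, cost):
    """Try all S*T pairs; return (best pair, best cost, number of evaluations)."""
    best_pair, best_cost, evaluations = None, float("inf"), 0
    for p in candidate_primers(window_p, min_len, max_len):
        for q in candidate_primers(window_q, min_len, max_len):
            evaluations += 1  # one distance computation per pair
            current = cost(p, q)
            if current < best_cost:
                best_pair, best_cost = (p, q), current
    return best_pair, best_cost, evaluations
```

The evaluation count returned by `exhaustive_search` equals S*T, so the total sequential cost is that count multiplied by the cost of one distance computation.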

Page 14: Real-Time Primer Design for DNA Chips

Evaluation

Experimental environment: 512 primer pairs, |Wp| = |Wq| = 16
1. 500 MHz Celeron system with integrated hardware accelerator
2. Software implementation

Evaluation results:
• 1920 secs for the software implementation
• 3.41 secs using the hardware accelerator (a speedup of roughly 560x)

Page 15: Real-Time Primer Design for DNA Chips

Related Work

Previous approach: DOPRIMER
• Same computational model
• Differs in the way dynamic programming is done
• Sequential in nature

Other primer selection software, e.g. Primer Premier 5, Primer3, PrimerGen, PrimerDesign
• Similarities: the criteria, such as length, temperature range, GC range, GC clamp, 3’ end stability, uniqueness of the 3’ end base, dimers/hairpins, degeneracy, salt concentration, annealing oligo concentration, etc.
• Differences: not a weighted linear sum of all criteria; they need much expert supervision, and the numerical criteria are used as a guide only

Page 16: Real-Time Primer Design for DNA Chips

More Related Work

Case study: Burpo did a critical review of PCR primer design algorithms.
• Subject: Saccharomyces cerevisiae deletion strains
• Conclusion: no program is suitable for the task of post-design PCR analysis, especially for accurately predicting non-specific hybridization events that impair PCR amplification.

Page 17: Real-Time Primer Design for DNA Chips

Observations

My observations:

Minus side:
• Is the computational model too simplistic?
• Specifically, is a weighted linear sum justified?

Plus side:
• The design of the parallel architecture is neat.
• Since primers are about 18-22 bases long, current technology can certainly handle it.

When would you need fast primer selection?
• Primer walking, to connect contigs together quickly
• To scan through a large number of sequences for possible primers