RECOMB 2005, Poster Session A, Bay 43
Integrated Design Flow for Universal DNA Tag Arrays
N. Hundewale (1), I. Mandoiu (2), L. Perelygina (3), C. Prajescu (2), and A. Zelikovsky (1)
(1) CS Department, GSU; (2) CSE Department, UCONN; (3) Department of Biology, GSU
Abstract
• DNA microarrays provide a tool for answering a wide variety of questions about the dynamics of cells
– In which cell tissues and under what environmental conditions is each gene active?
– How does the activity level of a gene change with: cell cycle stage, environmental conditions, disease, etc.?
– What genes seem to be regulated together?
• Universal tag arrays (UTAs) technology
– Provides unprecedented assay customization flexibility while maintaining a high degree of multiplexing and low unit cost
• In this poster we describe an integrated design flow for genomic assays based on UTAs
– We use the proposed flow to design UTA-based assays for measuring Herpes B viral gene expression in cells derived from macaque and human hosts
– After defining a “B virus molecular signature”, the assay can provide a sensitive tool for early B virus infection diagnosis and differentiation between B herpes and the closely related herpes simplex viruses
Universal DNA Tag Arrays
• “Programmable” array format [Brenner 97, Morris et al. 98]
– Array consists of application-independent oligonucleotides called tags
– Two-part reporter probes: application-specific primers ligated to antitags
– Detection is carried out by a sequence of reactions separately involving the primer and the antitag parts of the reporter probes
• Tag/antitag hybridization constraints
(H1) Antitags hybridize strongly to complementary tags
(H2) No antitag hybridizes to a non-complementary tag
(H3) Antitags do not cross-hybridize to each other
Generic UTA-Based Assay
[Figure: tags t1 and t2 with their antitags. Steps shown: mix reporter probes with genomic DNA; solution phase hybridization; solid phase hybridization; single-base extension]
Design Flow
[Figure: design flow diagram. Tools: Bioperl, GeneMark/ORF Finder, Promide, PerTags, PrimerDel+. Data: genomic IDs, sequences in FASTA format, ORFs in FASTA format, probe pools, assay parameters, tag/antitag sequences, reporter probes. Final stage: Hybridization Experiment and Analysis]
Tag Set Design
Conservative formalization of (H1)-(H3) based on nucleation complex theory and the 2-4 rule:
Find: a maximum cardinality set of tags such that no tag/tag or tag/antitag pair shares a substring of weight c
Where: weight(A) = weight(T) = 1, weight(C) = weight(G) = 2, and c is a given hybridization stringency constant
Cycle Packing Algorithm [Mandoiu&Trinca 05]
• T ← ∅
1. For each cycle C in the c-token factor graph G, in increasing order of cycle length, do
– If C has no c-tokens in common with T, then add the tag defined by C to T and remove C from G
2. Return T
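As an illustration of the tag set constraint above, the following Python sketch checks whether a candidate tag set satisfies it. It is not the cycle packing algorithm, and the function names (`weight`, `antitag`, `c_tokens`, `is_valid_tag_set`) are hypothetical. It assumes that "a substring of weight c" means weight c or more, that an antitag is the reverse complement of its tag, and that only pairs of distinct tags are compared. It relies on the observation that two strings share a substring of weight at least c exactly when they share one of the shortest such substrings ending at some position.

```python
from itertools import combinations

WEIGHT = {"A": 1, "T": 1, "C": 2, "G": 2}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def weight(seq):
    """Sum of base weights: A and T count 1, C and G count 2."""
    return sum(WEIGHT[base] for base in seq)


def antitag(tag):
    """Reverse complement of a tag (assumed definition of its antitag)."""
    return "".join(COMPLEMENT[base] for base in reversed(tag))


def c_tokens(seq, c):
    """For each end position, the shortest substring of weight >= c ending there."""
    tokens = set()
    for end in range(1, len(seq) + 1):
        total = 0
        for start in range(end - 1, -1, -1):
            total += WEIGHT[seq[start]]
            if total >= c:
                tokens.add(seq[start:end])
                break
    return tokens


def is_valid_tag_set(tags, c):
    """True if no two distinct tags share a token as tag/tag or tag/antitag."""
    tag_tokens = [c_tokens(t, c) for t in tags]
    anti_tokens = [c_tokens(antitag(t), c) for t in tags]
    for i, j in combinations(range(len(tags)), 2):
        if tag_tokens[i] & tag_tokens[j]:
            return False
        if tag_tokens[i] & anti_tokens[j] or tag_tokens[j] & anti_tokens[i]:
            return False
    return True
```

For example, with c = 4 the tags ACCA and GGTT conflict, because the antitag of GGTT is AACC and both ACCA and AACC contain CC, which has weight 4.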
Tag Assignment
Primer-to-tag hybridization constraints: if primer p hybridizes with tag t, then either p or t must be left unassigned, unless p is assigned to t
[Figure: primers p and p' and tags t and t']
Maximum Assignable Primer Set Problem: given primer set P and tag set T, find a maximum size assignable subset of P
• Greedy primer deletion heuristic [Ben-Dor 04]
– Repeatedly delete a primer of maximum weight until P becomes assignable, where
– Weight of p is the sum of the potentials of the tags to which it hybridizes
– Potential of a tag hybridizing with k primers is 2^(-k)
• PrimerDel+ [Mandoiu et al. 05]
– Modified primer deletion heuristic (exploiting the availability of several primer candidates with equivalent functionality)
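The greedy primer deletion heuristic described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not PrimerDel+. The names `is_assignable`, `greedy_primer_deletion`, and the `hyb` mapping (primer to the set of tags it hybridizes with) are hypothetical. The assignability test is derived here from the stated constraint: a tag can be given to primer p only if no other primer in the set hybridizes with it, so the set is treated as assignable when the primers without such a private tag are no more numerous than the tags that hybridize with no primer at all. Ties in weight are broken by input order, which is also an assumption.

```python
def is_assignable(primers, tags, hyb):
    """Check whether every primer can receive a distinct, conflict-free tag."""
    hits = {t: [] for t in tags}
    for p in primers:
        for t in hyb.get(p, ()):
            if t in hits:
                hits[t].append(p)
    # Tags that hybridize with no primer can be given to any primer.
    free_tags = sum(1 for ps in hits.values() if not ps)
    # A tag hit by exactly one primer can only be given to that primer.
    with_private = {ps[0] for ps in hits.values() if len(ps) == 1}
    need_free = sum(1 for p in primers if p not in with_private)
    return need_free <= free_tags


def greedy_primer_deletion(primers, tags, hyb):
    """Delete maximum-weight primers until the remaining set is assignable."""
    remaining = list(primers)
    tag_set = set(tags)
    while remaining and not is_assignable(remaining, tags, hyb):
        # k[t] = number of remaining primers that hybridize with tag t.
        k = {}
        for p in remaining:
            for t in hyb.get(p, ()):
                if t in tag_set:
                    k[t] = k.get(t, 0) + 1

        def primer_weight(p):
            # Sum of tag potentials 2^(-k) over the tags p hybridizes with.
            return sum(2.0 ** -k[t] for t in hyb.get(p, ()) if t in tag_set)

        remaining.remove(max(remaining, key=primer_weight))
    return remaining
```

With tags t1 and t2, and primers p1 (hybridizing with t1 and t2), p2 (hybridizing with t1), and p3 (hybridizing with neither), the weights are 0.75, 0.25, and 0, so p1 is deleted and the remaining primers p2 and p3 can be assigned to t1 and t2.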
Experimental Results
GenFlex Tags

                          500 tags           1000 tags          2000 tags
Tm  # pools  Pool size  # arrays  % Util.  # arrays  % Util.  # arrays  % Util.
60  1446     1          4         94.06    2         97.20    1         72.30
60  1446     5          4         96.13    2         100.00   1         72.30
67  1560     1          4         96.53    2         98.70    1         78.00
67  1560     5          4         98.00    2         99.90    1         78.00
70  1522     1          4         96.73    2         98.90    1         76.10
70  1522     5          4         97.80    2         99.80    1         76.10

Periodic Tags

                          500 tags           1000 tags          2000 tags
Tm  # pools  Pool size  # arrays  % Util.  # arrays  % Util.  # arrays  % Util.
60  1446     1          4         82.26    3         65.35    2         57.05
60  1446     5          4         88.26    3         70.95    2         63.55
67  1560     1          4         86.33    3         69.70    2         61.15
67  1560     5          4         91.86    3         76.00    2         67.20
70  1522     1          4         88.46    3         73.65    2         65.40
70  1522     5          4         92.26    2         91.10    2         70.30
Conclusions
• We have described a suite of software tools for designing genomic assays based on UTAs
– Integrating design flow optimization steps yields higher multiplexing rates and leads to reduced assay costs
• In future work we will make the entire software suite available as an online web server
References
• Affymetrix, Inc. GenFlex tag array probe set, available at the NetAffx™ Analysis Center, http://www.affymetrix.com/analysis/
• M. Atlas, N. Hundewale, L. Perelygina, and A. Zelikovsky. Proc. International Conf. of the IEEE Engineering in Medicine and Biology (EMBC), pp. 172-175, 2004.
• A. Ben-Dor, T. Hartman, B. Schwikowski, R. Sharan, and Z. Yakhini. Towards optimally multiplexed applications of universal DNA tag systems. Proc. 7th Annual International Conference on Research in Computational Molecular Biology (RECOMB), pp. 48-56, 2003.
• S. Brenner. Methods for sorting polynucleotides using oligonucleotide tags. US Patent 5,604,097, 1997.
• I.I. Mandoiu and D. Trinca. Exact and approximation algorithms for DNA tag set design. Proc. 16th Annual Symposium on Combinatorial Pattern Matching (CPM), pp. 383-393, 2005.
• I.I. Mandoiu, C. Prajescu, and D. Trinca. Improved tag set design and multiplexing algorithms for universal arrays. Proc. 5th Int. Conf. on Computational Science (ICCS 2005), Part II, pp. 994-1002, 2005.
• M. Borodovsky. GeneMark, http://opal.biology.gatech.edu/GeneMark
• ORF Finder, http://www.ncbi.nih.gov/gorf/gorf.html
• S. Rahmann. Rapid large-scale oligonucleotide selection for microarrays. Proc. IEEE Computer Society Bioinformatics Conference (CSB), 2002.
ORF and Primer Selection
• Open reading frames (ORFs)
– ORFs are regions of genetic material, beginning with a start codon and ending with a stop codon, that might code for a protein
– ORFs can be extracted from the genome's sequence or ID using ORF Finder. A second approach is to use the GeneMark family of statistical gene prediction programs [Borodovsky]
• Primer selection
– Constraints:
  - Homogeneity: each primer must hybridize to its target site at the temperature selected for the experiment
  - Sensitivity: primers must avoid self-hybridization and must not form secondary structures
  - Specificity: each primer must hybridize to one particular ORF
– Selection tools:
  - Primer and microarray probe selection are well studied; we use the Promide tool [Rahmann 03] to select, for each ORF, pools of primer candidates meeting the above constraints
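To make the ORF definition concrete, here is a minimal Python sketch that scans a DNA string for regions starting with a start codon and ending with a stop codon. It is only an illustration and does not reproduce ORF Finder or GeneMark. The function name `find_orfs` and the `min_codons` parameter are hypothetical, and the sketch assumes ATG as the only start codon, the standard stop codons TAA, TAG, and TGA, and the forward strand only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) index pairs of ORFs on the forward strand.

    Each ORF starts at an ATG, ends just after the first in-frame stop codon,
    and seq[start:end] includes both the start and the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

For the sequence CCATGAAATAGGG, the only ORF found is the slice from index 2 to index 11, that is, ATGAAATAG.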