Pooling data across Micorarray
-
Upload
sheenascroggin3619 -
Category
Documents
-
view
218 -
download
0
Transcript of Pooling data across Micorarray
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 1/50
Pooling Data Across Microarray
Experiments Using Different Versions of AffymetrixOligonucleotide Arrays
Jeffrey S. Morris
Li Zhang, Chunlei Wu, Keith
Baggerly and Kevin CoombesUT MD Anderson Cancer Center
Houston, TX, USA
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 2/50
Combining Information across
Microarray Studies
Many publicly available microarray data sets
Can combine information across studies to:1. Validate results from individual studies Find intersection of differentially expressed genes Build model using one study, validate using another
2. Discover new biological insights by analysespooling data across studies.
Potential for increased statistical power Important since many individual studies are
underpowered.
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 3/50
Pooling Data across Studies Challenge: In general, microarray data
from different studies not comparable Clinical differences
Different study populations
Technical differences Laboratory differences: sample collection and
storage, microarray protocol Different platforms: cDNA/oligo, different
versions of same technology (e.g. Affy chips)
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 4/50
Pooling Data across Studies Approaches in Existing Literature:1. Include study effects in model
Gene-specific study effects SVD, Distance-weighted DiscriminationDrawback: First-order corrections not enough
2. Model unitless summary measures
standardized log fold-change t-statistics probabilities of +/0/- expressionDrawback: Implicit assumptions about comparability
of clinical populations across studies
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 5/50
Pooling Data across Studies Sequence-related reasons for incomparability
of raw expression levels across platforms:
Cross-hybridization RNA degradation (near 5 end) Probe validity map to RefSeq? Alternative splicing
It may be possible that, by taking these into
account, we can obtain more comparable rawexpression levels to use in pooled analyses
Our focus: combining information acrossdifferent versions of Affymetrix genechips
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 6/50
Overview of Affymetrix
GeneChips Probes: 25-base sequences from gene
of interest Probesets: set of probes corresponding
to same gene. Obtained from current sequence information
in GenBank, Unigene, RefSeq
Generations of human chips: HuGeneFL: 5600 genes, 20 probes/gene U95Av2: 10,000 genes, 16 probes/gene U133A: 14,500 genes, 11 probes/gene
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 7/50
Example: CAMDA Lung Cancer
Data CAMDA: Critical Assessment of
Microarray Data Analysis: annualconference at Duke University
CAMDA 2003: Two studies relating geneexpression data to survival in lung cancerpatients
1. Harvard (Bhattacharjee, et al. 2001) 124 lung adenocarcinoma samples
2. Michigan (Beer, et al. 2002) 86 lung adenocarcinoma samples
GOAL: Pool data across studies to identify
prognostic genes for lung cancer.
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 8/50
Pooling Information across Studies
Harvardpatients worseprognosis
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 9/50
Pooling Information across Chip
Types
Michigan: HuGeneFl Chip
6,633 probe sets -- 20 probe pairs each
Harvard: HG_U95Av2 Chip
12,453 probe sets 16 probe pairs each
Problems:
Different genes
Incomparable expression levels
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 10/50
Example: CAMDA 2003 Data Two studies used different chip types:
Michigan: HuGeneFL
~130,000 probes in 6,633 probesets Harvard: U95Av2
~200,000 probes in 12,625 probesets
34,428 matching probes combined into 4,101partial probesets
After preprocessing, 1036 probesets consideredin subsequent analyses
We used PDNN (Zhang et al. 2003 NatureBiotech) method for quantification
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 11/50
Partial Probeset Method
HuGeneFL :HG_U95Av2:
Matching ProbesPartial Probesets
1. Identify matching probes
2. Recombine into new probesets based on UNIGENE
clusters, which we refer to as partial probesets3. Eliminate any probesets containing just one or two
probes
Note: Any quantification method can subsequently beused (MAS, dChip, RMA, PDNN)
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 12/50
Pooling Information across Chip
Types
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 13/50
Quantification of Expression Levels
Gene expressions quantified by applying LisPDNN model to our partial probesets
Uses probe sequence info to predict patterns of specific and nonspecific hybridization intensities
Allows borrowing of strength across probe sets
Model is not overparameterized O(N probesets)
See Zhang, et al. (2003) Nature Biotech forfurther details on method and comparison
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 14/50
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 15/50
Assessing Our Method for Combining
Information Across Chip Types
Partial
Probesetmethodappears to givecomparableexpressionlevels acrosschip types.
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 16/50
Assessing our Method for Combining
Information across Chip Types
Median partial probesetsize is 7, vs. 16 or 20
Loss of precision?
No evidence of significant precision loss
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 17/50
Assessing Partial Probeset Method
Agreement inrelative
quantificationsacross samples
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 18/50
Assessing Partial Probeset Method Agreement in
relative
quantificationsacross samples
Less variablegenes worse
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 19/50
Assessing Partial Probeset Method Agreement in
relative
quantificationsacross samples
Less variablegenes worse
Eliminate geneswith sd<0.20 orr<0.90
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 20/50
Assessing Partial Probeset Method Agreement in
relative
quantificationsacross samples
Less variablegenes worse
Eliminate geneswith sd<0.20 orr<0.90
1,036 genes
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 21/50
Identifying Prognostic Genes1. Preprocess raw microarray data
Outlier Detection, Normalization, Quantification,
Remove Some Genes? Left with n-by-p matrix of expression levels for
p genes on n microarrays.
2. Identify which genes are correlated to
outcome of interest Perform standard statistical test for each gene
obtain (permutation) p-values
Find cutpoint on p-values to declaresignificance that accounts for multiplicities.
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 22/50
Identifying Prognostic Genes After preprocessing:
1036 genes, 200 samples Identify genes related to survival
After adjusting for known clinical
predictors Provide prognostic information on survival
above and beyond clinical predictors
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 23/50
Identifying Prognostic Genes:
Cox Regression Modeling Hazard : P(t) ~ Prob(X<t +(t | X>t )
Cox Model: Pi(t) = P0(t) exp(X i F )
X i = Vector of covariates for subject i F = Vector of regression coefficients
Key Assumption: Proportional Hazards Hazard ratio between subjects with different
covariates does not vary over time.
Pi(t )/Pk(t ) = exp{ (X i-X k) F } Exp(F) =Change in hazard per unit change in X
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 24/50
Identifying Prognostic Genes:
Cox Regression Modeling Best Clinical Model:
Factor FExp(F) Z p
StudyMichigan = 0
Harvard = 1
0.67 1.95 2.73 0.0062
Age 0.03 1.03 2.60 0.0094
StageEarly (1-2) = 0
Late (3-4) = 1
1.53 4.61 6.61 <0.000000001
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 25/50
Identifying Prognostic Genes Series of 1036 multivariable Cox models fit to
identify prognostic genes. Each model contained: Study (Michigan=-1, Harvard=1). Age (continuous factor). Stage (early=0/late=1). Probeset (log intensity value as continuous factor).
Exact p-values for each probeset computed
using permutation approach By using multivariate modeling, we search for genes
offering prognostic information beyond clinicalpredictors
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 26/50
Identifying Prognostic Genes:
BUM Method No prognostic genes pvals Uniform
Prognostic genes
smaller pvals Fit Beta-Uniform mixture to histogram
of p-values BUM method
(Pounds and Morris, 2003 Bioinformatics)
Method can be used to identifyprognostic genes while controlling FDR
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 27/50
Results Histogram suggests
there are somesignificant probesets
FDR=0.20 correspondspval cutoff of 0.0024(BUM, Pounds andMorris 2003)
26 probesets flaggedas significant
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 28/50
Selected Flagged GenesRank Gene F p Function
1 FCGRT -2.07 <0.00001 Induced by IF-K in treating SCLC
2 ENO2 1.46 0.00001 Marker of NSCLC4 RRM1 1.81 0.00002 Linked to survival in NSCLC
8 CHKL -1.43 0.00010 Marker of NSCLC
11 CPE 0.72 0.00031 Marker of SCLC
12 ADRBK1 -2.20 0.00044 Co-expressed with Cox-2 in lung ADC
16 CLU -0.52 0.00109 Marker of SCLC
20 SEPW1 -1.29 0.00145 H202 cytotox. in NSCLC cell lines
21 FSCN1 0.66 0.00150 Marker of invasiveness in Stg 1 NSCLC
25 BTG2 -0.75 0.00232 Induced by p53 in SCLC cell lines
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 29/50
Selected Flagged GenesRank Gene F p Function
1 FCGRT -2.07 <0.00001 Induced by IF-K in treating SCLC
2 ENO2 1.46 0.00001 Marker of NSCLC
4 RRM1 1.81 0.00002 Linked to survival in NSCLC
8 CHKL -1.43 0.00010 Marker of NSCLC
11 CPE 0.72 0.00031 Marker of SCLC
12 ADRBK1 -2.20 0.00044Co-expressed with Cox-2 in lung ADC
16 CLU -0.52 0.00109 Marker of SCLC
20 SEPW1 -1.29 0.00145 H202 cytotox. in NSCLC cell lines
21 FSCN1 0.66 0.00150 Marker of invasiveness in Stg 1 NSCLC
25 BTG2 -0.75 0.00232 Induced by p53 in SCLC cell lines
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 30/50
Selected Flagged GenesRank Gene F p Function
1 FCGRT -2.07 <0.00001 Induced by IF-K in treating SCLC
2 ENO2 1.46 0.00001 Marker of NSCLC
4 RRM1 1.81 0.00002 Linked to survival in NSCLC
8 CHKL -1.43 0.00010 Marker of NSCLC
11 CPE 0.72 0.00031 Marker of SCLC
12 ADRBK1 -2.20 0.00044Co-expressed with Cox-2 in lung ADC
16 CLU -0.52 0.00109 Marker of SCLC
20 SEPW1 -1.29 0.00145 H202 cytotox. in NSCLC cell lines
21 FSCN1 0.66 0.00150 Marker of invasiveness in Stg 1 NSCLC
25 BTG2 -0.75 0.00232 Induced by p53 in SCLC cell lines
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 31/50
Selected Flagged GenesRank Gene F p pStage Function
1 FCGRT -2.07 <0.00001 0.154 Induced by IF-K in treating SCLC
2 ENO2 1.46 0.00001 0.282Marker of NSCLC
4 RRM1 1.81 0.00002 0.321 Linked to survival in NSCLC
8 CHKL -1.43 0.00010 0.979 Marker of NSCLC
11 CPE 0.72 0.00031 0.088 Marker of SCLC
12 ADRBK1 -2.20 0.00044 0.484 Co-expressed with Cox-2 in PUC
16 CLU -0.52 0.00109 0.014 Marker of SCLC
20 SEPW1 -1.29 0.00145 0.028 H202 cytotox. in NSCLC cell lines
21 FSCN1 0.66 0.00150 0.082 Marker of invasiveness in Stage 1NSCLC
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 32/50
Selected Flagged GenesRank Gene F p pStage Function
1 FCGRT -2.07 <0.00001 0.154 Induced by IF-K in treating SCLC
2 ENO2 1.46 0.00001 0.282Marker of NSCLC
4 RRM1 1.81 0.00002 0.321 Linked to survival in NSCLC
8 CHKL -1.43 0.00010 0.979 Marker of NSCLC
11 CPE 0.72 0.00031 0.088 Marker of SCLC
12 ADRBK1 -2.20 0.00044 0.484 Co-expressed with Cox-2 in PUC
16 CLU -0.52 0.00109 0.014 Marker of SCLC
20 SEPW1 -1.29 0.00145 0.028 H202 cytotox. in NSCLC cell lines
21 FSCN1 0.66 0.00150 0.082 Marker of invasiveness in Stage 1NSCLC
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 33/50
Selected Flagged GenesRank Gene F p pStage Function
1 FCGRT -2.07 <0.00001 0.154 Induced by IF-K in treating SCLC
2 ENO2 1.46 0.00001 0.282Marker of NSCLC
4 RRM1 1.81 0.00002 0.321 Linked to survival in NSCLC
8 CHKL -1.43 0.00010 0.979 Marker of NSCLC
11 CPE 0.72 0.00031 0.088 Marker of SCLC
12 ADRBK1 -2.20 0.00044 0.484 Co-expr with Cox-2 in lung adenocarc
16 CLU -0.52 0.00109 0.014 Marker of SCLC
20 SEPW1 -1.29 0.00145 0.028 H202 cytotox. in NSCLC cell lines
21 FSCN1 0.66 0.00150 0.082 Marker of invasiveness in Stage 1NSCLC
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 34/50
Selected Flagged GenesRank Gene F p pStage Function
1 FCGRT -2.07 <0.00001 0.154 Induced by IF-K in treating SCLC
2 ENO2 1.46 0.00001 0.282 Marker of NSCLC
4 RRM1 1.81 0.00002 0.321 Linked to survival in NSCLC
8 CHKL -1.43 0.00010 0.979 Marker of NSCLC
11 CPE 0.72 0.00031 0.088 Marker of SCLC
12 ADRBK1 -2.20 0.00044 0.484Co-expressed with Cox-2 in PUC
16 CLU -0.52 0.00109 0.014 Marker of SCLC
20 SEPW1 -1.29 0.00145 0.028 H202 cytotox. in NSCLC cell lines
21 FSCN1 0.66 0.00150 0.082 Marker of invasiveness in Stage 1NSCLC
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 35/50
Selected Flagged Genes
Rank Gene F p pStage Function
3 NFRKB -2.81 0.00001 0.058 Amplified in AML
7 ATIC 1.81 0.00009 0.771 Fusion partner of ALK which definessubtype of ALCL
13 BCL9 -1.64 0.00069 0.057 Over-expressed in ALL
15 TPS1 -0.64 0.00107 0.882 Associated with pulmonaryinflammation
25 BTG2 -0.75 0.00232 0.726 Inhibits cell proliferation in primarymouse embryo fibroblasts lackingfunctional p53
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 36/50
Selected Flagged Genes
Rank Gene F p pStage Function
3 NFRKB -2.81 0.00001 0.058 Amplified in AML
7 ATIC 1.81 0.00009 0.771 Fusion partner of ALK which definessubtype of ALCL
13 BCL9 -1.64 0.00069 0.057 Over-expressed in ALL
15 TPS1 -0.64 0.00107 0.882 Associated with pulmonaryinflammation
25 BTG2 -0.75 0.00232 0.726 Inhibits cell proliferation in primarymouse embryo fibroblasts lackingfunctional p53
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 37/50
Selected Flagged Genes
Rank Gene F p pStage Function
3 NFRKB -2.81 0.00001 0.058 Amplified in AML
7 ATIC 1.81 0.00009 0.771 Fusion partner of ALK which definessubtype of ALCL
13 BCL9 -1.64 0.00069 0.057 Over-expressed in ALL
15 TPS1 -0.64 0.00107 0.882 Associated with pulmonaryinflammation
25 BTG2 -0.75 0.00232 0.726 Inhibits cell proliferation in primarymouse embryo fibroblasts lackingfunctional p53
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 38/50
Results Our gene list has almost no overlap with other
publications of these data. Reasons:
1. We addressed a different research question Us: ID Genes offering prognostic info beyond clinical
Michigan: Univariate Cox models fit; results used toconstruct dichotomous risk index
Harvard: Cluster analysis done; clusters linked tosurvival; found genes driving the clustering
2. Pooling across studies yielded significantgains in statistical power.
Most genes (17/26) in our study are not flagged if we
analyze 2 data sets separately (i.e. no pooling)
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 39/50
CAMDA 2003 Results
We Won!
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 40/50
Limitations of Partial Probeset Method Worked well for combining across HuGeneFL/
U95Av2
~25% probes from HuGeneFL on U95Av2, with4,101 probesets
Not enough matching probes for use withU95Av2/U133A ~6% of probes from U95Av2 also on U133A, with
only 628 probesets
Requiring matching probes strong criterion,maybe weaker criterion would suffice?
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 41/50
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 42/50
Full-Length Transcript Based Probesets New probeset definition (FLTBP): probes match
the same set of full-length mRNA sequences
Procedure1. Construct comprehensive library of full-length mRNA
transcript sequences from RefSeq and HinvDB
2. For each probe, identify all matching full-length
transcripts using Blast programU95Av2: 15% matched no sequence, 33% matched multiple seq.
U133A: 18% matched no sequence, 38% matched multiple seq.
3. Group probes with same matched target lists (FLTBPs)U95Av2: 23,972 probesets, U133A: 14,148 probesets
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 43/50
Full-Length Transcript Based Probesets Matching across chip types:
9,642 FLTBPs match across U95Av2 and U133A
Affymetrix has their own method for mapping theirprobesets across arrays 9,480 pairs of probesets(only about ½ map the same way as FLTBPs)
Example: Lung cancer cell line data
28
cell lines, each hybridized onto both U95Av2 andU133A arrays.
Paired design suggests any differences between pairedmeasurements due to technical, not biological, sources.
Different quantification methods (PDNN, RMA, MAS, dChip)
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 44/50
Results Densityestimate of chip-to-chipcorrelations for
each gene Positive shift
for FLTBPsuggests bettercorrelations
Improvement greatest forPDNN
Correlation stillnot perfect
-0.5 0.0 0.5 1.0
0.0
0.5
1
.0
1.5
2.0
2.5
Gene-gene c orrelation
D
ensity
A
PDNN(P<0.00001)
-0.5 0.0 0.5 1.0
0.0
0.5
1
.0
1.5
2.0
2.5
Gene-gene c orrelat ion
D
ensity
B
RMA(P<0.00031)
-0 .5 0.0 0.5 1.0
0.0
0.5
1.0
1.5
2.0
2.5
Gene-gene c orrelation
D
ensity
C
MAS5(P<0.00575)
-0.5 0.0 0.5 1.0
0.0
0.5
1.0
1.5
2.0
2.5
Gene-gene c orrelat ion
D
ensity
D
dChip(P<0.00005)
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 45/50
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 46/50
Example: Sample Gene 2
4
5
6
7
8
9
10
1 6 11 16 21 26
Sample
Ln(probe sign
al)
4
5
6
7
8
9
10
1 6 11 16 21 26
Sample
Ln(probe sign
al)
11
12
13
14
15
7 8 9 10 11
U95Av2
U133A
R2
=0.87
R2
=0.25
Again, significantly higher correlation using FLTBP thanusing Affymetrix definition
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 47/50
Results Boxplot of
chip-to-chipcorrelations
(over genes)for eachsample
PDNN resulted
in highercorrelations
PDNN RMA MAS5 dChip
FLTBP
AffyPS
FLTBP
AffyPS
FLTBP
AffyPS
FLTBP
AffyPS
0.6
0.7
0.8
0.9
1.0
Sam
pl
la
ti
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 48/50
Conclusions New method for pooling info across studies
using different versions of Affymetrix chips. Recombine matched probes into new
probesets using Unigene clusters. Method appears to obtain comparable
expression levels across chips without sacrificingmuch precision or significantly altering therelative ordering of the samples.
Worked well combining information acrossHuGeneFL/U95Av2, but not U95Av2/U133A
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 49/50
Conclusions
Discussed new probeset definition based on
full-length transcript sequences. Removes effect of known alternative splicing Yields stronger between-chip correlations than
Affymetrix standard definitions
Pooling information across studies is difficult there is still more work to be done but worththe effort.
8/7/2019 Pooling data across Micorarray
http://slidepdf.com/reader/full/pooling-data-across-micorarray 50/50
ReferencesMorris JS, Yin G, Baggerly KA, Wu C, and Zhang L (2005).Pooling Information Across Different Studies and OligonucleotideMicroarray Chip Types to Identify Prognostic Genes for LungCancer. Methods of Microarray Data Analysis IV, eds. JSShoemaker and SM Lin, pp. 51-66, New York: Springer-Verlag.
Wu C, Morris JS, Baggerly KA, Coombes KR, Minna JD, andZhang L (2005). A probe-to-transcripts mapping method forcross-platform comparisons of microarray data taking into
account the effects of alternative splicing. Under review.
Morris JS, Wu C, Coombes KR, Baggerly KA, Wang J, and ZhangL (2006). Alternative Probeset Definitions for CombiningMicroarray Data Across Studies Using Different Versions of Affymetrix Oligonucleotide Arrays. To appear in Meta-Analysis in
Genetics edited by Rudy Guerra and David Allison Chapman