Application of available statistical tools Development of specific, more appropriate statistical...
Uploaded by stephan-hamblett
![Page 1: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/1.jpg)
Obstacles that thwart a successful analysis of microarray data
Biology experiment complete
Application of available statistical tools
Development of specific, more appropriate statistical tools for use with microarrays
Functional annotation of results
Inadequate computer skills to handle large datasets
Intimacy with the nature (strengths and deficiencies) of the raw data
Facile use of computer operating system is absent
Biological interpretation
Thorough mining of the data for useful information
![Page 2: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/2.jpg)
Benefits of the Gene Array Approach
1. Interrogates thousands of genes (12,000; 55,000; 28,869).
2. Versatile with respect to tissues.
3. Recently expanded beyond major biomedical research models.
4. Asks which genes are affected by a treatment.
5. Equivalent to 35,000 northern blots overnight.
6. Time course experiments gain immense value.
![Page 3: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/3.jpg)
GeneChip
![Page 4: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/4.jpg)
What is covered in this course?
Total RNA
Hyb. cRNA cocktail
Hybridize to Affy arrays
Generate Affy.dat file
Output as Affy.chp file
Export as text file
Data mining
Pattern mining
Pathway analysis
Case/UH Core Facility
Illumina platform at CCF facility
![Page 5: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/5.jpg)
WT Expression Array Design
Perfect Match (PM): 25-mer DNA oligo
Perfect Match / Mismatch probe pairs (5’ to 3’); only PM used
Probe Set (<= 26 probes)
PM Probe Cell: 11 µm x 11 µm
Validate using BLAST and Tm
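The Tm check named on this slide can be illustrated with a rough calculation. The sketch below is only an assumption about what such a check might look like: it uses a basic GC-content rule of thumb for oligos longer than 13 nt, Tm = 64.9 + 41 * (G + C - 16.4) / N, and the 25-mer shown is hypothetical. The slides do not say which Tm model or BLAST settings were used in the actual probe design.

```python
def gc_count(seq: str) -> int:
    """Number of G and C bases in a DNA sequence."""
    s = seq.upper()
    return s.count("G") + s.count("C")


def basic_tm(seq: str) -> float:
    """Rough melting temperature in degrees C from GC content.

    Assumed rule of thumb (not taken from the slides):
    Tm = 64.9 + 41 * (G + C - 16.4) / N, intended for oligos longer than 13 nt.
    """
    n = len(seq)
    if n <= 13:
        raise ValueError("this rule of thumb is meant for oligos longer than 13 nt")
    return 64.9 + 41.0 * (gc_count(seq) - 16.4) / n


# Hypothetical 25-mer perfect-match probe (illustrative sequence only).
probe = "ATGCATGCATGCATGCATGCATGCA"
probe_tm = basic_tm(probe)
print(len(probe), gc_count(probe), round(probe_tm, 2))
```

A design pipeline would typically keep only probes whose estimated Tm falls in a narrow window, so that all probes on the array hybridize under the same conditions.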
![Page 6: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/6.jpg)
cRNA preparation
1. Conversion to cRNA
2. Amplification (linear)
3. Labelling (biotin)
Total RNA (1-5 µg)
cDNA strand 1 synthesis: SS II reverse transcriptase, T7 RNA pol. promoter
cDNA strand 2 synthesis: E. coli DNA pol. I
IVT cRNA synthesis (T7 RNA pol.) amplifies and labels transcripts with biotin
Fragmented cRNA
cRNA is now ready for hybridization to test chip
[Diagram: poly(A) mRNA (NNNN...AAAA), oligo(dT) primer (TTTT) with T7 RNA pol. promoter, and U-containing cRNA copies]
![Page 7: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/7.jpg)
Hybridization cocktail
Affymetrix Array Chip
Sample is added to a hybridization cocktail along with spiked control transcripts and is loaded onto an array chip.
Chip is placed in a hybridization oven and incubated overnight.
Chips are placed in the fluidics station, where they are washed, stained and washed again (2.5 hours).
After staining, the signal intensities are measured with a laser scanner (15 min).
Data are acquired by the computer as soon as the scan has been completed.
![Page 8: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/8.jpg)
![Page 9: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/9.jpg)
The first image is “sample1.dat”. Note the pixel-to-pixel variation within a probe cell.
A “*.cel” file is automatically generated when the “*.dat” image first appears on the screen. Note that this derivative file has homogeneous signal intensity within its probe cells.
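The step from many pixels per probe cell to one homogeneous value per cell can be sketched as follows. This is a minimal illustration under stated assumptions: the border pixels of each cell are dropped and the remaining pixels are summarized by their 75th percentile. Both choices are assumptions for the sketch, since the slides do not state how the scanner software computes the cell value, and the pixel grid is made up.

```python
def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, with linear interpolation."""
    data = sorted(values)
    if len(data) == 1:
        return float(data[0])
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def cell_intensity(pixels, q=75.0, trim=1):
    """Collapse a 2D block of pixel intensities into one probe-cell value.

    `trim` border pixels are removed on every side (assumption), then the
    q-th percentile of the remaining pixels is returned (assumption).
    """
    inner_rows = pixels[trim:len(pixels) - trim]
    flat = [v for row in inner_rows for v in row[trim:len(row) - trim]]
    if not flat:
        raise ValueError("no pixels left after trimming the border")
    return percentile(flat, q)


# Hypothetical 4 x 4 pixel block for one probe cell of a .dat image.
pixels = [
    [10, 12, 11, 9],
    [13, 100, 104, 12],
    [11, 96, 108, 10],
    [9, 12, 13, 11],
]
cel_value = cell_intensity(pixels)
print(cel_value)
```

Every pixel of the cell is then represented by this single number, which is why the derived image looks uniform within each probe cell.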
![Page 10: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/10.jpg)
How do we get the individual gene signals using RMA in EC?
[Figure: for each of Sample 1, Sample 2 and Sample 3, the twelve probes g1p1 to g3p4 (four probes each for Gene 1, Gene 2 and Gene 3) are shown in gene order and in scattered order; label: Average]
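The regrouping step suggested by the figure labels can be sketched in a few lines: probe cells are read in whatever order they sit on the array, collected by gene, and averaged. This is a simplification for illustration. The intensities below are hypothetical, and RMA itself uses a more robust summary than a plain mean.

```python
from collections import defaultdict


def average_by_gene(probe_cells):
    """Group (probe_id, intensity) pairs by gene and average each group.

    Probe ids follow the g<gene>p<probe> pattern used on the slide,
    so "g1p3" belongs to gene "g1".
    """
    groups = defaultdict(list)
    for probe_id, intensity in probe_cells:
        gene = probe_id.split("p")[0]
        groups[gene].append(intensity)
    return {gene: sum(vals) / len(vals) for gene, vals in sorted(groups.items())}


# Hypothetical intensities for one sample, listed in scattered array order.
sample_1 = [
    ("g1p3", 210), ("g2p4", 160), ("g1p2", 220), ("g3p2", 90),
    ("g1p1", 214), ("g3p1", 100), ("g2p3", 140), ("g2p2", 150),
    ("g3p3", 95), ("g2p1", 150), ("g1p4", 220), ("g3p4", 95),
]
gene_signal = average_by_gene(sample_1)
print(gene_signal)
```

The physical position of a probe on the chip plays no role here; only the probe-to-gene mapping matters.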
![Page 11: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/11.jpg)
[Figure: probes g1p1 to g3p4 (four probes each for Gene 1, Gene 2 and Gene 3) laid out for Sample 1, Sample 2 and Sample 3, in gene order and in scattered order]
![Page 12: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/12.jpg)
[Figure: probes g1p1 to g3p4 (four probes each for Gene 1, Gene 2 and Gene 3) laid out for Sample 1, Sample 2 and Sample 3, reduced to one signal per gene per sample]

| Gene | Sample 1 | Sample 2 | Sample 3 |
|---|---|---|---|
| Gene 1 | 216 | 50 | 150 |
| Gene 2 | 150 | 300 | 120 |
| Gene 3 | 95 | 112 | 110 |
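For readers who want to see how probe-level values can become one number per gene per sample, here is a minimal pure-Python sketch of two steps generally associated with RMA: quantile normalization across samples and median-polish summarization of a probe set on the log2 scale. It omits background correction, the input values are hypothetical, and it is not the Expression Console implementation.

```python
from statistics import median


def quantile_normalize(columns):
    """Give every sample (column) the same distribution of values.

    The reference distribution is the mean of the sorted columns;
    ties are broken by position.
    """
    n = len(columns[0])
    sorted_cols = [sorted(c) for c in columns]
    ref = [sum(sc[i] for sc in sorted_cols) / len(columns) for i in range(n)]
    out = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        norm = [0.0] * n
        for rank, idx in enumerate(order):
            norm[idx] = ref[rank]
        out.append(norm)
    return out


def median_polish(matrix, n_iter=10, tol=1e-6):
    """Tukey median polish on a probes x samples matrix of log2 values.

    Returns one value per sample: overall level plus that sample's effect.
    """
    resid = [row[:] for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall, row_eff, col_eff = 0.0, [0.0] * n_rows, [0.0] * n_cols
    for _ in range(n_iter):
        rmed = [median(row) for row in resid]          # sweep probe effects
        for i in range(n_rows):
            row_eff[i] += rmed[i]
            resid[i] = [v - rmed[i] for v in resid[i]]
        shift = median(col_eff)
        overall += shift
        col_eff = [c - shift for c in col_eff]
        cmed = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):                        # sweep sample effects
            col_eff[j] += cmed[j]
            for i in range(n_rows):
                resid[i][j] -= cmed[j]
        shift = median(row_eff)
        overall += shift
        row_eff = [r - shift for r in row_eff]
        if max(abs(x) for x in rmed + cmed) < tol:
            break
    return [overall + c for c in col_eff]


# Hypothetical log2 probe values: 4 probes of one gene in 3 samples.
probe_effects = [0.0, 1.0, -1.0, 2.0]
sample_levels = [8.0, 6.0, 7.0]
log2_pm = [[s + p for s in sample_levels] for p in probe_effects]
gene_log2_signal = median_polish(log2_pm)

normalized = quantile_normalize([[1.0, 2.0, 3.0], [6.0, 4.0, 2.0]])
print(gene_log2_signal, normalized)
```

In a full workflow the normalization would be applied to all probes on all arrays first, and the polish would then be run once per probe set; the median-based fit makes the gene signal less sensitive to a single aberrant probe than a plain average.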
![Page 13: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/13.jpg)
Data manipulation is essential prior to submission of results to third-party clustering and analytical programs

| Probe set | Pairs | Pairs used | Pos | Neg | Ave Diff | Abs. Call |
|---|---|---|---|---|---|---|
| YDL200C | 20 | 18 | 16 | 2 | 2378 | P |
| YDL200D | 20 | 19 | 16 | 3 | 237 | M |
| YDM167A | 20 | 14 | 7 | 7 | 5003 | A |

Diff Call: NC, I, MI, MD, D
Fold Change: 10.5, 4.9, 15, -11.8, -3.7
SOMs, hierarchical clustering, plaid clustering
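One example of such manipulation is converting between fold-change conventions before handing data to another program. The sketch below assumes a common signed convention in which decreases are reported as negative fold changes, and shows the symmetric log2 ratio that many clustering tools expect instead. The slides do not say how the fold-change values shown were computed, so this is an illustration rather than the vendor's formula; the input signals are illustrative.

```python
import math


def signed_fold_change(baseline, experiment):
    """Fold change where increases are positive and decreases negative.

    Assumed convention: 50 -> 300 gives 6.0, 300 -> 50 gives -6.0.
    """
    if baseline <= 0 or experiment <= 0:
        raise ValueError("signals must be positive")
    if experiment >= baseline:
        return experiment / baseline
    return -(baseline / experiment)


def log2_ratio(baseline, experiment):
    """Symmetric alternative: +1 means doubled, -1 means halved."""
    if baseline <= 0 or experiment <= 0:
        raise ValueError("signals must be positive")
    return math.log2(experiment / baseline)


# Illustrative signals for one gene in a baseline and a treated sample.
up = signed_fold_change(50, 300)
down = signed_fold_change(216, 50)
doubling = log2_ratio(50, 200)
print(up, round(down, 2), doubling)
```

Signed fold changes have a gap between -1 and 1, which distorts distance calculations; log2 ratios avoid that and are usually the safer input for clustering.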
![Page 14: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/14.jpg)
![Page 15: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/15.jpg)
SOMs
Self-organizing maps (SOMs) are a popular method for detecting patterns in large data sets.
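To make the idea concrete, here is a minimal one-dimensional SOM in pure Python. It is a sketch under stated assumptions, not the algorithm of any particular package: expression profiles are log2-transformed and centered per gene, nodes are initialized from randomly chosen profiles, and the learning rate and neighbourhood radius decay linearly. All parameter values are illustrative.

```python
import math
import random


def center_log2(row):
    """Log2-transform a gene's signals and subtract the gene's mean."""
    logs = [math.log2(v) for v in row]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


def best_matching_unit(nodes, x):
    """Index of the node closest to profile x (squared Euclidean distance)."""
    return min(range(len(nodes)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(nodes[k], x)))


def train_som(profiles, n_nodes=3, n_epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D SOM; returns the list of node weight vectors."""
    rng = random.Random(seed)
    nodes = [list(rng.choice(profiles)) for _ in range(n_nodes)]
    for epoch in range(n_epochs):
        decay = 1.0 - epoch / n_epochs
        lr = lr0 * decay
        radius = max(radius0 * decay, 0.1)
        order = list(profiles)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(nodes, x)
            for k, node in enumerate(nodes):
                # Nodes near the winner on the map move more than distant ones.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(len(node)):
                    node[d] += lr * h * (x[d] - node[d])
    return nodes


# Illustrative gene signals across three samples.
signals = [[216, 50, 150], [150, 300, 120], [95, 112, 110]]
profiles = [center_log2(row) for row in signals]
nodes = train_som(profiles)
labels = [best_matching_unit(nodes, p) for p in profiles]
print(labels)
```

Genes that end up on the same node share a similar expression pattern across samples, and neighbouring nodes hold related patterns, which is what distinguishes a SOM from a plain partitioning method.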
![Page 16: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/16.jpg)
![Page 17: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/17.jpg)
![Page 18: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/18.jpg)
![Page 19: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/19.jpg)
The Human Genome
The information content of the human genome
7% not transcribed
Protein-coding genes:
1% ORF
1% UTR
35-40% intron
Non-protein-coding RNAs
Small RNAs
~10%
Functional long ncRNAs
ENCODE Consortium (Nature 2007 Vol 447: 799-816)
![Page 20: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/20.jpg)
The increase in complexity among eukaryotes is concomitant with an increase in the ratio of non-coding to coding DNA
Mattick, 2007
![Page 21: Application of available statistical tools Development of specific, more appropriate statistical tools for use with microarrays Functional annotation of.](https://reader036.fdocuments.us/reader036/viewer/2022062409/56649c9d5503460f9495c128/html5/thumbnails/21.jpg)
Obstacles that thwart a successful analysis of microarray data
Biology experiment complete
Application of available statistical tools
Development of specific, more appropriate statistical tools for use with microarrays
Functional annotation of results
Inadequate computer skills to handle large datasets
Intimacy with the nature (strengths and deficiencies) of the raw data
Facile use of computer operating system is absent
Biological interpretation
Thorough mining of the data for useful information