Biology and Cells All living organisms consist of cells. Humans have trillions of cells. Yeast - one...
Biology and Cells
• All living organisms consist of cells.
• Humans have trillions of cells. Yeast - one cell.
• Cells are of many different types (blood, skin, nerve), but all arose from a single cell (the fertilized egg)
• Each cell contains a complete copy of the genome (the program for making the organism), encoded in DNA.
DNA
• DNA molecules are long double-stranded chains; 4 types of bases are attached to the backbone: adenine (A), guanine (G), cytosine (C), and thymine (T). A pairs with T, C with G.
• A gene is a segment of DNA that specifies how to make a protein.
• Human DNA has about 30,000-35,000 genes.
• Rice has about 50,000-60,000, but shorter genes.
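The pairing rule on this slide (A with T, C with G) means one strand determines the other. A minimal Python sketch follows; the function names are made up for illustration, and the reversed variant reflects the fact that the two strands run in opposite directions, which the slide does not state.

```python
# Watson-Crick pairing from the slide: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (strands are antiparallel)."""
    return complement_strand(seq)[::-1]


print(complement_strand("ATGC"))   # TACG
print(reverse_complement("ATGC"))  # GCAT
```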
Exons and Introns: Data and Logic?
• Exons are coding DNA (translated into protein); they make up only about 2% of the human genome
• Introns are non-coding DNA, which provide structural integrity and regulatory (control) functions
• Exons can be thought of as program data, while introns provide the program logic
• Humans have much more control structure than rice
Gene Expression
• Cells are different because of differential gene expression.
• About 40% of human genes are expressed at one time.
• A gene is expressed by transcribing DNA into single-stranded mRNA
• mRNA is later translated into a protein
• Microarrays measure the level of mRNA expression
Gene Expression Measurement
• mRNA expression represents dynamic aspects of the cell
• mRNA expression can be measured with microarray technology
• mRNA is isolated and labeled with a fluorescent dye
• The labeled mRNA is hybridized to the target; the level of hybridization corresponds to the light emitted when the spot is excited by a laser scanner
Molecular Biology Overview
[Figure labels: Cell, Nucleus, Chromosome, Gene (DNA), Gene (mRNA) single strand, Protein. Graphics courtesy of the National Human Genome Research Institute.]
Gene Expression Microarrays
The main types of gene expression microarrays:
• Short oligonucleotide arrays (Affymetrix)
• cDNA or spotted arrays (Brown/Botstein)
• Long oligonucleotide arrays (Agilent Inkjet)
• Fiber-optic arrays
• ...
DNA Chip Microarrays
• Put a large number (~100K) of cDNA sequences or synthetic DNA oligomers onto a glass slide in known locations on a grid.
• Label an RNA sample and hybridize (label 2 RNA samples with 2 different colors of fluorescent dye: control vs. experimental)
• Mix the two labeled RNAs and hybridize to the chip
• Measure amounts of RNA bound to each square in the grid
• Make comparisons
– Cancerous vs. normal tissue
– Treated vs. untreated
– Time course
Spot your own Chip (plans available for free from Pat Brown’s website)
Robot spotter
Ordinary glass microscope slide
cDNA Spotted Microarrays
Affymetrix “Gene chip” system
• Uses 25-base oligos synthesized in place on a chip (20 pairs of oligos for each gene)
• RNA labeled and scanned in a single “color”
– one sample per chip
• Can have as many as 20,000 genes on a chip
• Arrays get smaller every year (more genes)
• Chips are expensive
• Proprietary system: “black box” software, can only use their chips
Affymetrix Microarrays
[Figure: raw image of the chip; dimension labels 50 µm and 1.28 cm]
• ~10^7 oligonucleotides: half perfectly match the mRNA (PM), half have one mismatch (MM)
• Raw gene expression is the intensity difference: PM - MM
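The PM - MM difference can be illustrated with a short Python sketch. The slide gives only the per-pair difference; averaging it over the probe pairs of a gene is an assumption made here for illustration, and the function name and intensity values are hypothetical.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of PM - MM over the probe pairs of one gene.

    Averaging over pairs is an assumption; the slide only gives PM - MM.
    """
    if len(pm) != len(mm):
        raise ValueError("PM and MM must have one value per probe pair")
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for four probe pairs of one gene.
pm = [1200.0, 980.0, 1500.0, 1100.0]
mm = [300.0, 410.0, 350.0, 1250.0]   # last pair: MM brighter than PM
print(average_difference(pm, mm))    # 617.5
```

A pair in which MM is brighter than PM contributes a negative difference, which is one way negative expression values can arise on this kind of chip.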
Data Acquisition
• Scan the arrays
• Quantitate each spot
• Subtract background
• Normalize
• Export a table of fluorescent intensities for each gene in the array
Normalization
• Can control for many of the experimental sources of variability (systematic, not random or gene specific)
• Bring each image to the same average brightness
• Can use simple math or fancy:
– divide by the mean (whole chip or by sectors)
– LOESS (locally weighted regression)
• No sure biological standards
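The simplest option above, dividing by the mean so that every chip ends up with the same average brightness, might look like the following Python sketch. The function name and the target value of 1000 are arbitrary choices for illustration, and the intensities are invented.

```python
from statistics import mean


def scale_to_target(intensities, target=1000.0):
    """Rescale one chip so that its mean intensity equals `target`."""
    factor = target / mean(intensities)
    return [v * factor for v in intensities]


chip_a = [200.0, 400.0, 600.0]      # dimmer scan, mean 400
chip_b = [500.0, 1000.0, 1500.0]    # brighter scan, mean 1000
norm_a = scale_to_target(chip_a)
norm_b = scale_to_target(chip_b)
print(norm_a)  # [500.0, 1000.0, 1500.0]
print(norm_b)  # [500.0, 1000.0, 1500.0]
```

Normalizing by sectors would apply the same function to each region of the chip separately.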
Multiple Comparisons
• In a microarray experiment, each gene (each probe or probe set) is really a separate experiment
• Yet if you treat each gene as an independent comparison, you will always find some with significant differences
– (the tails of a normal distribution)
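A small simulation makes the point concrete. In this Python sketch every "gene" is pure noise, yet roughly 5% of them pass a single-test cutoff at p < 0.05. The gene count, the seed, and the use of z-scores are assumptions for illustration, and the Bonferroni correction shown is one common remedy that the slides do not name.

```python
import random
from statistics import NormalDist

random.seed(0)
n_genes = 10_000
alpha = 0.05

# Null data: every "gene" gets a z-score from N(0, 1), i.e. no real effect.
z_scores = [random.gauss(0.0, 1.0) for _ in range(n_genes)]

# Two-sided cutoff for a single test at level alpha (about 1.96).
z_single = NormalDist().inv_cdf(1 - alpha / 2)
false_hits = sum(abs(z) > z_single for z in z_scores)

# Bonferroni: divide alpha by the number of tests before choosing the cutoff.
z_bonf = NormalDist().inv_cdf(1 - alpha / (2 * n_genes))
bonf_hits = sum(abs(z) > z_bonf for z in z_scores)

print(f"expected false positives: {alpha * n_genes:.0f}")
print(f"observed at p < {alpha}: {false_hits}")
print(f"observed after Bonferroni: {bonf_hits}")
```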
Microarray Potential Applications
• Biological discovery
– new and better molecular diagnostics
– new molecular targets for therapy
– finding and refining biological pathways
• Recent examples
– molecular diagnosis of leukemia, breast cancer, ...
– appropriate treatment for genetic signature
– potential new drug targets
Microarray Data Analysis Types
• Gene Selection
– find genes for therapeutic targets
– avoid false positives (FDA approval ?)
• Classification (Supervised)
– identify disease (biomarker study)
– predict outcome / select best treatment
• Clustering (Unsupervised)
– find new biological classes / refine existing ones
– understand regulatory relationships/pathways
– exploration
• …
Microarray Data Mining Challenges
• too few records (samples), usually < 100
• too many columns (genes), usually > 1,000
• too many columns are likely to lead to false positives
• for exploration, a large set of all relevant genes is desired
• for diagnostics or identification of therapeutic targets, the smallest set of genes is needed
• model needs to be explainable to biologists
Data Preparation Issues
• Thresholding: usually min 20, max 16,000
– For older Affy chips (new Affy chips do not have negative values)
• Filtering - remove genes with insufficient variation
– e.g. MaxVal - MinVal < 500 and MaxVal/MinVal < 5
– biological reasons
– feature reduction for algorithmic reasons
• For clustering, normalize each gene (sample) separately to Mean = 0, Std. Dev = 1
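The three preparation steps above can be sketched in Python as follows. The thresholds are the slide's example values. The function names, the gene table, the order of the steps, and the choice of the population standard deviation (`pstdev`) are assumptions made for illustration.

```python
from statistics import mean, pstdev


def threshold(values, lo=20.0, hi=16000.0):
    """Clip each intensity into [lo, hi]."""
    return [min(max(v, lo), hi) for v in values]


def has_insufficient_variation(values, min_diff=500.0, min_ratio=5.0):
    """Slide's example rule: MaxVal - MinVal < 500 and MaxVal/MinVal < 5.

    Expects thresholded values, so the minimum is never zero or negative.
    """
    top, bottom = max(values), min(values)
    return (top - bottom) < min_diff and (top / bottom) < min_ratio


def standardize(values):
    """Rescale to mean 0 and standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical expression table: gene -> intensities across four samples.
raw = {
    "geneA": [-15.0, 40.0, 60.0, 90.0],        # low and flat: filtered out
    "geneB": [100.0, 2500.0, 900.0, 18000.0],  # varies strongly: kept
}
clipped = {g: threshold(v) for g, v in raw.items()}
kept = {g: v for g, v in clipped.items() if not has_insufficient_variation(v)}
scaled = {g: standardize(v) for g, v in kept.items()}
print(sorted(kept))  # ['geneB']
```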
Normalization issues
• Within-slide
– What genes to use
– Location
– Scale
• Paired-slides (dye swap)
– Self-normalization
• Between slides
[Figure: workflow for a filter-array experiment]
Control RNA sample and test RNA sample
→ Reverse transcription into radio-labelled cDNA probes
→ Hybridization to microarray filters
→ Use Phosphor Imager laser scanner to obtain densities of each spot on filter.
→ Compare densities at each spot to determine if treatment changes gene expression. Compile subset of differentially expressed genes.

| Gene | Control | Test |
|------|---------|------|
| A    | 1X      | 3X   |
| :    | :       | :    |
| Z    | 1X      | 0.5X |
Normalization continued
• Intensity-dependent normalization (Yang, YH, 2002)
– Do M-A plot to check the data distribution, where $M = \log_2(T/C)$ and $A = \log_2\sqrt{T \cdot C}$
– Use Lowess function in R to perform normalization: $\log_2(T/C) \rightarrow \log_2(T/C) - c(A) = \log_2\big(T/(kC)\big)$, where $c(A)$ is the lowess fit to the M-A plot
– Transform data by $M' = M - c(A)$.
– Locally nonparametric method; robust to a small number of differentially expressed genes.
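The intensity-dependent correction can be sketched in Python as below, taking T and C to be the test and control intensities (an assumption). The slide uses the Lowess function in R; the `lowess_fit` helper here is a simplified stand-in written for illustration, a tricube-weighted local linear fit with no robustness iterations, so it does not have the robustness to outlying genes that the slide mentions. The intensities are invented.

```python
import math


def lowess_fit(x, y, frac=0.5):
    """Fitted value at each x from a tricube-weighted local linear fit."""
    n = len(x)
    k = max(2, math.ceil(frac * n))          # neighbours per local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest point (x0 itself counts).
        h = max(sorted(abs(xi - x0) for xi in x)[k - 1], 1e-12)
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical control (C) and test (T) intensities with a dye bias that
# grows with intensity: log2(T/C) = 0.25 * log2(C) - 1.5 for every gene.
C = [2.0 ** e for e in range(4, 14)]
T = [c * 2.0 ** (0.25 * math.log2(c) - 1.5) for c in C]

M = [math.log2(t / c) for t, c in zip(T, C)]
A = [0.5 * math.log2(t * c) for t, c in zip(T, C)]
c_A = lowess_fit(A, M)                        # c(A): fit to the M-A plot
M_norm = [m - c for m, c in zip(M, c_A)]      # M' = M - c(A)
print(max(abs(m) for m in M_norm))            # close to 0
```

Because the invented bias is exactly linear in A, the local fit removes it completely; real data would leave the genuinely differentially expressed genes standing out from a trend centred on zero.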
(R,G) → (M,A) Transformation
“Observed” data {(R,G)}
R = red channel signal
G = green channel signal
(background corrected or not)
Transformed data {(M,A)}
$M = \log_2(R/G)$ (ratio),
$A = \log_2 (R \cdot G)^{1/2} = \tfrac{1}{2}\log_2(R \cdot G)$ (intensity)
$R = (2^{2A+M})^{1/2}$, $G = (2^{2A-M})^{1/2}$
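The transformation and its inverse can be checked with a few lines of Python. The function names and signal values are hypothetical; the formulas are the ones on this slide.

```python
import math


def rg_to_ma(r, g):
    """(R, G) -> (M, A): M = log2(R/G), A = 0.5 * log2(R*G)."""
    return math.log2(r / g), 0.5 * math.log2(r * g)


def ma_to_rg(m, a):
    """Inverse: R = sqrt(2**(2A + M)), G = sqrt(2**(2A - M))."""
    return math.sqrt(2.0 ** (2 * a + m)), math.sqrt(2.0 ** (2 * a - m))


# Hypothetical red and green signals: R is four times G.
m, a = rg_to_ma(1024.0, 256.0)
r_back, g_back = ma_to_rg(m, a)
print(m, a)            # M = 2 and A = 9, up to rounding
print(r_back, g_back)  # recovers 1024 and 256, up to rounding
```

A fourfold ratio gives M = 2, and A is the log of the geometric mean of the two channels, which is why the M-A plot separates ratio from overall intensity.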
Normalization
• Regression normalization:
– Fit the linear regression model: $y_i = \alpha + \beta x_i + \varepsilon_i$
– Assumption: all the genes on the array have the same variance (homogeneity)
– Test the significance of the intercept $\alpha$. Fit a linear regression without $\alpha$ if it is insignificant.
– Transform the treatment data: $y_i' = (y_i - \hat{\alpha})/\hat{\beta}$
– Problem:
  • assumption may not hold
  • nonlinear trend (the third replicate of the RL95 data has a slight quadratic trend).
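Regression normalization can be sketched in Python as below, with x the control log intensities and y the treatment log intensities. The transform used, subtracting the fitted intercept and dividing by the fitted slope, is an assumed form, since the formula on the slide is not fully legible. The significance test on the intercept is omitted, and the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def regression_normalize(x, y):
    """Map y onto the scale of x: (y - alpha_hat) / beta_hat (assumed form)."""
    alpha, beta = fit_line(x, y)
    return [(yi - alpha) / beta for yi in y]


# Hypothetical log intensities: control x, treatment y = 0.5 + 1.2 * x.
x = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.5 + 1.2 * xi for xi in x]
alpha, beta = fit_line(x, y)
y_norm = regression_normalize(x, y)
print(round(alpha, 6), round(beta, 6))   # 0.5 1.2
print([round(v, 6) for v in y_norm])     # [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

With a purely linear bias the normalized treatment values fall on the line y = x; a quadratic trend like the one noted for the third replicate would remain after this step.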
Scatter plot of log intensity before and after regression normalization
[Figure: six scatter plots. Three panels titled "scatter plot of DMSO vs BAP" and three titled "scatter plot after norm", one pair per replicate, with axes log(dmso1)/log(bap1), log(dmso2)/log(bap2), and log(dmso3)/log(bap3).]