Introduction to DNA Microarray Neha Jain Lecturer School of Biotechnology Devi Ahilya University, Indore


Introduction to DNA Microarray

Neha Jain

Lecturer

School of Biotechnology

Devi Ahilya University, Indore


Genes can be regulated at many levels

DNA → (transcription) → RNA → (translation) → protein

• Transcription
• Post-transcription (RNA stability)
• Post-transcription (translational control)
• Post-translation (not considered gene regulation)

Usually, when we speak of gene regulation, we are referring to transcriptional regulation. The complete set of genes being transcribed is referred to as the “transcriptome.”


• In the last dozen years, it has become possible to look at the entire transcriptome in a single experiment!

• High throughput: simultaneous analysis of all genes in a genome.

• The high-throughput analysis of all expressed genes is termed transcriptome analysis: the expression analysis of the full set of RNA molecules produced by a cell under a given set of conditions.

• Transcriptome analysis facilitates our understanding of how sets of genes work together to form metabolic, regulatory, and signalling pathways within the cell.


Genomic analysis of gene expression

• Methods capable of giving a “snapshot” of RNA expression of all genes

• Can be used as a diagnostic profile
  – Example: cancer diagnosis

• Can show how RNA levels change during development, after exposure to a stimulus, during the cell cycle, etc.

• Provides large amounts of data

• Can help us start to understand how whole systems function


Types of Gene Expression Analysis

• While there are a number of variations, there are essentially two basic ways of doing expressed gene analysis: sequencing-based methods and microarrays.

• These have largely replaced older methods such as subtractive hybridization and differential display.

• Sequencing-based methods are very powerful but have typically been prohibitively expensive.

• However, with recent advances in low-cost, high-throughput next generation sequencing, these methods—referred to as “RNA-seq”—are becoming more common and may soon be dominant.


RNA-seq

• Although details of the methods vary, the concept behind RNA-seq is simple:
  • Isolate all mRNA
  • Convert to cDNA using reverse transcriptase
  • Sequence the cDNA
  • Map sequences to the genome

• The more times a given sequence is detected, the more abundantly transcribed it is.

• If enough sequences are generated, a comprehensive and quantitative view of the entire transcriptome of an organism or tissue can be obtained.
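The counting idea behind RNA-seq can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reads are taken to be already mapped, each represented by a hypothetical gene label, and the scaling to counts per million (CPM) is a common convention that the slides do not mention. Gene length is ignored.

```python
from collections import Counter

def counts_per_million(read_assignments):
    """Count mapped reads per gene and scale the counts to counts per million."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {gene: 1_000_000 * n / total for gene, n in counts.items()}

# Hypothetical toy input: one gene label per mapped read.
reads = ["geneA", "geneB", "geneA", "geneC", "geneA", "geneB"]
cpm = counts_per_million(reads)

# Genes detected more often are reported as more abundantly transcribed.
for gene in sorted(cpm, key=cpm.get, reverse=True):
    print(gene, round(cpm[gene]))
```

Scaling by the total number of reads makes samples sequenced to different depths roughly comparable.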


DNA microarrays

• Microarrays may eventually be eclipsed by sequence-based methods, but meanwhile have become incredibly popular since their inception in 1995 (Schena et al. (1995) Science 270:467-70).

• DNA microarrays rely on the hybridization properties of nucleic acids to monitor DNA or RNA abundance on a genomic scale in different types of cells.

• In other words, the principle behind microarrays is the ability of complementary strands of DNA (or DNA and RNA) to hybridize to one another in solution with high specificity.


Nucleic acid hybridization


Introduction

• A microarray (or gene chip) is a slide carrying a high-density array of immobilized DNA oligomers (sometimes cDNAs) representing the entire genome of the species under study.

• Each DNA is attached to a solid support
  – Glass, plastic, or nylon

• Each oligomer is spotted on the slide and serves as a probe for binding to a unique, complementary cDNA.

• The cDNA population, labelled with fluorescent dyes or radioisotopes, is allowed to hybridize with the oligo probes on the chip.

• The amount of fluorescent or radioactive label at each spot position reflects the amount of the corresponding mRNA in the cell.

• Sets of genes involved in the same regulatory or metabolic pathways can potentially be identified.


The Process

Building the chip:
• Massive PCR
• PCR purification and preparation
• Preparing slides
• Printing

DNA/RNA preparation:
• Cell culture and harvest
• RNA isolation
• cDNA production

Hybridizing the chip:
• Array hybridization
• Probe labeling
• Data analysis
• Post processing


• For each spot on the microarray, red and green fluorescence signals are recorded.
• The two fluorescence images from the scanner are then overlaid to create a composite image, which indicates the relative expression levels of each gene.
• Thus, the measurement from the composite image reflects the ratio of the two color intensities.
• If a gene is expressed at a higher level in the experimental condition (red) than in the control (green), the spot displays a reddish color.
• If the gene is expressed at a lower level than the control, the spot appears greenish.
• Unchanged gene expression, having equal amounts of green and red fluorescence, results in a yellow spot.
• The colored image is stored as a computer file (in TIFF format) for further processing.
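The color reading of a spot described above can be sketched as a small Python function. The function name and the twofold threshold separating "reddish" or "greenish" from "yellow" are assumptions made for illustration; the slides only say that equal red and green signals give a yellow spot.

```python
import math

def spot_color(red, green, fold=2.0):
    """Classify a spot from its red (experimental) and green (control) signal.

    `fold` is a hypothetical threshold: ratios within it are called yellow.
    """
    if red <= 0 or green <= 0:
        return "no signal"
    log_ratio = math.log2(red / green)
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "red"      # higher in the experimental condition
    if log_ratio <= -threshold:
        return "green"    # lower than in the control
    return "yellow"       # roughly unchanged

print(spot_color(4000, 1000))
print(spot_color(1000, 4000))
print(spot_color(1000, 1100))
```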


Microarray life cycle

Biological Question → Sample Preparation → Microarray Reaction → Microarray Detection → Data Analysis → (back to Biological Question)

Taken from Schena & Davis


Steps of Microarray Experiment

A typical DNA microarray experiment involves a multistep procedure:
• Fabrication of microarrays by fixing properly designed oligonucleotides representing specific genes;
• Hybridization of cDNA populations onto the microarray;
• Scanning hybridization signals and image analysis;
• Transformation and normalization of data;
• Analyzing data to identify differentially expressed genes as well as sets of genes that are co-regulated.


Some Important Points about Microarray

• DNA microarrays are generated by fixing oligonucleotides onto a solid support such as a glass slide using a robotic device.
• The probes should be specific enough to minimize cross-hybridization with non-target genes. This requires BLAST searches against genome databases to find sequence regions with the least sequence similarity to non-target genes.
• The probes should be sensitive and devoid of low-complexity regions (a string of identical nucleotides).
• The oligonucleotide sequences should not form stable internal secondary structures.
• A number of programs have been developed for designing probe sequences for microarray spotting:
  – OligoWiz
  – OligoArray
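Two of the probe criteria above (no low-complexity runs, no stable internal secondary structure) can be screened with a rough Python sketch. This is not how OligoWiz or OligoArray work. The function names and the thresholds (`max_run`, `stem`, `min_loop`) are hypothetical, the hairpin test only looks for perfectly complementary arms, and the specificity check by BLAST is not covered.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def longest_homopolymer(seq):
    """Length of the longest run of one repeated nucleotide."""
    runs = re.finditer(r"A+|C+|G+|T+", seq.upper())
    return max((len(m.group(0)) for m in runs), default=0)

def has_hairpin_stem(seq, stem=6, min_loop=3):
    """True if a stem-length segment has its reverse complement downstream."""
    seq = seq.upper()
    for i in range(len(seq) - stem + 1):
        arm = seq[i:i + stem]
        partner = arm.translate(COMPLEMENT)[::-1]  # reverse complement of the arm
        # The partner must start after the arm plus a minimal loop.
        if seq.find(partner, i + stem + min_loop) != -1:
            return True
    return False

def passes_basic_checks(seq, max_run=4, stem=6):
    """Crude screen: no long single-nucleotide run and no perfect hairpin stem."""
    return longest_homopolymer(seq) <= max_run and not has_hairpin_stem(seq, stem)

print(passes_basic_checks("ACGTGCTTTTGCACGT"))  # arms ACGTGC / GCACGT can pair
print(passes_basic_checks("ACGTTTTTTACG"))      # run of six T
```

Real design tools score melting temperature and folding energy rather than exact matches, so this screen would miss imperfect stems.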


Image Processing

The aim of image processing is to locate and quantify hybridization spots and to separate true hybridization signals from background noise.

The background noise and artifacts produced in this step include nonspecific hybridization, unevenness of the slide surface, and the presence of contaminants such as dust on the surface of the slide.

Computer programs are used to correctly locate the boundaries of the spots and measure the intensities of the spot images after subtracting the background pixels.

After subtracting the background noise, the array signals are converted into numbers and reported as ratios between Cy5 and Cy3 for each spot.
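As a minimal sketch of the last two steps, the Python below subtracts background from each channel and reports the Cy5/Cy3 ratio for one spot. It assumes the spot boundaries have already been found, uses the median as the summary of pixel values, and floors the corrected signal at a small positive value to avoid division by zero. All three choices, and the function names, are assumptions rather than the method of any particular program.

```python
from statistics import median

def spot_signal(foreground_pixels, background_pixels, floor=1.0):
    """Background-corrected spot intensity, never lower than `floor`."""
    corrected = median(foreground_pixels) - median(background_pixels)
    return max(corrected, floor)

def cy5_cy3_ratio(cy5_fg, cy5_bg, cy3_fg, cy3_bg):
    """Ratio of background-corrected Cy5 signal to Cy3 signal for one spot."""
    return spot_signal(cy5_fg, cy5_bg) / spot_signal(cy3_fg, cy3_bg)

# Hypothetical pixel intensities for one spot and its surrounding background.
ratio = cy5_cy3_ratio(
    cy5_fg=[900, 1000, 1100], cy5_bg=[100, 90, 110],
    cy3_fg=[500, 550, 600], cy3_bg=[100, 100, 100],
)
print(ratio)  # (1000 - 100) / (550 - 100) = 2.0
```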


• ArrayDB (http://genome.nhgri.nih.gov/arraydb/)

• ScanAlyze (http://rana.lbl.gov/EisenSoftware.htm)

• TIGR Spotfinder (http://www.tigr.org/softlab/)

These are Windows programs for microarray image processing using the TIFF image format.


Data Transformation and Normalization

Following image processing, the digitized gene expression data need to be further processed before differentially expressed genes can be identified.

This processing is referred to as data normalization and is designed to correct bias owing to variations in microarray data collection rather than intrinsic biological differences.

When the raw fluorescence intensity of Cy5 is plotted against that of Cy3, most of the data are clustered near the bottom left of the plot, showing a non-normal distribution of the raw data. One way to improve the data discrimination is to transform the raw Cy5 and Cy3 values by taking the logarithm to base 2.


• The log2 transformation has the major advantage that it treats differential up-regulation and down-regulation equally, and also has a continuous mapping space.
• For example, if the expression ratio is 1, then log2(1) equals 0, which represents no change in expression. If the expression ratio is 4, then log2(4) equals +2, and for an expression ratio of 1/4, log2(1/4) equals -2.
• Thus, in this transformation the mapping space is continuous and up-regulation and down-regulation are comparable.

Normalization

• When one compares the expression levels of genes that should not change in the two conditions (say, housekeeping genes), one quite often finds that the average expression ratio of such genes deviates from 1.
• This may be due to various reasons, for example, variation caused by differential labelling efficiency of the two fluorescent dyes or different amounts of starting mRNA material in the two samples.
• Thus, in the case of microarray experiments, as for any large-scale experiment, there are many sources of systematic variation that affect measurements of gene expression levels.
• Normalization is the term used to describe the process of eliminating such variations to allow appropriate comparison of data obtained from the two samples.
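The log2 transformation and the housekeeping-gene idea can be combined in a short Python sketch. It assumes a set of genes known not to change (the hypothetical `hk1` and `hk2`) and simply shifts all log2 ratios so that the median of those genes becomes 0. This is one simple form of global normalization, offered as an illustration rather than as the procedure used in the lecture.

```python
import math
from statistics import median

def log2_ratios(ratios):
    """Transform expression ratios so that up- and down-regulation are symmetric."""
    return {gene: math.log2(r) for gene, r in ratios.items()}

def normalize_to_reference(log_ratios, reference_genes):
    """Shift all log2 ratios so the reference genes have a median of 0."""
    offset = median(log_ratios[g] for g in reference_genes)
    return {gene: value - offset for gene, value in log_ratios.items()}

# Hypothetical raw ratios: the housekeeping genes sit at 2 instead of 1,
# which suggests a systematic twofold bias between the two channels.
raw = {"hk1": 2.0, "hk2": 2.0, "up": 8.0, "down": 0.5}
normalized = normalize_to_reference(log2_ratios(raw), ["hk1", "hk2"])
print(normalized)
```

After the shift, the housekeeping genes sit at 0 and the two other genes at +2 and -2, so a fourfold increase and a fourfold decrease are the same distance from "no change".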


One method to normalize the data is the Lowess (locally weighted scatter plot smoother) regression method.

The following two freely available programs specialize in image analysis and data normalization:

• Arrayplot
• SNOMAD
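To give a feel for what Lowess normalization does, here is a simplified Python sketch. It fits a locally weighted straight line to the log ratio M = log2(Cy5/Cy3) as a function of the average log intensity A = 0.5·log2(Cy5·Cy3), then subtracts the fitted trend from M. The M and A notation, the tricube weights, and the `span` parameter are assumptions taken from common practice, not from the slides. The robustness iterations of the full Lowess procedure are omitted, so this is not a replacement for Arrayplot or SNOMAD.

```python
import math

def tricube(u):
    """Tricube kernel: weight 1 at the centre, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def local_linear_fit(x0, xs, ys, span):
    """Weighted straight-line fit at x0 using the nearest `span` fraction of points."""
    k = min(len(xs), max(2, math.ceil(span * len(xs))))
    bandwidth = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
    w = [tricube((x - x0) / bandwidth) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (x0 - mx)

def lowess_normalize(cy5, cy3, span=0.5):
    """Return log2 ratios with the intensity-dependent trend removed."""
    m = [math.log2(r / g) for r, g in zip(cy5, cy3)]
    a = [0.5 * math.log2(r * g) for r, g in zip(cy5, cy3)]
    return [mi - local_linear_fit(ai, a, m, span) for mi, ai in zip(m, a)]

# Hypothetical spots with a constant twofold dye bias and no real change.
cy3 = [100, 200, 400, 800, 1600, 3200]
cy5 = [2 * g for g in cy3]
print([round(v, 6) for v in lowess_normalize(cy5, cy3)])
```

In this toy example every normalized value comes out at (approximately) zero, because the only signal in the data is the dye bias.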


• One of the reasons to carry out a microarray experiment is to monitor the expression level of genes at a genome scale.
• The processed data, after the normalization procedure, can be represented in the form of a matrix, often called the gene expression matrix. Each row in the matrix corresponds to a particular gene, and each column corresponds either to an experimental condition or to a specific time point at which expression of the genes has been measured.
• Once we have obtained the gene expression matrix, additional levels of annotation can be added either to the genes or to the samples. For example, the function of the genes can be provided, or additional details on the biology of the sample may be provided, such as ‘disease state’ or ‘normal state’.
• Depending on whether the annotation is used or not, analysis of gene expression data can be classified into two different types:
  • Supervised learning: we use the annotation of either the genes or the samples to create clusters of genes or samples and identify patterns that are characteristic of each cluster.
  • Unsupervised learning: the expression data are analysed to identify patterns that can group genes or samples into clusters without the use of any form of annotation. For example, genes with similar expression profiles can be clustered together without the use of any annotation.
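A gene expression matrix with sample annotation can be held in very simple Python structures. The gene names, sample names, and values below are hypothetical, and the helper merely averages each gene within the annotated groups, which is the kind of use of annotation that separates supervised from unsupervised analysis.

```python
# Columns of the matrix: one entry per sample, in a fixed order.
samples = ["s1", "s2", "s3", "s4"]

# Sample annotation (hypothetical labels).
sample_annotation = {
    "s1": "normal state", "s2": "normal state",
    "s3": "disease state", "s4": "disease state",
}

# Rows of the matrix: normalized log2 ratios for each gene.
expression_matrix = {
    "geneA": [0.1, -0.1, 2.1, 1.9],
    "geneB": [0.0, 0.2, -0.1, 0.1],
}

def mean_by_annotation(values, samples, annotation):
    """Average one gene's values within each annotated sample group."""
    groups = {}
    for sample, value in zip(samples, values):
        groups.setdefault(annotation[sample], []).append(value)
    return {label: sum(v) / len(v) for label, v in groups.items()}

for gene, row in expression_matrix.items():
    print(gene, mean_by_annotation(row, samples, sample_annotation))
```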


Statistical Analysis to Identify Differentially Expressed Genes

• To separate genes that are differentially expressed, a twofold cutoff in the normalized expression ratio is commonly used as a criterion.
• But a data point above or below the cutoff line could simply be there by chance or because of error.
• The only way to ensure that a gene that appears to be differentially expressed is truly differentially expressed is to perform multiple replicate experiments and to perform statistical testing.
• The repeat experiments provide replicate data points that offer information about the variability of the expression data at a particular condition.
• The main hindrance to obtaining multiple replicate datasets is often the cost: microarray experiments are extremely expensive for regular research laboratories.
• Two tests are used for the statistical analysis: ANOVA (analysis of variance) and the t-test.
• Software: MA-ANOVA, Cyber-T
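The combination of a fold-change cutoff with a test on replicates can be sketched in Python as follows. The sketch computes a one-sample t statistic on replicate log2 ratios (testing against 0, i.e. no change) and keeps genes that pass both criteria. The `t_cutoff` value is a hypothetical placeholder: a real analysis would convert t to a p-value with n − 1 degrees of freedom and correct for testing many genes at once, which tools such as MA-ANOVA and Cyber-T handle in their own ways.

```python
import math
from statistics import mean, stdev

def one_sample_t(log_ratios):
    """t statistic for the mean log2 ratio differing from 0 (needs >= 2 replicates)."""
    m = mean(log_ratios)
    standard_error = stdev(log_ratios) / math.sqrt(len(log_ratios))
    if standard_error == 0:
        return math.copysign(math.inf, m) if m != 0 else 0.0
    return m / standard_error

def candidate_genes(replicates, fold=2.0, t_cutoff=4.0):
    """Genes whose mean change exceeds the fold cutoff and whose t statistic is large."""
    selected = []
    for gene, values in replicates.items():
        big_change = abs(mean(values)) >= math.log2(fold)
        consistent = abs(one_sample_t(values)) >= t_cutoff
        if big_change and consistent:
            selected.append(gene)
    return sorted(selected)

# Hypothetical replicate log2 ratios for three genes.
data = {
    "g1": [1.5, 1.7, 1.3],    # large and reproducible change
    "g2": [1.5, -1.4, 0.2],   # one large value, not reproducible
    "g3": [0.1, 0.2, 0.15],   # reproducible but small change
}
print(candidate_genes(data))
```

Gene `g2` shows why a single point beyond the twofold line is not enough: its replicates disagree, so it is not selected.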


Microarray Data Clustering

• One of the goals of microarray data analysis is to cluster genes or samples with similar expression profiles together, to make meaningful biological inference about the set of genes or samples.

• Similar expression patterns are often a result of the fact that the genes involved are in the same metabolic pathway and have similar functions.

• The genetic basis of the co-regulation could be common promoters and regulatory regions.


Clustering is one of the unsupervised approaches to classify data into groups of genes or samples with similar patterns that are characteristic of the group. Clustering methods can be hierarchical or non-hierarchical.

Hierarchical (grouping objects into clusters and specifying relationships among objects in a cluster, resembling a phylogenetic tree). This can be of two types:
• Agglomerative (starting with the assumption that each object is a cluster and grouping similar objects into bigger clusters)
• Divisive (starting from grouping all objects into one cluster and subsequently breaking the big cluster into smaller clusters with similar properties)
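The agglomerative strategy can be sketched in Python as below. Each gene starts as its own cluster, and the two closest clusters are merged repeatedly until one remains. The use of Euclidean distance and average linkage is an assumption made for the example (correlation-based distances are also common for expression profiles), and the gene names and profiles are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(members_a, members_b, profiles):
    """Mean distance over all pairs drawn from the two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in members_a for j in members_b]
    return sum(dists) / len(dists)

def agglomerate(profiles):
    """Return the sequence of merges as (cluster_a, cluster_b, distance)."""
    clusters = {name: [name] for name in profiles}  # every gene starts alone
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], profiles),
        )
        distance = average_linkage(clusters[a], clusters[b], profiles)
        merges.append((a, b, distance))
        # The merged cluster's name records the tree structure.
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)
    return merges

# Hypothetical expression profiles over two conditions.
profiles = {"g1": [0, 0], "g2": [0, 1], "g3": [10, 10], "g4": [10, 11]}
for step in agglomerate(profiles):
    print(step)
```

The nested names in the merge list, such as `((g1,g2),(g3,g4))`, are the tree-like relationships that distinguish hierarchical from non-hierarchical clustering.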

Non-hierarchical (grouping into clusters without specifying relationships between objects in a cluster). Non-hierarchical clustering requires predetermination of the number of clusters. It then groups existing objects into these predefined clusters rather than organizing them into a hierarchical structure.
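One widely used non-hierarchical method is k-means, which is named here as an example and is not mentioned on the slide. The Python sketch below takes the number of clusters `k` as an input, as the text requires. Using the first `k` points as starting centroids is an assumption made to keep the example deterministic; practical implementations usually start from random or specially chosen points.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]

def kmeans(points, k, iterations=20):
    """Group points into k predefined clusters; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # simplistic deterministic start
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centroids), centroids

# Hypothetical expression profiles over two conditions.
points = [[0, 0], [10, 10], [0, 1], [10, 11]]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

The output is a flat list of cluster labels with no tree relating the objects, in contrast to the hierarchical methods.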


Experimental Design for Microarrays

There are a number of important experimental design considerations for a microarray experiment:

• Technical vs biological replicates

• Amplification of RNA

• Dye swaps


Technical vs biological replicates

• Technical replicates are repeat hybridizations using the same RNA isolate

• Biological replicates use RNA isolated from separate experiments/experimental organisms

Although technical replicates can be useful for reducing variation due to hybridization, imaging, etc., biological replicates are necessary for a properly controlled experiment


Amplification of RNA

• Linear amplification methods can be used to increase the amount of RNA so that microarray experiments can be performed using very small numbers of cells. It is not clear to what degree amplification affects results, especially with respect to rare transcripts, but it appears to be generally acceptable if done correctly.


Dye swaps

When using 2-color arrays, it’s important to hybridize replicates using a dye-swap strategy in which the colors (labels) are reversed between the two replicates. This is because there can be biases in hybridization intensity due to which dye is used (even when the sequence is the same). Normally two dyes are used: Cy5 (red fluorescence, for infected/experimental samples) and Cy3 (green fluorescence, for control samples).
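A small Python sketch shows why a dye swap helps. If a dye bias adds the same amount to the log ratio on both arrays, taking half the difference of the two log ratios cancels it. The function name and the toy intensities are hypothetical, and the assumption of a purely additive bias on the log scale is a simplification.

```python
import math

def dye_swap_log_ratio(forward, swapped):
    """Combine a dye-swap pair of (Cy5, Cy3) intensities for one gene.

    forward: experimental sample labelled with Cy5, control with Cy3.
    swapped: control labelled with Cy5, experimental with Cy3.
    Returns the log2(experimental / control) estimate with a constant
    dye bias cancelled.
    """
    m_forward = math.log2(forward[0] / forward[1])  # +true change + bias
    m_swapped = math.log2(swapped[0] / swapped[1])  # -true change + bias
    return (m_forward - m_swapped) / 2

# Hypothetical gene: true fourfold induction (log2 = 2), and Cy5 reads
# twice as bright as Cy3 for the same amount of material.
print(dye_swap_log_ratio(forward=(800, 100), swapped=(200, 400)))  # 2.0
```

A single forward array would have reported a log2 ratio of 3 for this gene, overstating the change by the dye bias.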



Thanks