Chip arrays and gene expression data
Date posted: 15-Jan-2016

With chip array technology, one can measure the expression of 10,000 genes (~all genes) at once. This can answer questions such as:

1. Which genes are expressed in a muscle cell?

2. Which genes are expressed during the first week of pregnancy in the mother? In the new baby?

3. Which genes are expressed in cancer?


4. If one mutates a transcription factor (TF), which genes are no longer expressed following this change?

5. Which genes are not expressed in the brain of a baby with an intellectual disability?

6. Which genes are expressed when a person is asleep versus when the same person is awake?


DNA chip: each spot holds a specific, known DNA molecule (the probe). After hybridization with a labeled mRNA (or cDNA) molecule, the intensity of the hybridization can be quantified by measuring the emitted light.


Affymetrix: The base is a “wafer” (a thin crystalline semiconductor substrate).

The wafer is coated with a light-sensitive chemical compound that prevents coupling between the wafer and the first nucleotide of the DNA probe being created.


The blue “cap” is light-sensitive. A mask covers some of the cells. When the chip is illuminated, a reaction with a nucleotide can happen only in the cells that the light reaches.


The nucleotide that is added is itself chemically linked to a new light-sensitive “cap”.


The entire process is called photolithography
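
The capping, masking and illumination cycle described above can be sketched as a toy simulation. This is only an illustrative model, not Affymetrix's actual procedure: the function name `synthesize`, the fixed A, C, G, T order and the example target sequences are all assumptions made for the sketch.

```python
# Toy model of light-directed probe synthesis (illustrative assumptions only).
def synthesize(targets, max_len):
    """Grow one probe per chip cell, one nucleotide per illumination step."""
    probes = ["" for _ in targets]
    steps = 0
    for position in range(max_len):
        for base in "ACGT":
            # Mask for this step: True means light reaches the cell and
            # removes its cap, so the cell can accept the next nucleotide.
            unmasked = [
                len(p) == position and len(t) > position and t[position] == base
                for p, t in zip(probes, targets)
            ]
            if not any(unmasked):
                continue  # no cell needs this base at this position
            steps += 1
            # The added base carries a fresh cap, so every exposed cell
            # grows by exactly one nucleotide in this step.
            probes = [p + base if lit else p for p, lit in zip(probes, unmasked)]
    return probes, steps


targets = ["ACGT", "AGGT", "TCGA"]  # hypothetical probe sequences
probes, steps = synthesize(targets, max_len=4)
print(probes, steps)
```

In this model each position needs at most four illumination steps (one per base), regardless of how many cells are on the chip.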


Affymetrix: each probe is 25 nucleotides long and corresponds to part of an exon.

[Figures: the reader; the chip itself]

In one cm², there are more than 10^6 different oligos.


Affymetrix: each probe is 25 nucleotides long. Beyond this length there is a technological problem: the synthesis becomes inaccurate.

With such short probes, each mRNA can hybridize to more than one probe. The solution: each gene is “covered” by several probes.
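
One simple way to turn several probe readings into a single value per gene is to take their median, which is less sensitive to a single cross-hybridizing probe than the mean. The sketch below assumes this summary rule purely for illustration; it is not the algorithm used by Affymetrix software, and the gene names and intensities are made up.

```python
from statistics import median

# Hypothetical probe intensities: gene -> readings from the probes covering it.
probe_intensities = {
    "gene_A": [120.0, 135.0, 128.0, 900.0],  # last probe cross-hybridizes
    "gene_B": [15.0, 22.0, 18.0, 20.0],
}


def summarize(probe_sets):
    """Collapse each gene's probe readings into one robust value."""
    return {gene: median(values) for gene, values in probe_sets.items()}


summary = summarize(probe_intensities)
print(summary)
```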


Affymetrix: one can buy ready-made chips (human genome, mouse genome) or design (“print”) a custom chip, which is more expensive.


Detection: mRNA is isolated from the sample (tissue, cells, viruses). cDNA is synthesized and fluorescently labeled. Sometimes the cDNA is amplified using PCR. The intensity in each cell (probe) is measured by “the reader”.


Agilent developed DNA printers: in each spot, picoliters of nucleotides are deposited. They can make probes of up to 60-mers, built with standard phosphoramidite chemistry. (Agilent was spun off from Hewlett-Packard.)


Hybridization to the longer Agilent probes is more accurate.

If there is hybridization to a probe, the gene it represents is probably expressed.


However, it is impossible to know how many probe molecules are in each cell, so absolute fluorescence intensities are meaningless.


Solution: in the same experiment, hybridize samples from two conditions, for example mRNA from healthy cells (labeled red) versus mRNA from tumor cells (labeled green).

The Agilent reader will give the ratio of the two colors.
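
A minimal sketch of the two-color ratio follows. Reporting the ratio on a log2 scale is a common convention that is assumed here, not something the slides state; the spot intensities, the `floor` parameter and the function name `log2_ratio` are hypothetical.

```python
import math


def log2_ratio(tumor, healthy, floor=1.0):
    """Log2 of tumor (green) over healthy (red) intensity for one spot.

    `floor` is an assumed guard that keeps very low readings from
    producing a division by zero or a logarithm of zero.
    """
    return math.log2(max(tumor, floor) / max(healthy, floor))


# Hypothetical (tumor, healthy) intensities per spot.
spots = {"g1": (800.0, 200.0), "g2": (150.0, 150.0), "g3": (50.0, 400.0)}
ratios = {gene: log2_ratio(t, h) for gene, (t, h) in spots.items()}
print(ratios)
```

With this convention a positive value means higher expression in the tumor sample, a negative value means lower expression, and zero means no change, independent of how many probe molecules sit in the spot.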


Stanford cDNA chips

In this approach, long cDNA sequences (>300 bp) are produced in cells (as clones) and then attached to each chip cell. Producing long cDNAs this way is cheaper than synthesizing them one nucleotide at a time.

As in the case of Agilent, it is impossible to control the number of probes in each cell.


Output

               w.t    Brain tumor (males)    Brain tumor (females)
Gene 1
Gene 2
Gene 3
...
Gene 25,000

Each cell is either an absolute number or a relative one, depending on the technology used.


Repeats

               w.t    Brain tumor (male1)    Brain tumor (male2)    Brain tumor (female1)
Gene 1
Gene 2
Gene 3
...
Gene 25,000

A repeat can be either the same sample on a different chip, or a “real” biological repeat, that is, a different sample.


Expression profile

      wt1  wt2  wt3  wt4  bt1  bt2  bt3  bt4
g1      4    3    5    4   15   16   17   23
g2      7    5    4    6    6    3    7    9
g3      2    3    2    5   25   26   30   60

Genes 1 and 3 show the same trend (both go up under the same conditions). That is, they have the same expression profile.
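
One way to make “same expression profile” quantitative is the Pearson correlation between two genes' rows. The sketch below assumes that measure, since the slides do not name one. The numbers are the table values as read from the flattened slide text, so the exact split into columns is itself an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Columns: wt1..wt4, bt1..bt4 (values as reconstructed from the slide).
profiles = {
    "g1": [4, 3, 5, 4, 15, 16, 17, 23],
    "g2": [7, 5, 4, 6, 6, 3, 7, 9],
    "g3": [2, 3, 2, 5, 25, 26, 30, 60],
}

r_13 = pearson(profiles["g1"], profiles["g3"])
r_12 = pearson(profiles["g1"], profiles["g2"])
print(round(r_13, 2), round(r_12, 2))
```

Under these assumptions the correlation between g1 and g3 is close to 1, while the correlation between g1 and g2 is much weaker.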


Clustering

      wt1  wt2  wt3  wt4  bt1  bt2  bt3  bt4
g1      4    3    5    4   15   16   17   23
g2      7    5    4    6    6    3    7    9
g3      2    3    2    5   25   26   30   60

In general, we want to find all the genes that share the same expression profile, which suggests a functional link between them.

Clustering algorithms do exactly that.
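
As a rough illustration of what such an algorithm does, the sketch below groups genes whose profiles correlate above a threshold (a simple single-linkage scheme). The choice of method, the 0.8 threshold and the function names are assumptions made for the example; real analyses typically use hierarchical clustering or k-means from a statistics library. The values are the table as read from the flattened slide text.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def cluster_by_correlation(profiles, threshold=0.8):
    """Merge genes into one cluster whenever a pair correlates >= threshold."""
    cluster_of = {gene: {gene} for gene in profiles}
    for a, b in combinations(profiles, 2):
        similar = pearson(profiles[a], profiles[b]) >= threshold
        if similar and cluster_of[a] is not cluster_of[b]:
            merged = cluster_of[a] | cluster_of[b]
            for gene in merged:
                cluster_of[gene] = merged
    unique = {id(c): c for c in cluster_of.values()}
    return sorted(sorted(c) for c in unique.values())


profiles = {
    "g1": [4, 3, 5, 4, 15, 16, 17, 23],
    "g2": [7, 5, 4, 6, 6, 3, 7, 9],
    "g3": [2, 3, 2, 5, 25, 26, 30, 60],
}
clusters = cluster_by_correlation(profiles)
print(clusters)
```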


Clustering

      wt1  wt2  wt3  wt4  bt1  bt2  bt3  bt4
g1      4    3    5    4    0   22    0   23
g2      7    5    4    6    0    8    0    9
g3      2    3    2    5   60    1   66    1

Clustering of the conditions can suggest two types of brain tumor (bt).

Bi-clustering: clustering both the conditions and the genes.
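
Clustering the conditions means treating each column (sample) as a point and grouping nearby points. The sketch below uses Euclidean distance with a simple single-linkage rule; the method, the distance cutoff of 10 and the function name are assumptions for illustration. The expression values are one plausible reading of the flattened slide table (the g3 row in particular is ambiguous), so treat them as hypothetical.

```python
from itertools import combinations
from math import dist

conditions = ["wt1", "wt2", "wt3", "wt4", "bt1", "bt2", "bt3", "bt4"]
genes = {
    "g1": [4, 3, 5, 4, 0, 22, 0, 23],
    "g2": [7, 5, 4, 6, 0, 8, 0, 9],
    "g3": [2, 3, 2, 5, 60, 1, 66, 1],  # assumed reading of the garbled row
}

# Transpose: one vector of gene values per condition.
columns = {c: [genes[g][i] for g in genes] for i, c in enumerate(conditions)}


def single_linkage(points, max_distance):
    """Put two samples in the same group if they are within max_distance."""
    group_of = {name: {name} for name in points}
    for a, b in combinations(points, 2):
        close = dist(points[a], points[b]) <= max_distance
        if close and group_of[a] is not group_of[b]:
            merged = group_of[a] | group_of[b]
            for name in merged:
                group_of[name] = merged
    unique = {id(g): g for g in group_of.values()}
    return sorted(sorted(g) for g in unique.values())


groups = single_linkage(columns, max_distance=10)
print(groups)
```

Under these assumed values the wild-type samples form one group and the tumor samples split into two, which is the kind of result that would hint at two tumor types.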


Applications

Think of growing E. coli at increasing glucose concentrations and running a chip array at each concentration.

One can potentially discover all genes in the glucose pathway.

Knocking out a gene can similarly reveal the genes that interact with it.


Applications

Analyzing expression of genes can help reveal the gene network of a given organism.


Gene network


Clinical

Does someone have a brain tumor? Expression measured in a new sample:

g1   11
g2    4
g3    0

Reference data:

      wt1  wt2  wt3  wt4  bt1  bt2  bt3  bt4
g1      4    3    5    4    0   22    0   23
g2      7    5    4    6    0    8    0    9
g3      2    3    2    5   60    1   66    1


Sequencing by hybridization

It was thought that the following procedure could work for sequencing a genome:

1. Make a chip containing all x-mers (e.g., x = 25).

2. Hybridize a genome to the chip.

3. By analyzing all the hybridizations with their overlaps, assemble the genome.

Problem: it doesn’t work.
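
The slides do not say why the procedure fails. Two commonly cited obstacles, offered here as an assumption about what is meant, are the sheer number of probes needed and the fact that repeated subsequences make the assembly ambiguous. The sketch below illustrates both with a tiny example using 3-mers; the function name `spectrum` is hypothetical.

```python
def spectrum(sequence, k):
    """Set of all k-mers that would light up on an all-k-mer chip."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


# Obstacle 1: a chip with every possible 25-mer needs 4**25 probes.
n_probes = 4 ** 25
print(n_probes)

# Obstacle 2: two different sequences can give identical hybridization
# patterns, so the chip alone cannot tell them apart.
seq_a = "ATGCGTGGCA"
seq_b = "ATGGCGTGCA"
same_pattern = spectrum(seq_a, 3) == spectrum(seq_b, 3)
print(seq_a != seq_b, same_pattern)
```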


ChIP-on-chip: a method for measuring protein-DNA interactions.

Proteins that bind DNA include:

• Those responsible for transcription regulation

• Transcription factors (TFs)

• Replication proteins

• Histones…


ChIP-on-chip: the first “ChIP” stands for Chromatin ImmunoPrecipitation, and the second “chip” refers to the DNA microarray.

The method is used mostly to detect TF binding sites.


Tiling arrays

Here the chip array should include not only protein-coding genes but also control regions, or simply the entire genome.
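
Covering a whole region with probes amounts to sliding a fixed-length window along the sequence. The sketch below shows the idea; the probe length, the step size, the example sequence and the function name `tile` are all hypothetical choices, not parameters of any commercial array.

```python
def tile(sequence, probe_len, step):
    """Return (start, probe) pairs covering the sequence at regular steps."""
    last_start = len(sequence) - probe_len
    return [
        (start, sequence[start:start + probe_len])
        for start in range(0, last_start + 1, step)
    ]


region = "ATGCGTACGTTAGCCGATAGCTAGGCT"  # hypothetical genomic region
probes = tile(region, probe_len=10, step=5)
for start, probe in probes:
    print(start, probe)
```

With a step smaller than the probe length, consecutive probes overlap; in this sketch any trailing bases shorter than one full probe are left uncovered.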


Protein-Protein interaction


Some facts:

• Human genome: 20,000-30,000 genes, more than 500,000 proteins. At a given time, about 10,000 proteins are present in a cell (the proteome).

• An estimated >80% of proteins interact with other proteins.

• The network includes hubs (proteins with many interaction partners).
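
A hub can be found by counting each protein's interaction partners. The sketch below does this for a small made-up interaction list; the protein names, the cutoff of three partners and the function names are assumptions for illustration.

```python
from collections import Counter

# Hypothetical interaction pairs (undirected).
interactions = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "C"), ("E", "F"),
]


def degrees(pairs):
    """Number of interaction partners per protein."""
    degree = Counter()
    for x, y in pairs:
        degree[x] += 1
        degree[y] += 1
    return degree


def hubs(pairs, min_partners):
    """Proteins with at least min_partners interactions."""
    return sorted(p for p, d in degrees(pairs).items() if d >= min_partners)


print(degrees(interactions))
print(hubs(interactions, min_partners=3))
```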


Large scale studies of protein-protein interactions (PPIs) give very noisy data:

40-80% of interactions are false negatives (true interactions that are unidentified).

30-60% of interactions are false positives (interactions that are inferred but are not real).
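
The slides do not say how these percentages are computed. A common convention, assumed in the sketch below, is to compare the reported interactions with a trusted reference set: the false-positive fraction is taken over the reported interactions and the false-negative fraction over the true ones. All interaction pairs here are made up.

```python
def error_rates(reported, true):
    """Return (false-positive fraction, false-negative fraction)."""
    # frozenset makes ("A", "B") and ("B", "A") the same interaction.
    reported = set(map(frozenset, reported))
    true = set(map(frozenset, true))
    false_positives = reported - true   # inferred but not real
    false_negatives = true - reported   # real but not identified
    return len(false_positives) / len(reported), len(false_negatives) / len(true)


true_pairs = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"), ("D", "E")]
reported_pairs = [("B", "A"), ("A", "C"), ("A", "F"), ("B", "F")]

fp_rate, fn_rate = error_rates(reported_pairs, true_pairs)
print(fp_rate, fn_rate)
```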


Method 1: affinity tag purification of complexes in vivo.

Say we want to know what interacts with protein X. We construct a plasmid with the gene coding for X (filled box in the figure) fused to a bait (empty box).


In the cell, protein X fused to the bait is expressed, and interacts with some proteins.

The cells are lysed and the protein complex is isolated using a solid support linked to a ligand that can interact with the bait.


Bound proteins are eluted, separated on a gel and identified using mass spectrometry (MS).

The method is biased towards proteins of high abundance.


Method 2: yeast two-hybrid system.

Some transcription factors are composed of two domains: a binding domain (BD), which binds the DNA, and an activation domain (AD, in red), which activates transcription. The two domains must be brought together for the gene to be expressed.


To check whether protein A (bait) interacts with protein B (prey), protein A is expressed fused to BD, and protein B fused to AD. The reporter gene is expressed only if A and B interact.


Databases of protein-protein interactions:

DIP, IntAct, MINT, MIPS, iHOP


Protein-protein interactions are fundamental for functional annotation.

If X interacts with Y, and Y is known to be related to muscle development, then X may also be related to muscle development.

“Guilt by association”
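
The idea can be sketched as a majority vote over the known functions of a protein's interaction partners. This is a deliberately simple illustration, not a published annotation method; the interaction list, the annotations and the function name `predict_function` are hypothetical.

```python
from collections import Counter

# Hypothetical interactions and known annotations.
interactions = [("X", "Y"), ("X", "Z"), ("X", "W"), ("Y", "Z")]
annotations = {
    "Y": "muscle development",
    "Z": "muscle development",
    "W": "DNA repair",
}


def predict_function(protein, pairs, known):
    """Guess a function from the most common annotation among partners."""
    votes = Counter()
    for a, b in pairs:
        if protein in (a, b):
            partner = b if a == protein else a
            if partner in known:
                votes[known[partner]] += 1
    # No annotated partners means no prediction.
    return votes.most_common(1)[0][0] if votes else None


print(predict_function("X", interactions, annotations))
```

Because large-scale interaction data are noisy, a prediction of this kind is a hypothesis to test rather than a confirmed annotation.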