New Bioprospecting of Genes and Allele Mining: Approaches and … · 2012. 3. 28. ·...

Bioprospecting of Genes and Allele Mining: Approaches and Opportunities T. Mohapatra Central Rice Research Institute, Cuttack, Odisha


Prospecting the Biological Resources: Bio-prospecting

• It is a systematic search for whole organisms, genes and natural compounds in the living world for useful purposes.

• It is nothing new. Informal bio-prospecting began when prehistoric people noticed that one plant root tasted better than another, or some plants could be used as medicines to treat various human diseases.

Bio-prospecting

• Scientific bio-prospecting started later to identify the active ingredients present in different organisms and isolate or replicate them for large-scale use.

• Alexander Fleming’s discovery of the antibiotic penicillin is an example of bio-prospecting that happened accidentally.

Scottish-born microbiologist

Penicillium notatum

Bio-prospecting of Genes

• Genes are the functional hereditary units in our chromosomes.

• Genes control the expression of different characteristics, for example dwarf height, early flowering, disease resistance, heat tolerance and high yield.

• Genes for dwarf plant height in rice and wheat were also associated with higher crop yield, which ushered in the “Green Revolution”.

• Bio-prospecting of genes aims at discovery of novel genes in biological resources

Bio-prospecting of Genes: Bt Example

• Identification of the soil bacterium Bacillus thuringiensis (Bt) as having insecticidal properties, and the subsequent isolation of the gene responsible for this characteristic from the bacterium, is a bright example of bio-prospecting of extremely useful genes.

• A revolution in cotton production in the country has been possible with deployment of the Bt gene, which works effectively against the cotton bollworm.

• The technology is highly remunerative to farmers and environmentally friendly.

Genes prospected from diverse biological resources for tolerance to abiotic stresses

Stress                  Gene/Enzyme                                         Source

Osmotic                 Delta-pyrroline-5-carboxylate synthetase (P5CS)     Mothbean (V. aconitifolia)

Drought and Salinity    Mannitol-1-phosphate dehydrogenase (mtlD)           E. coli

Cold and Salt           Choline oxidase (codA)                              Arthrobacter globiformis

Salt                    Choline dehydrogenase (betA)                        E. coli

Cold                    Omega-3-fatty acid desaturase (fad7)                Arabidopsis

Drought                 Trehalose-6-phosphate synthase                      Yeast

Drought                 Levan sucrase (SacB)                                Bacillus subtilis

Tools of biotechnology have rendered the traditional barriers to gene flow less relevant

Bio-prospecting of Genes: The Approaches

• Using heterologous probes or primers

• Using purified protein

– Antibody to screen cDNA library

– Degenerate oligos to screen library

• Transcriptome profiling

• Insertional mutagenesis

• Map-based cloning

• Integrated approach

Transcriptome Profiling

• Differential Display RT-PCR

• cDNA AFLP

• Representational Difference Analysis

• Suppression Subtractive Hybridization

• Microarrays

• Serial Analysis of Gene Expression

• Massively Parallel Sequencing

Suppression Subtractive Hybridization

• Selectively amplifies target cDNA fragments (differentially expressed) and simultaneously suppresses non-target DNA amplification

• Based on the suppression PCR effect

• Normalization is included

• No need to physically separate single stranded and double stranded DNAs

SSH

• Can detect as little as 0.001% target

• Critical factor is relative concentration of target in tester and driver populations

• Effective enrichment when:

– Target present at ≥ 0.01%

– Concentration ratio ≥ 5-fold
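The two enrichment conditions listed above can be written as a simple check. This is a minimal sketch: the function name and argument names are hypothetical, and the thresholds are simply the ones given on the slide.

```python
def ssh_enrichment_expected(target_percent, fold_ratio):
    """Return True when both slide conditions for effective SSH enrichment hold.

    target_percent: abundance of the target cDNA in the tester, in percent.
    fold_ratio: fold difference in target concentration between the
                tester and driver populations.
    """
    MIN_TARGET_PERCENT = 0.01  # target present at >= 0.01%
    MIN_FOLD_RATIO = 5.0       # concentration ratio >= 5-fold
    return target_percent >= MIN_TARGET_PERCENT and fold_ratio >= MIN_FOLD_RATIO


print(ssh_enrichment_expected(0.05, 8))   # True
print(ssh_enrichment_expected(0.001, 8))  # False
```

The second call uses the 0.001% detection limit: such a target may still be detected, but it falls below the abundance at which enrichment is described as effective.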

Identification of a testis specific gene

Use of DNA Microarray

• Array design: choice of sequences to be used as probes

• Analysis of scanned images

– Spot detection, normalization, quantitation

• Primary analysis of hybridization data

– Basic statistics, reproducibility, data scattering, etc.

• Comparison of multiple samples

– Clustering, classification …

• Sample tracking and databasing of results
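The normalization and primary-analysis steps above can be illustrated for a two-channel comparative hybridization. This is only a sketch under stated assumptions: global median centring of log2 ratios is one common normalization choice, not necessarily the one used here, and the spot intensities, gene names and the 2-fold cut-off are hypothetical.

```python
import math
from statistics import median


def normalized_log_ratios(test, reference):
    """Median-centred log2(test/reference) ratios for paired spot intensities."""
    if len(test) != len(reference):
        raise ValueError("both channels need one intensity per spot")
    raw = [math.log2(t / r) for t, r in zip(test, reference)]
    centre = median(raw)  # assumes most genes are not differentially expressed
    return [value - centre for value in raw]


def differential_spots(gene_ids, ratios, min_abs_log2=1.0):
    """Genes whose normalized ratio is at least 2-fold up or down."""
    return [g for g, r in zip(gene_ids, ratios) if abs(r) >= min_abs_log2]


# Hypothetical intensities for five spots (stress sample vs control sample).
genes = ["g1", "g2", "g3", "g4", "g5"]
stress = [200, 400, 1600, 100, 210]
control = [100, 200, 200, 200, 100]

ratios = normalized_log_ratios(stress, control)
print(differential_spots(genes, ratios))  # ['g3', 'g4']
```

Clustering and classification across multiple samples would then operate on vectors of such normalized ratios.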

Comparative hybridization Bioinformatics

Transcriptome Profiling Using Illumina Solexa Genome Analyzer NGS Platform

Cluster generation by bridge PCR and sequencing by synthesis using a polymerase

Generation of 75 to 100 bp short sequence reads

Metzker et al. (2009) Nat. Rev.

Prospecting Genes Using Insertion Tags

Figure: Gene X (P, coding region); Gene X with insert; chromosome with genes including the one with the insert.

• Identify and clone the fragment carrying the insert, e.g.:

– Make a library and probe with the insertion sequences

– Ligate DNA into circles and amplify using divergent insert primers (inverse PCR)

• Inverse PCR: digest genomic DNA with restriction endonuclease, ligate, amplify by PCR, clone and sequence

• Use sequences from gene X to identify and clone the wild type allele
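The inverse PCR route (digest, ligate into circles, amplify with divergent insert primers) can be mimicked with plain string operations. This is a toy sketch, not a laboratory protocol: the function name, the sequences and the cutting model are assumptions made for illustration.

```python
def inverse_pcr_product(genome, insert, site):
    """Return the genomic sequence recovered by inverse PCR around `insert`.

    Simplifications (assumptions of this sketch): the enzyme is modelled as
    cutting immediately before each occurrence of `site`, the insert itself
    contains no site, and primer sequences are left out of the product.
    """
    start = genome.find(insert)
    if start < 0:
        raise ValueError("insert not found in genome")
    if site in insert:
        raise ValueError("restriction site occurs inside the insert")
    end = start + len(insert)

    # Digest: find the nearest cut site on each side of the insert.
    left_cut = genome.rfind(site, 0, start)
    right_cut = genome.find(site, end)
    if left_cut < 0 or right_cut < 0:
        raise ValueError("insert is not flanked by restriction sites")

    left_flank = genome[left_cut:start]
    right_flank = genome[end:right_cut]

    # Ligate: joining the fragment ends makes the right flank run into the
    # left flank. Amplify: divergent primers in the insert read outwards
    # around the circle, so the product carries both flanks.
    return right_flank + left_flank


# Hypothetical sequences: an 8 bp "insert" between two GAATTC sites.
genome = "TTGAATTCACGT" + "GGGGCCCC" + "TGCAGAATTCAA"
print(inverse_pcr_product(genome, "GGGGCCCC", "GAATTC"))  # TGCAGAATTCACGT
```

The recovered flanks are the sequences of gene X that can then be used to identify and clone the wild type allele.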

Gene Prospecting by Chromosome Landing

Figure: chromosome landing. Labels: markers RG16 and RG2; markers RM5897 and RM2634; gene; artificial chromosome (150 kb).

Cloning of GW2 QTL for Grain Weight

FAZ1 (indica) × WY3 (japonica)

• Mapped several QTL for grain weight or size, including width, length, thickness and 1,000-grain weight.

• This included a major QTL for grain width, GW2, on chromosome 2 with the WY3 allele at GW2 contributing to increased grain width.

Figure: parental lines with test weights of 41.9 ± 1.3 g and 17.9 ± 0.7 g; F1 and F2 generations; GW2 shown with markers RM5897 and RM2634.

High Resolution Mapping in BC3F2

• 6013 individuals phenotyped and genotyped

Structure of GW2

425 vs 310 aa long polypeptides

Mining is the extraction of valuable minerals or other geological materials from the earth

Mining in a wider sense comprises extraction of any non-renewable resource (e.g., petroleum, natural gas, or even water)

What is an Allele?

• Allele comes from the Greek allelos, meaning each other.

• The word is a short form of allelomorph ('other form'), which is used to describe variant forms of a gene detected as different phenotypes.

• For example, the gene locus for ABO blood type proteins in humans, or eye colour in the fruit fly.

• Alleles are now understood to be alternative DNA sequences at the same physical locus, which may or may not result in different phenotypic traits.

Microsatellite Alleles

Figure: genomic DNA of three varieties, PCR amplification of a GA microsatellite, and the resulting alleles (lanes 1, 2, 3).

Variety 1    (GA)12

Variety 2    (GA)16

Variety 3    (GA)8
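The three microsatellite alleles differ only in the number of GA units, which is what the PCR products resolve by size. A minimal sketch of counting the repeat units is given below; the flanking sequences and the function name are hypothetical, and only the repeat numbers 12, 16 and 8 come from the slide.

```python
import re


def longest_repeat_count(sequence, motif="GA"):
    """Number of motif copies in the longest uninterrupted run of the motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical flanking sequences; the same primers would amplify all varieties.
FLANK_5, FLANK_3 = "ACGTTC", "TTCAGC"
amplicons = {
    "Variety 1": FLANK_5 + "GA" * 12 + FLANK_3,
    "Variety 2": FLANK_5 + "GA" * 16 + FLANK_3,
    "Variety 3": FLANK_5 + "GA" * 8 + FLANK_3,
}

for variety, seq in amplicons.items():
    print(variety, longest_repeat_count(seq), "repeats,", len(seq), "bp amplicon")
```

Each additional GA unit lengthens the amplicon by 2 bp, so the allele with 16 repeats gives the largest product and the one with 8 repeats the smallest.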

Alleles at the Sequence Level

Figure: a known gene that is being used in breeding, and the different alleles (Allele 1 to Allele 7) present in different germplasm lines, shown with their nucleotide differences at variant positions.

Alternate forms of a gene - Alleles
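Alleles at the sequence level can be compared by listing the aligned positions at which they differ. The sketch below assumes the allele sequences are already aligned and of equal length; the three short sequences and the function name are hypothetical and are not the ones drawn on the slide.

```python
def polymorphic_sites(alleles):
    """Map each variable position (1-based) to the base carried by each allele.

    `alleles` is a dict of name -> aligned sequence; all sequences must have
    the same length.
    """
    names = list(alleles)
    seqs = [alleles[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must be aligned to the same length")

    sites = {}
    for pos in range(length):
        column = [seq[pos] for seq in seqs]
        if len(set(column)) > 1:  # more than one base seen at this position
            sites[pos + 1] = dict(zip(names, column))
    return sites


# Hypothetical aligned fragments of one gene from three germplasm lines.
alleles = {
    "Allele 1": "ATGCCATTGAT",
    "Allele 2": "ATGGCATTGAT",
    "Allele 3": "ATGCCATTACT",
}
for position, bases in polymorphic_sites(alleles).items():
    print(position, bases)
```

For these made-up sequences the variable positions are 4, 9 and 10; whether such differences change the phenotype has to be tested separately.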

Prospecting New Alleles: Allele Mining

Figure: yield under stress of germplasm lines carrying Allele 1, Allele 2, Allele 7, Allele 6 and Allele 3 of the gene.

Allele 3, present in one of the germplasm lines, is the best; it can be identified through association mapping.
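The idea of linking an allele to yield under stress can be sketched as a comparison of phenotype means across allele classes, with a one-way F statistic as a crude test. All yield values below are hypothetical, and real association mapping would also correct for population structure and kinship, which this sketch ignores.

```python
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA F statistic for a dict of allele -> phenotype values."""
    values = [v for group in groups.values() for v in group]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical yield under stress (t/ha) of lines carrying each allele.
yield_by_allele = {
    "Allele 1": [2.1, 2.3, 2.2],
    "Allele 2": [2.0, 2.4, 2.2],
    "Allele 7": [2.5, 2.6, 2.4],
    "Allele 6": [2.3, 2.2, 2.4],
    "Allele 3": [3.4, 3.6, 3.5],
}

best = max(yield_by_allele, key=lambda name: mean(yield_by_allele[name]))
print("Best allele:", best)
print("F statistic:", round(one_way_f(yield_by_allele), 1))
```

A large F statistic indicates that allele class explains much of the variation in yield; the allele with the highest mean is then the candidate for use in breeding.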

Figure: resistant landrace with a new allele for powdery mildew resistance; resistant landrace with the new allele silenced shows powdery mildew symptoms.

• A core collection of 1320 wheat landraces screened for powdery mildew reaction

• 211 landraces showed complete or partial resistance

• 111 landraces had Pm3 gene specific marker

• Sequencing, sequence analysis and functional validation by gene silencing were done to identify new alleles

• 7 new functionally active alleles of Pm3 gene identified

TILLING (Targeting Induced Local Lesions In Genomes): An Approach for Detecting New Alleles

LI-COR Model 4300 DNA Analyzer for SNP detection

Allele Mining Using Next Generation Sequencing (NGS) Platforms

Roche 454 Pyrosequencer

Illumina Solexa Genome Analyzer

ABI SOLiD

Crop species                         SNPs mined    NGS platforms                Ref

Rice (Nipponbare vs. 93-11)          1,226,791     Solexa                       Huang et al. (2009)

Rice (Nipponbare vs. Koshihikari)    67,051        Solexa                       Yamamoto et al. (2010)

Arabidopsis                          823,325       Solexa                       Ossowski et al. (2008)

Chickpea                             ~1000         454, Solexa and ABI SOLiD    Unpublished

Eucalyptus                           23,742        454                          Novaes et al. (2008)

Wheat                                ~1000         454                          Cronn et al. (2008)

Whole Genome Resequencing

Targeted Resequencing

Transcriptome Sequencing

De novo Sequencing

Mining SNPs among even thousands of related genotypes, and within and between larger germplasm pools more efficiently & economically in greater depth

Whole Genome Resequencing Using ABI SOLiD

Isolated high quality column purified genomic DNA with total quantity of 30-40 µg from three rice varieties

Mate-paired Library Preparation

Emulsion PCR & Bead Enrichment

Bead Deposition Sequencing by ligation

Dual Interrogation of each base

Characteristics          Basmati370    IRBB60        Taraori

Total mappable reads     63,867,955    58,993,349    69,938,730

% mapping                49.78         53.27         56.0

Sequence coverage (x)    7.43          6.86          8.13
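The coverage row is consistent with the usual estimate, coverage = mappable reads × read length / genome size. The sketch below reproduces the three values under two assumptions that are not stated on the slide: a read length of 50 bp and a rice genome size of 430 Mb.

```python
READ_LENGTH_BP = 50           # assumption, not given in the source
GENOME_SIZE_BP = 430_000_000  # assumption, not given in the source

mappable_reads = {
    "Basmati370": 63_867_955,
    "IRBB60": 58_993_349,
    "Taraori": 69_938_730,
}


def coverage(n_reads, read_length=READ_LENGTH_BP, genome_size=GENOME_SIZE_BP):
    """Average sequencing depth: total mapped bases divided by genome size."""
    return n_reads * read_length / genome_size


for variety, n_reads in mappable_reads.items():
    print(f"{variety}: {coverage(n_reads):.2f}x")
```

With these assumed values the script prints 7.43x, 6.86x and 8.13x, matching the table; a different read length or reference size would change the estimates proportionally.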

Used Bioscope v1.2 for mapping, pairing and SNP detection

Reference Genome: Rice Pseudomolecule 6.1

Illumina Infinium Assay for Allele Mining

Staining

Imaging

Population Genetic Structure Among Domesticated Rice Genotypes

Figure: population groups Aromatics, indica, japonica, Aus and Wild.

Requirements for Large Scale Bio-prospecting and Allele Mining

• Ensuring the richness of bio-diversity

• Consortium mode of operation

• Pooling of expertise available in different institutions

• Detailed characterisation of the germplasm lines

• Use of modern tools of genomics

• Adequate funding support and infrastructure

BIOPROSPECTING OF GENES AND ALLELE MINING FOR ABIOTIC STRESS TOLERANCE

Consortium Leader: NRC on Plant Biotechnology, Pusa Campus, New Delhi-110012

Consortium Partners: 36 partner institutes

Crop Sciences, Horticultural Sciences, Animal Sciences, Fisheries & Microbial Sciences

RICE GROWING AREAS IN INDIA

Rice accounts for about 42% of total food grain production and >55% of diet in India

Rice is considered to have originated in the Himalayan foot-hills

It is cultivated below sea level in the Kuttanad region of Kerala state as well as at an altitude of 2000 meters above mean sea level in the hills of Jammu & Kashmir, Uttaranchal and the North-Eastern States.

Bio-prospecting of Genes and Allele Mining: Enormous Opportunities

• The microbial diversity thriving in the extreme heat of deserts, hot springs and even volcanic eruptions is even more striking

• They are the source of genes that can help crop plants to overcome a range of abiotic stresses

Bio-prospecting of Genes and Allele Mining: Enormous Opportunities

Animals such as the camel and goat possess unique adaptive mechanisms for abiotic stress tolerance that need to be understood and exploited.

Nubra valley in J&K, -30°C to +15°C; Rajasthan, 5°C to 50°C

Goat Biodiversity

Cold region (temperature range 20°C to -20°C): Gaddi, Chegu, Changthangi

Hot humid regions: Ganjam, Black Bengal, Malabari

Hot arid region: Marwari, Surti, Sirohi

Bio-prospecting of Genes and Allele Mining: Catfish, Trout, Shrimp and Prawn

Fish and shellfish species such as trout, catfish and shrimp/prawn possess unique adaptive mechanisms for tolerance to cold, anoxia and salinity.

Traits, Source Organisms and Institutions: Sharing of Responsibilities

Trait, source species and participating institutes

• Moisture stress

– Microbes: NBAIM, IARI, NRCG

– Rice: NRCPB, NBPGR, IARI, DRR, CRRI, IGKVV, DUSC, IIT-Kharagpur, VPKAS

– Maize: IARI, VPKAS, ICAR Research Complex for NEH Region, CCSHAU, GBPUAT

– Sorghum: NRCS, MPKV-Rahuri

– Lathyrus: NBPGR, IIT-Kharagpur

– Cucumis: NBPGR, IIVR, IIHR, RAU-Bikaner

– Ziziphus: CIAH, NRCPB, NBPGR

• Salinity/Sodicity and acidity

– Microbes: NBAIM, CMFRI, CIFT, CIFA, IARI, CIFRI, NRCG

– Rice: DRR, CRRI, CAU-Barapani, ICAR Research Complex for NEH Region, NRCPB, DUSC

– Shrimp: CIBA, CIFA, CIFE

• Temperature (Heat and Cold)

– Microbes: NBAIM, IARI, CIFA, NRCG

– Rice: CRRI, VPKAS, CAU-Barapani, ICAR Research Complex for NEH Region, IARI, NRCPB, DUSC

– Vigna spp.: NBPGR, RAU-Bikaner

– Goat: CIRG, NBAGR, SKUAST, IVRI, NDRI

– Camel: NBAGR, NRCC, SKUAST, IVRI, NDRI

– Trout fish: DCFR

• Submergence (Anoxia)

– Microbes: CRRI, CIFT, CMFRI

– Catfish: NBFGR, CIFA

– Rice: CRRI, NRCPB, DUSC

• Statistical and Computational Genomics

– Across species: IASRI and other participating institutes

Traits – Eight (Moisture stress, salinity, sodicity, acidity, heat, cold, submergence and anoxia)

Species: Microbes – several; Plant – Seven; Fish – Four; Animals – Two

Institutions - 36 (ICAR Institutes, SAUs, CUs and IIT)


Our Focus

Emphasis on the master regulators such as transcription factors and signaling components while prospecting genes and alleles for complex abiotic stress tolerance traits

Prospecting Genes for Moisture Stress Tolerance

Figure: ABA; transcription factors DREB2A, DREB2B, bZIP and MYC/MYB; cis-elements DRE/CRT, ABRE and MYCRS/MYBRS; mRNA.

Involvement of different transcription factors in the induction of stress genes in response to moisture stress. Osmotic stress signaling generated by moisture stress seems to be mediated by transcription factors such as DREB2A, DREB2B, bZIP, MYC and MYB, which act as transcription activators and interact with DRE/CRT, ABRE or MYCRS/MYBRS elements in the promoters of stress genes.
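The link between transcription factors and promoter elements suggests a simple way to flag candidate stress-responsive genes: scan the promoter sequence for the corresponding cis-element motifs. The sketch below is illustrative only. The motif strings are simplified core consensus sequences supplied as assumptions (CCGAC for DRE/CRT, ACGTG for ABRE, CANNTG for the MYC recognition site), the MYB site is left out, only the forward strand is scanned, and the promoter sequence is made up.

```python
import re

# Simplified core motifs (assumptions, not taken from the slide).
ELEMENT_PATTERNS = {
    "DRE/CRT": "CCGAC",
    "ABRE": "ACGTG",
    "MYCRS": "CA[ACGT]{2}TG",
}

# Transcription factor classes named on the slide for each element.
ELEMENT_TO_FACTORS = {
    "DRE/CRT": "DREB2A, DREB2B",
    "ABRE": "bZIP",
    "MYCRS": "MYC",
}


def scan_promoter(sequence):
    """Return 0-based start positions of each element on the forward strand."""
    sequence = sequence.upper()
    hits = {}
    for name, pattern in ELEMENT_PATTERNS.items():
        # The lookahead lets overlapping occurrences be reported as well.
        hits[name] = [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]
    return hits


promoter = "TTACCGACATCACGTGGCAA"  # hypothetical promoter fragment
for element, positions in scan_promoter(promoter).items():
    print(element, positions, "bound by", ELEMENT_TO_FACTORS[element])
```

A motif hit is only a hint; whether the element is functional in a given promoter has to be confirmed experimentally.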

Abiotic Stress Responsive Gene

Statistical and Computational Genomics

Data analysis support

Development of algorithms Capacity building

Creation of centralized facility

The statistical and computational aspect is complex and enormous. This will be a beginning. As we go along, need-based strengthening will take place with more resources and manpower.

Installed Applied Biosystems (ABI) 3730xl DNA Analyzer for high-throughput genotyping of rice core collections using multiplexed microsatellite markers

Creation of a High Throughput Genotyping Facility

6912 core germplasm samples can be genotyped for 4 markers in 5 days

Genotyping using 72 markers will require 90 days of uninterrupted run
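The 90-day figure follows directly from the stated throughput of 4 markers per 5-day run. A small sketch of the arithmetic is given below; the function and variable names are hypothetical.

```python
import math

SAMPLES = 6912        # core germplasm samples per run
MARKERS_PER_RUN = 4   # markers genotyped in one run
DAYS_PER_RUN = 5      # duration of one run


def days_required(n_markers, markers_per_run=MARKERS_PER_RUN, days_per_run=DAYS_PER_RUN):
    """Days of uninterrupted running needed to genotype all samples."""
    runs = math.ceil(n_markers / markers_per_run)
    return runs * days_per_run


print(days_required(72))  # 90 days (18 runs of 5 days)
print(SAMPLES * 72)       # 497664 genotype data points in total
```

The same function can be used to budget instrument time for other marker sets, assuming the per-run throughput stays the same.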

Installed Bead-array Platforms for highly Multiplexed Illumina GoldenGate and Infinium Assays for High-throughput Genotyping of SNP Markers

Illumina BeadXpress; Illumina iScan

What is Gained from Bio-prospecting and Allele Mining

NOVEL GENES TO MEET THE CHALLENGES OF CLIMATE CHANGE

NEW ALLELES OF KNOWN GENES FOR TARGET TRAIT

ELITE GENOTYPES WITH DESIRABLE TRAIT

WELL CHARACTERIZED GERMPLASM RESOURCES

NEW FACILITIES FOR HIGH THROUGHPUT GENOTYPING, PHENOTYPING AND COMPUTATIONAL GENOMICS