From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing...

52
Institute for Environmental Genomics, Department of Botany and Microbiology, University of Oklahoma, Norman, OK 73019 Jizhong (Joe) Zhou [email protected] ; 405-325-6073 DOE ERSP Annual PI Meeting, Lansdowne, VA, April 7- 9, 2008 From Community Structure to Functions: GeoChip Development and Its Applications to Bioremediation

Transcript of From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing...

Page 1: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Institute for Environmental Genomics, Department of

Botany and Microbiology, University of Oklahoma,

Norman, OK 73019

Jizhong (Joe) Zhou

[email protected]; 405-325-6073

DOE ERSP Annual PI Meeting, Lansdowne, VA, April 7-

9, 2008

From Community Structure to

Functions: GeoChip Development

and Its Applications to

Bioremediation

Page 2: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Brief description of GeoChip technologies

• GeoChip 3.0

Functional Gene categories

Automatic pipeline

• Applications of GeoChip 2.0

Responses of ethanol injection experiments

Old Rifle

• Challenges and future perspectives

Outline

Page 3: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Program mission

Understand the hydrogeological and biogeochemical

factors that govern the distribution and functioning of

subsurface microbial communities

Develop techniques to quantitatively identify and

quantitate active members of subsurface microbial

communities and relate growth and activity to rates of

biogeochemical reactions

• Goal of this study

Develop and use microarray technologies to

provide a fundamental understanding of the

structure, functions, activities of microbial

communities at DOE sites

Objectives

Page 4: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Some Grand Challenges in 21st Century

Biology

• Linking genomics toecology

Linking genomics toecological processes andfunctions

Responses to CO2, globalwarming and waterprecipitation

• Linking biodiversity toecosystem functions

• Informational scalingFrom cells to individuals,populations,communities, ecosystemsand biosphere.

Spatial, temporal

Page 5: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Challenges in MicrobialEcology

• Extremely high diversity: 5000 species/g soil

• Uncultured: 99% of the microbial species are uncultured

• Small: Difficult to see

• Limited use of morphology: It could not provide enough

information and resolution to differentiate different

microorganisms

• Molecular biology: Greatly advance our understanding of

microbial diversity

• Limitations of conventional molecular biology

tools: Slow, complicated, labor-intensive, insensitive,

unquantitative, and/or small-scale.

• Genomics technologies: Revolution in microbial ecology

Page 6: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Open format detection

High throughput Sequencing.;

• 454 sequencing, 250bp, 60-100mb/run

• Solexsa, SOLiD: 35bp, 1-2 gb/run

Proteomics

• Close format

PhyloChip: 16S genes

GeoChip: functional genes

High throughput approaches

Page 7: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Comparisons between open

format and close format detection

Open format Close format

Affected by random

sampling

Yes No

Effects by dominant

populations

Yes No

Finding new things Yes No

Reliable comparison

across samples

? Yes

Page 8: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Detecting functions: Geochemical

processes

• Higher resolution: Species-strain level

resolution

• Quantitative: no PCR is involved

Main advantages of GeoChip

compared to other approaches

Page 9: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Microarrays: Glass slides or othersolid surface containing thousands of

genes arrayed by automated equipment.

• FGAs contain probes from the genesinvolved in various geochemical,ecological and environmental processes.

C, N, S, P cylcings

Organic contaminant degradation

Metal resistance and reduction

• Typical format: 50mer oligonucleotidearrays

• Useful for studying microbialcommunities

Functional gene diversity and activity

Limited phylogenetic diversity.

GeoChip or Functional Gene Arrays

(FGAs)

Page 10: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Pioneering advances in microarray-

based technologies to address challenges

in microbial community genomics

• Challenges:

Specificity: Environmental sequence divergences.

Sensitivity: Low biomass.

Quantification:

Existence of contaminants: Humic materials, organic

contaminants, metals and radionuclides.

• SolutionsDeveloping different types of microarrays and novel chemistry to

address different levels of specificity.

Developing novel signal amplification strategy to increase

sensitivity

Optimizing microarray protocols for reliable quantification.

Page 11: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Specificity, sensitivity, quantitationWu et al. 2001; AEM:67: 5780-5790

Rhee et al. 2004, AEM 70:4303-4317

Tiquia et al. 2004. BioTechniques 36, 664-675

Wu et al. 2004; EST, 38: 6775-6782

He et al, 2007; The ISME J, 1: 67-77

He and Zhou, 2008, AEM, in press

• Probe design criteriaHe et al. 2005. AEM. 71:3753-3760

Liebich et al. AEM, 72:1688-1691

• New probe designing software: CommOligoLi et al. 2005. Nucl. Acids Res. 33:6114-6123

• Whole community genome amplification (WCGA)Wu et al. 2006. AEM: 72:4931-4941.

• Whole community RNA amplification (WCRA)Gao et al, 2007, AEM: 73: 563-571.

• Review:Gentry et al. 2006, Microbial Ecology, 52: 159-175.

Zhou and Thompson, 2002, Curr Opion Biotech: 13:204-207

Zhou, 2003; Curr Opion. Microbiol, 6:288-294

• ApplicationsHe et al, 2007; The ISME J, 1: 67-77

Leigh et al, 2007, The ISME J, 1: 163-179

Yergeau et al, 2007, The ISME J, 1: 134-148.

Zhou et al. 2008. PNAS, in press

Issues related to specificity, sensitivity and quantitation

Page 12: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Empirical criteria for designing

specific probes

Criteria for designing specific probes

• Similarity: 90%

• Stretches: 20 bases

• Free energy: - 35 kcal/mol

• Specific hybridization was also observed for some

probes with less than 98% similarity to the target

sequences, suggesting that strain level of resolution

could be possibly achieved with some 50mer probes

under the hybridization conditions examined.

• Hybridization condition: 50 C, 50%

formamide

• Evaluation of 516 probes with homology

between less than 86% and 100% or a common

identical sequence stretch of at least 16 bases.

• SNR > 2 as positive10 15 20 25 30 35 40 45 50

60

64

68

72

76

80

84

88

92

96

100

% o

vera

ll si

mila

rity

identical sequence stretch between probe and target (bp)

1012

14

16

18

20

-60

-50

-40

-30

70

75

8085

90

hyb

ridiz

atio

n fre

e e

nerg

y (k

cal m

o-1

)

overall s

imila

rity (%

)idententical sequence stretch

between probe and target (bp)

He et al. 2005. AEM.

71:3753-3760

Liebich et al. AEM,

72:1688-1691

Page 13: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Resolution of 50mer oligo arrays

Phylogenetichierarchies

Sequence similarities (%) ± standard deviation†

DsrAB nirS nirK nifH amoA pmoA

Strain 0.93 ± 0.03 0.93 ± 0.04 0.91 ± 0.05 0.93 ± 0.06 0.99 ± 0.01 0.95 ± 0.03

Species 0.73 ± 0.13 0.70 ± 0.18 0.76 ± 0.00 0.82 ± 0.06 0.75 ± 0.11 0.79 ± 0.55

Genus 0.70 ± 0.09 0.67 ± 0.16 0.71 ± 0.07 0.70 ± 0.07 0.71 ± 0.07 0.75 ± 0.15

Family orhigher

0.66 ± 0.13 0.57 ± 0.14 0.62 ± 0.14 0.66 ± 0.10 0.65 ± 0.13 -

• Based on pure cultures

• FGA can achieve species level identification

Page 14: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

M A1 B1 A2 B2 A3 B3 A4 B4 A5 B5 A6 B6 A7 B7 A8 B8 M

10fg

As low as 10fg (2 cells)

can be detected

Whole community genome

amplification (WCGA) approach for

increasing hybridization sensitivity

Quantitative after amplification

Wu et al. 2006. AEM: 72:4931-4941, top 11th most requested paper in AEM.

1-100 ng DNAs are typically used for analysis

Log DNA Template (ng)

-2 -1 0 1 2 3

2.4

2.8

3.2

3.6

4.0

4.4

Gene 7363464: r2=0.96

Gene 10441633: r2=0.96

Gene 3452465: r2=0.99

Gene 11967269: r2=0.96

Gene 9651045: r2=0.98

Page 15: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Fluidized Bioreactor

Reactors for removing

nitrate.

8

15 genesAmplified

Unamplified

Whole Community RNA Amplification

WCRA)

• 5 ug RNA --- umamplified.

• 5 ug amplified RNA from 100 ng RNA as template for amplification

• 8 genes are detected in both amplified and unamplified RNAs

• 7 more genes are detected by RNA amplification

Gao et al, 2007, AEM: 73:

563-571. Top 8th most

requested paper in AEM.

Page 16: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

CommOligo --- New oligo probe design

program for community analysis

Number and specificity of designed probes (50-mer) by different programs

Group sequences of nirS and nirK

(842 gene sequences)

Programs used Total

ORFs

ORFs

rejected

Probes

designed

Specific

probe

Non-

specific

Group -specific

ArrayOligoSe ector

842

0

842

117

725

0

OligoArray

842

35

807 70

737 0

OligoArray 2.0

842

51

791

35

756

0

OligoPicker

842

657

185 141

44 0

CommOligo

842

512

330

147

0

183

• Useful for both whole genome microarrays and community arrays

• Able to design group-specific probes

• Better performance than other programs

Li et al. 2005. Nucl. Acids Res. 33:6114-6123

Page 17: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Probes Designed for a Second

Generation FGA (GeoChip 2.0)

• Nitrogen cycling: 5089

• Carbon cycling: 9198

• Sulfate reduction: 1006

• Phosphorus utilization: 438

•Organic contaminant degradation: 5359

•Metal resistance and oxidation: 2303

• 300 probes from community sequences

Total: 23,408 genes

•23,000 probes designed•Will be very useful for community and ecological

studies

Page 18: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

GeoChip for microbial

community analysis

He, Z, TJ Gentry, CW Schadt, L Wu, J Liebich, SC Chong,

Z Huang, W Wu, B Gu, P Jardine, C Criddle, and J.

Zhou. 2007. GeoChip: a comprehensive microarray for

investigating biogeochemical, ecological and

environmental processes. The ISME J. 1: 67-77.

Highlighted by:• A press release by Nature Press Office

•Reported by many Newspapers

• National Ecology Observatory Networks (NEON),

Roadmap

• National Academy of Sciences, Metagenomics report

Page 19: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

New features for GeoChip 3.0

GeoChip 3.0 is more comprehensive and more

representative. It covers >37,700 gene sequences of

290 gene families.

Automatically retrieved sequences by key words is

verified by HUMMER and then unrelated sequences

are removed.

A software package for sequence retrieval, probe and

array design, probe verification, array construction,

array data analysis, and information storage.

Automatic update greatly facilitates the management

of such a complicated functional gene array.

New analysis pipeline with many statistical and

mathematical tools

Page 20: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Summary of GeoChip 3.0 probe and

sequence information by category

Gene category No. of gene

categories

No. of

downloade

d sequences

No. of

sequences for

probe design

Total no.

of probes

designed

Total no.

of CDS

covered

Carbon degradation 24 18337 4092 1924 3192

Carbon fixation 5 4682 2218 887 1614

Methane reduction an d

oxidation 3 4134 1853 447 752

Metal resistance and

reduction 43 28820 9625 3510 7021

Nitrogen cycling 12 20800 19229 4006 7334

Organic remediation 197 55598 18650 7093 12843

Phosphorus utilization 2 1876 1441 471 1069

Sulfur cycling 3 2523 2291 1464 1800

Others (e.g. gyrB ) 1 8163 5252 1040 2089

Total 290 144,933 64,651 20,842* 37,714

Page 21: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Carbon degradation

Gene/category Unique probe Group probe Total probe Total covered CDS

Carbon degradation

acetylglucosaminidase 32 75 107 214

amyA 61 170 231 467

amyX 0 5 5 12

apu 4 2 6 8

ara 21 65 86 174

ara_fungi 23 10 33 50

cda 11 6 17 25

cellobiase 36 41 77 145

endochitinase 199 168 367 606

endoglucanase 64 24 88 109

exochitinase 15 16 31 63

exoglucanase 54 9 63 83

glucoamylase 23 35 58 111

glx 17 4 21 33

isopullulanase 0 1 1 2

lip 25 4 29 39

mannanase 20 9 29 45

mnp 17 2 19 22

nplT 4 16 20 39

pectinase 27 2 29 33

phenol_oxidase 126 81 207 272

pulA 21 88 109 231

xylA 18 72 90 188

xylanase 60 67 127 221

Subtotal 878 972 1850 3192

Page 22: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Carbon fixation and methane metabolism

Gene/category Unique probe Group probe Total probe Total covered CDS

Carbon fixation

aclB 20 13 33 53

CODH 13 63 76 138

FTHFS 68 126 194 323

pcc 8 249 257 585

rubisco 139 146 285 515

Subtotal 248 597 845 1614

Methane metabolism

mcrA 104 106 210 392

mmoX 22 22 44 90

pmoA 85 39 124 270

Subtotal 211 167 378 752

Page 23: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Nitrogen cycling

Gene/category Unique probe Group probe Total probe Total covered CDS

Nitrogen cycling

amoA 100 95 195 528

gdh 26 19 45 94

hao 2 4 6 18

napA 11 22 33 83

narG 289 160 449 656

nasA 67 86 153 259

nifH 885 333 1218 2467

nirK 255 143 398 1005

nirS 351 155 506 923

norB 55 25 80 102

nosZ 191 119 310 596

ureC 57 218 275 603

Subtotal 2289 1379 3668 7334

Page 24: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Gene/category Unique probe Group probe Total probe Total covered CDS

Phosphorus

ppk 47 67 114 237

ppx 44 296 340 832

Subtotal 91 363 454 1069

Sulphur

dsrA 595 155 750 954

dsrB 371 131 502 685

sox 47 52 99 161

Subtotal 1013 338 1351 1800

Phosphorus utilization and sulphur

cycling

Page 25: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Metal reduction and resistance

Gene/category Total probe Total covered CDS

Metal reduction and resistanceArsenic resistance 396 803

Cadmium resistance 1254 2808

Chromium resistance 543 1292

Mercury resistance/reduction 292 594

Nickel resistance 42 88

Zinc resistance 1044 2197

Other metals and metalloids 1803 4135

Other metal reduction 413 449

Subtotal 5,787 12,366

Page 26: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Organic contaminant degradation

Gene/category Total probe Total covered CDS

Contaminant degradationBTEX and related aromatics 423 3084

Chloronated aromatics 250 473

Nitroaromatics 122 489

Heterocyclic aromatics 38 66

Hydrocarbons (e.g., PAHs) 179 2089

Chloronated solvents 180 360

Pesticides 1258 3083

Other chemicals and by-products 3936 7855

Subtotal 6,386 17,499

Page 27: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Energy-related metabolism processes

Gene/category Total probe Total covered CDS

Energy-related metabolism processesCytochromes 365 365

Hydrogenase 48 85

Subtotal 413 450

Page 28: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Two major types of applications

• Addressing site-specific research

questions:

Need understanding community diversity first.

• Profiling microbial communities as a

generic tool

Page 29: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Examples of most recent applications

• GroundwatersMonitoring bioremediation processes: Ur, Cr

Impacts of contaminants on microbial communities

• SoilsGrass land soils: effects of plant diversity and climate changes on soilcommunities

Agricultural soils: tillage, no tillage

Oil-contaminated soils

• Aquatic environmentsHydrothermal vents

Marine sediments

River sediments

• BioreactorsWastewater treatments

Biohydrogen

Microbial fuel cell

• Bioleaching

Page 30: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Applications to Bioremediation

• Oak RidgeImpacts of contaminants on groundwater microbialcommunity, natural attentuation

Responses of microbial communities to ethanol treatments

Microbial community composition in soil microcosmsduring U-reduction, U-oxidation, and mobilization phases

• HanfordImpacts of contaminants on microbial communities, 5samples from J. Fredickson

Responses of microbial communities to the addition ofHydrogen Release Compound (HRC)

• Old RifleIn-situ U(VI) bioremediation under sulfate-reducing andFe-reducing conditions

• Lake DepueMetal contamination

Page 31: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Overview of Microarray

Analysis

• DNA extraction from environmentalsamples, multiple samples, times

• Whole Community Rolling CircleAmplification (1-100ng DNA)

• Label DNA with Cy5

• Hybridization to GeoChip at 42, 45 or 50Cwith 50% formamide

• Data processing with automatic pipeline

• Statistical analysis

• Data interpretation

Page 32: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

(1) Vacuum strip

volatiles; chemically

neutralize acid and

precipitate metals

Ethanol addition

(2) Denitrify

water in

FBR

N2

U(VI) U(IV)

103 026 100, 101, 102 104 024Well

Water

Flow

Ab

ove g

rou

nd

tre

atm

en

t

(3) Inject

treated

water into

outer well

So

urc

e w

ate

r

Multi-level wells

12

3

415 m

2 m

Biostimulation of microbial populations for Ur removal

• Above ground denitrification and neutralization of groundwater

• in situ biostimulation with ethanol and reduction of U(VI)

Page 33: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Operational Periods During Bioremediation

Sampling for:

Microbial community

dynamic changes

Microbial community

stability

Microbial community

structure across all wells

885 992

0 68 136

137 184

185 529

Preconditioning

Normal operation

Stability, no DO

control

Stability

DO control

No DO control

U(VI) reduction

Denitification

530 637

638 712

713 754

806 884

755 805

Normal operation

day

Page 34: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Time (days)

191 212 226 233 248 255 622 641 719

Ui

(VI

M)

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

Total log

2 Signal Intensity/L

0

50

100

150

200

250

300

Uranium

cytochrome genes

USA EPA drinking

water maximum limit :

uranium (VI)<30 g/l

Relationship between uranium and c-type

cytochrome genes

• Uranium below drinking water standard

• Significant correlation between U and c-type cytochrome

Page 35: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Representative significant cytochrome c and

dsrAB genes

• Significant correlation between U and c-type cytochrome and dsrAB

• Environmental clones from the sites are highly abundant.

Page 36: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

35.2%

12

.2 %

0 3.5

01

.6

746754810

850

887

901

992

746

754

810

850 887

901

992No Ethanol

Ethanol

Detrended Correspondence AnalysisFW101-2

FW102-3

• Time points separate primarily based on ethanol injections. Cluster to the

right had not ethanol injections, the cluster on the left is after injections have

restarted.

• Both wells cluster together at each time point. Exceptions are days 850 and

887 (highlighted) – both during the reoxidation period. Well FW101 had

higher levels of DO than FW102 which would explain those differences.

Page 37: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Operational Periods During Bioremediation

Sampling for:

Microbial community

dynamic changes

Microbial community

stability

Microbial community

structure across all wells

885 992

0 68 136

137 184

185 529

Preconditioning

Normal operation

Stability, no DO

control

Stability

DO control

No DO control

U(VI) reduction

Denitification

530 637

638 712

713 754

806 884

755 805

Normal operation

day

Page 38: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Stanford-ORNL Research Experiment on in situ

Bioremediation of U(VI) Contaminated Sediments

Outer loopInner loop

Groundwater

flow

Outer loop

Injection:FW024

Extraction:

FW103

Inner loop

Injection: FW104

Extraction:

FW026

Monitoring wells

Level 2 (13.70m)

Level 3 (12.19m)

Level 4 (10.67m)

1 m

FW103FW026

FW104FW024

FW100

FW101

FW102

Page 39: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

-1.0 2.0

-0.6

1.0

FW024

FW104

FW026

FW103

FW100-2

FW100-3

FW100-4

FW101-2

FW101-3

FW101-4

FW102-2

FW102-3

FW102-4

36.4%

14.4

%

• Samples in inner wells are clustered together

• Outer wells together

Principal component analysis of

microbial community structure

Page 40: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

-1.0 2.0

-1.0

1.5

FW024

FW104FW026

FW103

FW100-2

FW100-3

FW100-4FW101-2

FW101-3

FW101-4

FW102-2

FW102-3

FW102-4

• Similar trends to community analysis

results

71.1%

16.6

%

Inner loop well

Outer loop well

Principal component analysis of Geochemical data

Page 41: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

pH NM

M

pH

Unexplained

NM

M

pH NM

pHM

MNM

A

Variation partitioning of microbial functional gene communities

into pH, nonmetal ion (NM) and metal ion (M) components.

A: General outline.

B: Entire microbial functional gene communities. NM: SO42-, HCO3

-, Cl-;

M: U, Na, K, Mg, Al, Mn.

B

8

12

19

17

44

2

15

Variation partition

Page 42: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

2005

Gallery

2004

Gallery

Injection wells

Injection wells

Fe-reducing

Sulfate-reducing

Page 43: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Results from Old Rifle

• DCA (Detrended Correspondence Analysis) of geochemistry

data

Ferrous iron, sulfide,

dissolved O2, pH,

conductivity, potential &

Eh are used for DCA.

Background samples

(B05 7/27 & 9/19) cluster

together; Iron-reducing

conditions samples (M16

9/5 & 9/19) form another

cluster; And when

condition is driven to

sulfate-reducing

condition, samples (M24

8/10 & M21 8/10) cluster

together.

M16 9/5

M16 9/19 M21 7/27

M21 8/10

M24 7/27

M24 8/10

B05 7/27

B05 9/19

0

0

0.2 0.4 0.6

0.10

0.20

Axis 1

A

x

i

s

Iron-reducing condition

Sulfate-reducing condition

Background

Iron-reducing condition

Page 44: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• DCA of functional genes detected

by GeoChip

Community DCA result

closely reflects the

DCA results of

environmental

chemistry. It

demonstrates three

clusters of samples

under same

environmental

conditions:

background (B05 7/27

& 9/19), Iron-reducing

condition (M16 9/5 &

9/19), & Sulfate-

reducing condition

(M21 8/10 & M24 8/10).

M16 9/5/

M16 9/19

M21 7/27M21 8/10

M24 7/27

M24 8/10

B05 7/27

B05 9/19

0

0

40 80

40

80

Axis 1

A

x

i

s

Background

Iron-

reducing

condition

Sulfate-reducing condition

Page 45: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• CCA ( Canonical Correspondence Analysis) of

environmental parameters & functional genes

Ferrous Iron is most

significant geochemistry

variable with community

structure. The CCA

model with six variables

is relatively strong

(P=0.1051).

Y-axis is defined almost

solely by ferrous iron.

With acetate injection,

iron-reducing sample

(M16) and sulfate-

reducing samples (M21 &

M24) appeared to be

strongly governed by

ferrous iron (as shown

with green arrow).

M16 9/5

M16 9/19M21 7/27

M21 8/10

M24 7/27

M24 8/10

B05 7/27

B05 9/19

ferrous

DO

pHconductivity

potential

Eh

0

0

10 20 30

10

20

30

Axis 1

Axis

Page 46: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

M16 9/5/06

M16 9/19/06

M21 7/27/06

M21 8/10/06

M24 7/27/06

M24 8/10/06

B05 7/27/06

B05 9/19/06

Ui

(VI

M

)

0.0

.2

.4

.6

.8

1.0

1.2

No

rma

lize

d S

ign

0

10

20

30

40

50

60

70

Uranium

Cytochrome genes

Relationships between uranium and

cytochrome gene abundance

The total abundance of c-type cytochrome was highly

coincident with the U(VI) concentrations (r=0.71, P<0.05)

r=0.71, P<0.05

Page 47: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

0

0.2

0.4

0.6

0.8

1

1.2

1.4

M16

9/5

/06

M16

9/1

9/06

M21

7/2

7/06

M21

8/1

0/06

M24

7/2

7/06

M24

8/1

0/06

B05

7/2

7/06

B05

9/1

9/06

Av

era

ge

Sig

na

l In

ten

sit

y

(No

rma

lize

d)

Geobacter

Desulfovibrio

The Mantel test was conducted to see the relationship between such genetic

patters and U(VI) concentrations, and significant correlations (P=0.016) were

observed.

Geobacter and Desulfovibrio are dominant

Page 48: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Experimental challenges and

future needs• Universality --- one platform

Probe representation

Only detect what put on the arrays

Sequences-based metagenomics

• Universal quantitative standards for comparison

Different times

Different experimental sites

Different labs

• Data analysis

Data processing

Modeling

Data simulation

Prediction

Page 49: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

454 sequencing

- PCR amplicons attached to beads(1amplicon/bead),

- Amplified by emulsion PCR, and

- Deposited in sequencing plate (1 bead/well)

Sequencing by synthesis (pyrosequencing)

Flowgram (~read)

400,000 reads per plate

250 bp reads for FLX system

100 bp reads for GS20 system

100MB/run in 4 hrs

Page 50: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Dimensionality problem

The gene number, N, is >> sample number, M.

• Reverse problem: dx/dt = f(X, E) = AX

Unlike typical networks such as transportation, network structure is known

For cellular network, we do not know the network structure, and we shouldfind matrix A.

• Linking community structure to functionsStatistical models

Mechanistic models

• Heterogeneous data: hybridization data, sequences, protein data

• Scaling: Simulating data at different organizational levels:DNA, RNA, proteins, metabolites, cells, tissues, organs, individuals,populations, communities, ecosystems, and biosphere

• Spatial scaleNanometers --- Meters --- Kilometers --- Thousands of kilometers

• Time scaleSeconds (molecules) --- hours (cells) --- Days (populations) --- Years(communities and ecosystems)

• Incorporation of microbial community dynamics into ecosystemmodels

Mathematical Challenges --- in terms of

modeling and simulation

Page 51: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

Acknowledgement

OU/ORNL

Zhili He

Liyou Wu

Ye Deng

Joy Van Nostrand

Meiying Xu

Sanghoon Kang

Terry Gentry

Chris Schadt

David Watson

Phil Jardine

Baohua Gu

Tony Palumbo

Stanford University

Craig Criddle

Weimin Wu

Michigan State University

James M. Tiedje

Mary Beth Liegh

Jim Cole

Dieter Tourlousse

LBL

Terry Hazen

University of Minnesota

Peter Reich

Central South University, China

Xueduan Liu

Harbin Institute of Technology, China

Aijie Wang

Page 52: From Community Structure to Functions: GeoChip Development ...€¦ · solid surface containing thousands of genes arrayed by automated equipment. • FGAs contain probes from the

• Specificity, sensitivity, quantitationWu et al. 2001; AEM:67: 5780-5790

Rhee et al. 2004, AEM 70:4303-4317

Tiquia et al. 2004. BioTechniques 36, 664-675

Wu et al. 2004; EST, 38: 6775-6782

He et al, 2007; The ISME J, 1: 67-77

He and Zhou, 2008, AEM, in press

• Probe design criteriaHe et al. 2005. AEM. 71:3753-3760

Liebich et al. AEM, 72:1688-1691

• New probe designing software: CommOligoLi et al. 2005. Nucl. Acids Res. 33:6114-6123

• Whole community genome amplification (WCGA)Wu et al. 2006. AEM: 72:4931-4941.

• Whole community RNA amplification (WCRA)Gao et al, 2007, AEM: 73: 563-571.

• Review:Gentry et al. 2006, Microbial Ecology, 52: 159-175.

Zhou and Thompson, 2002, Curr Opion Biotech: 13:204-207

Zhou, 2003; Curr Opion. Microbiol, 6:288-294

• ApplicationsHe et al, 2007; The ISME J, 1: 67-77

Leigh et al, 2007, The ISME J, 1: 163-179

Yergeau et al, 2007, The ISME J, 1: 134-148.

Zhou et al. 2008. PNAS, in press

Issues related to specificity, sensitivity and quantitation