Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of...

6
Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli Byung-Kwan Cho, Christian L. Barrett, Eric M. Knight, Young Seoub Park, and Bernhard Ø. Palsson 1 Department of Bioengineering, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA 92093-0412 Edited by Sankar Adhya, National Institutes of Health, Bethesda, MD, and approved October 14, 2008 (received for review July 24, 2008) Broad-acting transcription factors (TFs) in bacteria form regulons. Here, we present a 4-step method to fully reconstruct the leucine- responsive protein (Lrp) regulon in Escherichia coli K-12 MG 1655 that regulates nitrogen metabolism. Step 1 is composed of obtaining high-resolution ChIP-chip data for Lrp, the RNA polymerase and expression profiles under multiple environmental conditions. We identified 138 unique and reproducible Lrp-binding regions and classified their binding state under different conditions. In the second step, the analysis of these data revealed 6 distinct regulatory modes for individual ORFs. In the third step, we used the functional assign- ment of the regulated ORFs to reconstruct 4 types of regulatory network motifs around the metabolites that are affected by the corresponding gene products. In the fourth step, we determined how leucine, as a signaling molecule, shifts the regulatory motifs for particular metabolites. The physiological structure that emerges shows the regulatory motifs for different amino acid fall into the traditional classification of amino acid families, thus elucidating the structure and physiological functions of the Lrp-regulon. The same procedure can be applied to other broad-acting TFs, opening the way to full bottom-up reconstruction of the transcriptional regulatory network in bacterial cells. ChIP-chip transcription factor T ranscriptional regulatory systems often regulate the formation rates and the concentration of small molecules by 2 feedback loops that regulate the transporters and metabolic enzymes. In many cases, these 2 feedback loops are connected by a common transcription factor (TF) that senses the concentration of the small molecule (1). Little is known at present about the transition between the regulatory modes in the feedback loop motifs for global TFs in bacteria. One such transcription factor is the leucine- responsive protein (Lrp), which is a global transcription regulator widely distributed throughout the bacteria including Escherichia coli (2–4). The Lrp regulon includes genes involved in amino acid biosynthesis and degradation, small molecule transport, pili syn- thesis, and other cellular functions including 1-carbon metabolism (2, 4–6). The regulatory action of Lrp on target genes is often modulated by the binding of the small effector molecule leucine and in effect endows Lrp with the ability to affect transcriptional regulation in all possible ways. That is, upon addition of leucine to the environment, the activity of Lrp can be enhanced, reversed, or unaffected (2, 4, 7). Little is known about in vivo Lrp-binding events at the genome scale in the presence or absence of leucine and the extent to which the different modes of regulation are used for different metabolites. Such information is needed to reconstruct the Lrp regulon and the understanding of nitrogen metabolism. In this study, we applied a systems approach by integrating genome-scale data from chromatin immunoprecipitation followed by microarray hybridization (ChIP-chip) for Lrp and RNA poly- merase and from expression profiling to reconstruct the Lrp regulon. To achieve such reconstruction, we developed a 4-step process (Fig. 1). We first sought to comprehensively establish the Lrp-binding regions on the E. coli genome and any DNA sequence motif(s) correlated with the Lrp regulatory action. We measured the changes in RNA polymerase (RNAP) occupancies and mRNA transcript levels on a genome scale to determine the regulatory mode for each of the identified Lrp-binding regions under leucine- perturbed growth conditions. Second, we determined the regula- tory modes governed by Lrp. Third, this enabled us to identify logical motif structures composed of 2 feedback loops for trans- porting and metabolizing small molecules. Fourth, we could classify the amino acids and other metabolites in to groups that had the same regulatory network motifs, and how these motifs were sys- tematically shifted by the presence of leucine. The physiological role of the Lrp regulon is established through the reconstruction of its structure. Results Step 1: Identifying Lrp-Binding Regions and the Effects of Binding on Gene Expression. Four sets of experiments on a genome-wide scale were performed to achieve these goals; (i) determination of Lrp- binding regions, (ii) promoter-profiling using rifampicin-treated cells, (iii) measurements of RNAP rearrangement, and (iv) mea- surements of changes in mRNA transcripts. Determination of Lrp-binding regions on a genome-wide scale. Lrp has been extensively characterized by in vitro DNA-binding experi- ments and in vivo mutational analysis; however, direct analysis of in vivo Lrp binding has not been fully described (2, 4, 8). Here, we employ the ChIP-chip approach to determine the in vivo Lrp- binding regions in E. coli cells growing in minimal media in exponential phase in the presence and absence of exogenous leucine and in stationary phase. Before microarray hybridization, we used quantitative PCR (qPCR) to validate the enrichment fold of the previously characterized Lrp-binding regions with ilvIH (9), gcvTHP (10, 11), and gltBDF operons (12) in immunoprecipitated DNA (IP-DNA) obtained from the strain harboring 8 myc-tagged Lrp protein (13). The qPCR results demonstrate that Lrp-bound DNA fragments were selectively immunoprecipitated from the growing E. coli cells under the conditions (Fig. S1). To determine the genome-wide Lrp-binding regions, we next performed a hybridization of the IP-DNA (Cy5 channel) and mock IP-DNA (Cy3 channel) onto the high-resolution whole-genome tiling microarrays, which contained a total of 371,034 oligonucle- otides with 50-bp tiles overlapping every 25-bp on both forward and reverse strands (14). The normalized log 2 ratios obtained from the hybridization identify the genomic regions enriched in the IP-DNA sample compared with the mock IP-DNA sample and thereby represent a genome-wide map of in vivo interactions between Lrp protein and E. coli genome (Fig. 2A). The genome-wide Lrp- binding maps obtained from 3 different conditions, i.e., exponential growth phase in the presence (Fig. 2 Ai) and the absence (Fig. 2 Aii) of exogenous leucine and stationary growth phase in the absence of leucine (Fig. 2 Aiii) indicated that the Lrp association on the E. coli Author contributions: B.-K.C. and B.Ø.P. designed research; B.-K.C., E.M.K., and Y.S.P. performed research; B.-K.C. contributed new reagents/analytic tools; B.-K.C., C.L.B., and B.Ø.P. analyzed data; and B.-K.C., C.L.B., and B.Ø.P. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. 1 To whom correspondence should be addressed. E-mail: [email protected]. This article contains supporting information online at www.pnas.org/cgi/content/full/ 0807227105/DCSupplemental. © 2008 by The National Academy of Sciences of the USA 19462–19467 PNAS December 9, 2008 vol. 105 no. 49 www.pnas.orgcgidoi10.1073pnas.0807227105 Downloaded by guest on August 13, 2020

Transcript of Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of...

Page 1: Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli Byung-Kwan Cho, Christian L. Barrett,

Genome-scale reconstruction of the Lrp regulatorynetwork in Escherichia coliByung-Kwan Cho, Christian L. Barrett, Eric M. Knight, Young Seoub Park, and Bernhard Ø. Palsson1

Department of Bioengineering, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA 92093-0412

Edited by Sankar Adhya, National Institutes of Health, Bethesda, MD, and approved October 14, 2008 (received for review July 24, 2008)

Broad-acting transcription factors (TFs) in bacteria form regulons.Here, we present a 4-step method to fully reconstruct the leucine-responsive protein (Lrp) regulon in Escherichia coli K-12 MG 1655 thatregulates nitrogen metabolism. Step 1 is composed of obtaininghigh-resolution ChIP-chip data for Lrp, the RNA polymerase andexpression profiles under multiple environmental conditions. Weidentified 138 unique and reproducible Lrp-binding regions andclassified their binding state under different conditions. In the secondstep, the analysis of these data revealed 6 distinct regulatory modesfor individual ORFs. In the third step, we used the functional assign-ment of the regulated ORFs to reconstruct 4 types of regulatorynetwork motifs around the metabolites that are affected by thecorresponding gene products. In the fourth step, we determined howleucine, as a signaling molecule, shifts the regulatory motifs forparticular metabolites. The physiological structure that emergesshows the regulatory motifs for different amino acid fall into thetraditional classification of amino acid families, thus elucidating thestructure and physiological functions of the Lrp-regulon. The sameprocedure can be applied to other broad-acting TFs, opening the wayto full bottom-up reconstruction of the transcriptional regulatorynetwork in bacterial cells.

ChIP-chip � transcription factor

Transcriptional regulatory systems often regulate the formationrates and the concentration of small molecules by 2 feedback

loops that regulate the transporters and metabolic enzymes. Inmany cases, these 2 feedback loops are connected by a commontranscription factor (TF) that senses the concentration of the smallmolecule (1). Little is known at present about the transitionbetween the regulatory modes in the feedback loop motifs forglobal TFs in bacteria. One such transcription factor is the leucine-responsive protein (Lrp), which is a global transcription regulatorwidely distributed throughout the bacteria including Escherichiacoli (2–4). The Lrp regulon includes genes involved in amino acidbiosynthesis and degradation, small molecule transport, pili syn-thesis, and other cellular functions including 1-carbon metabolism(2, 4–6). The regulatory action of Lrp on target genes is oftenmodulated by the binding of the small effector molecule leucine andin effect endows Lrp with the ability to affect transcriptionalregulation in all possible ways. That is, upon addition of leucine tothe environment, the activity of Lrp can be enhanced, reversed, orunaffected (2, 4, 7). Little is known about in vivo Lrp-binding eventsat the genome scale in the presence or absence of leucine and theextent to which the different modes of regulation are used fordifferent metabolites. Such information is needed to reconstructthe Lrp regulon and the understanding of nitrogen metabolism.

In this study, we applied a systems approach by integratinggenome-scale data from chromatin immunoprecipitation followedby microarray hybridization (ChIP-chip) for Lrp and RNA poly-merase and from expression profiling to reconstruct the Lrpregulon. To achieve such reconstruction, we developed a 4-stepprocess (Fig. 1). We first sought to comprehensively establish theLrp-binding regions on the E. coli genome and any DNA sequencemotif(s) correlated with the Lrp regulatory action. We measuredthe changes in RNA polymerase (RNAP) occupancies and mRNAtranscript levels on a genome scale to determine the regulatory

mode for each of the identified Lrp-binding regions under leucine-perturbed growth conditions. Second, we determined the regula-tory modes governed by Lrp. Third, this enabled us to identifylogical motif structures composed of 2 feedback loops for trans-porting and metabolizing small molecules. Fourth, we could classifythe amino acids and other metabolites in to groups that had thesame regulatory network motifs, and how these motifs were sys-tematically shifted by the presence of leucine. The physiological roleof the Lrp regulon is established through the reconstruction of itsstructure.

ResultsStep 1: Identifying Lrp-Binding Regions and the Effects of Binding onGene Expression. Four sets of experiments on a genome-wide scalewere performed to achieve these goals; (i) determination of Lrp-binding regions, (ii) promoter-profiling using rifampicin-treatedcells, (iii) measurements of RNAP rearrangement, and (iv) mea-surements of changes in mRNA transcripts.Determination of Lrp-binding regions on a genome-wide scale. Lrp hasbeen extensively characterized by in vitro DNA-binding experi-ments and in vivo mutational analysis; however, direct analysis of invivo Lrp binding has not been fully described (2, 4, 8). Here, weemploy the ChIP-chip approach to determine the in vivo Lrp-binding regions in E. coli cells growing in minimal media inexponential phase in the presence and absence of exogenousleucine and in stationary phase. Before microarray hybridization,we used quantitative PCR (qPCR) to validate the enrichment foldof the previously characterized Lrp-binding regions with ilvIH (9),gcvTHP (10, 11), and gltBDF operons (12) in immunoprecipitatedDNA (IP-DNA) obtained from the strain harboring 8 �myc-taggedLrp protein (13). The qPCR results demonstrate that Lrp-boundDNA fragments were selectively immunoprecipitated from thegrowing E. coli cells under the conditions (Fig. S1).

To determine the genome-wide Lrp-binding regions, we nextperformed a hybridization of the IP-DNA (Cy5 channel) and mockIP-DNA (Cy3 channel) onto the high-resolution whole-genometiling microarrays, which contained a total of 371,034 oligonucle-otides with 50-bp tiles overlapping every 25-bp on both forward andreverse strands (14). The normalized log2 ratios obtained from thehybridization identify the genomic regions enriched in the IP-DNAsample compared with the mock IP-DNA sample and therebyrepresent a genome-wide map of in vivo interactions between Lrpprotein and E. coli genome (Fig. 2A). The genome-wide Lrp-binding maps obtained from 3 different conditions, i.e., exponentialgrowth phase in the presence (Fig. 2Ai) and the absence (Fig. 2Aii)of exogenous leucine and stationary growth phase in the absence ofleucine (Fig. 2Aiii) indicated that the Lrp association on the E. coli

Author contributions: B.-K.C. and B.Ø.P. designed research; B.-K.C., E.M.K., and Y.S.P.performed research; B.-K.C. contributed new reagents/analytic tools; B.-K.C., C.L.B., andB.Ø.P. analyzed data; and B.-K.C., C.L.B., and B.Ø.P. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

1To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0807227105/DCSupplemental.

© 2008 by The National Academy of Sciences of the USA

19462–19467 � PNAS � December 9, 2008 � vol. 105 � no. 49 www.pnas.org�cgi�doi�10.1073�pnas.0807227105

Dow

nloa

ded

by g

uest

on

Aug

ust 1

3, 2

020

Page 2: Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli Byung-Kwan Cho, Christian L. Barrett,

genome is dramatically sensitive to the nutrient richness. Using apeak detection algorithm based on the double-regression model(15), 34, 92, and 134 unique and reproducible Lrp-binding regionswere identified from the hybridizations in exponential phase in thepresence and absence of leucine and in stationary phase in theabsence of leucine, respectively (Tables S1 and S2).Identification and sequence analysis of Lrp-binding regions. Among atotal of 138 Lrp-binding regions, 29 regions (21%) were boundunder all 3 conditions examined (Fig. 2B). As expected, the Lrpassociations with target regions were very sensitive to the additionof exogenous leucine, as judged by the number of Lrp-bindingregions found under conditions in the absence and presence ofexogenous leucine. Only 30% of binding sites (29 of 97) overlappedunder the conditions in the absence and presence of exogenousleucine, and 65% of binding sites (63 of 97) were newly found in theabsence of exogenous leucine. A total of 41 Lrp-binding regionswere newly identified in the IP-DNA sample obtained from thestationary condition, supporting the hypothesis that Lrp plays animportant role in transcriptional regulation under stationarygrowth conditions (16). Before this study, 23 Lrp-binding sites hadbeen characterized by DNA-binding experiments in vitro andmutational analysis in vivo, 74% (17 of 23) of which were identifiedin this study (Fig. 2 A and C, Tables S1 and S2). The exceptions werelrp, osmC, ompC, micF, aidB, and csiD promoters, whose functions

are related to responses to osmotic stress and stationary growthphase. To determine whether the failure to detect Lrp binding atthese 6 sites was due to the sensitivity of the microarrays, weperformed conventional ChIP assays followed by qPCR analysisand confirmed the absence of Lrp binding in those regions underthe experimental conditions used in this study. Furthermore, theChIP-chip results were experimentally confirmed by using qPCR onarbitrarily selected sites from the 138 Lrp-binding regions (gltB, ilvI,gcvT, kbl, leuE, dadA, yffQ, C0719, sbcD, fimA, tppB, ftsQ, lysU,brnQ, sdaA, oppA, stpA, dppA, livJ, ydeN, and ynfF) and 3 controlregions (dmsA, sdhC, and paaZ). All of the selected Lrp-bindingregions exhibited enrichment as a log2 ratio range of 1.5 �5.1,whereas the 3 control regions showed no significant enrichment(Fig. S1). On the basis of this analysis, we concluded that all ornearly all Lrp-binding regions identified here are bona fide bindingsites.

We next assessed the locations of the Lrp-binding regions againstthe current annotated genome information (Genbank accessionnumber NC�000913). The Lrp-binding regions were observed notonly within intergenic (i.e., promoter and promoter-like) regionsbut also within intragenic (i.e., ORF) and convergent regions (i.e.,intergenic region between 3�-ends of convergently transcribedgenes) (Fig. S2). Judging from the fact that �10% of the E. coligenome is noncoding, our results indicate there is a strong prefer-ence for Lrp-binding targets to be located within the noncodingintergenic regions. To identify common DNA sequence motifs ofthe Lrp-binding regions, we developed an algorithm using both theDNA-binding position weight matrix (PWM) and the spacingbetween binding sites. The identified 15-bp sequence motif isstructured with flanking CAG/CTG triplets and a central AT-richsignal that together are reminiscent of DNA sequence character-istics important for nucleosome positioning and stability (Fig. 2D,Fig. S2). We were not able to detect different LRP-binding motifsfor subsets of LRP-binding regions corresponding to different LRPregulatory modes (see below.)Genome-wide rearrangement of RNAP association. To gain a bettermechanistic understanding of the transcriptional regulatory roles ofLrp in response to exogenous leucine, we measured the RNAPoccupancy on a genome scale to identify locations where RNAPoccupancy is increased or decreased because of changes in Lrp-binding levels and exogenous leucine levels (see table available athttp://systemsbiology.ucsd.edu/publications). In parallel, to identifythe promoter regions on a genome scale, the cells were treated withrifampicin to inhibit the transcription elongation step (17). TheRNAP-binding patterns obtained from rifampicin-treated cells (seetable available at http://systemsbiology.ucsd.edu/publications) showsimilar patterns between the 2 conditions; however, those fromrifampicin nontreated cells indicate differential RNAP binding,representing the genome-wide rearrangement of RNAP caused bythe exogenous leucine. Fig. 3 shows examples of the differentialbindings of RNAP and Lrp on the gcvTHP, serA, oppABCDF, andsdaA operons in response to exogenous leucine. For example, thedecrease in RNAP occupancy at the serA locus was observedbecause of the exogenous leucine. At the same time, the conse-quence of leucine addition was the sharp reduction of the Lrp-binding levels at the promoter region of serA. In this case, the roleof Lrp is that of an activator and the exogenous leucine serve as anantagonist to repress the expression of the serA (Fig. 3A). However,exogenous leucine represses the binding of Lrp, whose role is thatof an inhibitor of RNAP binding to the oppABCDF and sdaAoperons (a repressor in these cases), so that the transcription of thegenes in the operons becomes induced in response to the exogenousleucine (Fig. 3 B and C).Measurements of changes in mRNA transcripts. Next, we examined theeffects of exogenous leucine on the changes in the mRNA transcriptlevels of the exponentially growing cells using Affymetrix Gene-Chip E. coli Genome 2.0 arrays. To begin with, 629 genes weredifferentially expressed in response to the exogenous leucine with

Fig. 1. Overview of the method. (A) Comprehensive establishment of theLrp-binding regions along with the changes in RNAP occupancies and mRNAtranscript levels on a genome-scale to determine the regulatory mode for eachof the identified Lrp-binding regions under leucine-perturbed growth condi-tions. (B) Determination of regulatory mode for individual ORFs governed byLrp. (C) Reconstruction of regulatory network motif to identify logical motifstructures composed of 2 feedback loops for transporting and metabolizingsmall molecules. (D) Understanding physiological behavior of regulatory net-work motifs.

Cho et al. PNAS � December 9, 2008 � vol. 105 � no. 49 � 19463

MIC

ROBI

OLO

GY

Dow

nloa

ded

by g

uest

on

Aug

ust 1

3, 2

020

Page 3: Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli Byung-Kwan Cho, Christian L. Barrett,

P value �0.05 and log2 ratio �0.5 (see table available at http://systemsbiology.ucsd.edu/publications). Then, we compared thechanges in mRNA transcript levels with the differential RNAP-binding levels. The differential RNAP-binding levels are defined asthe difference between the sums of RNAP-binding levels across allprobes in a targeted gene’s ORF under the 2 conditions. In theprevious report (14), there was very little correlation between thelog2 ratio of RNAP-binding peaks obtained from the rifampicin-treated cells and the expression of the nearest downstream ORF.However, we observed that the changes in RNAP-binding levels arecorrelated with the changes in the mRNA transcript levels inresponse to the exogenous leucine (Fig. S3).

Given that Lrp differentially binds at least 91 promoter regions(�194 genes) within exponentially growing cells in the absence andpresence of exogenous leucine, we expected the deletion of lrp toresult in a substantial effect on the global gene expression patterns

during the exponential growth phase in the absence and presenceof exogenous leucine. Statistical analysis of the gene expressionprofiles of a parental (MG1655) and an lrp deletion strains basedon the 2-way ANOVA analysis (P value �0.005) shows that the Lrpdirectly and indirectly regulates �52% (�330 genes) of differen-tially expressed genes in response to the exogenous leucine (seetable available at http://systemsbiology.ucsd.edu/publications). Thegenes whose transcription levels we assumed to be controlled by thepromoters directly bound by Lrp are summarized in Table S3.

Step 2: Regulatory Modes of Lrp in Response to Exogenous Leucine.The most interesting aspect of the regulatory modes of Lrp inresponse to exogenous leucine is its variable response; in somecases, exogenous leucine has no effect on the action of Lrp; forothers, it increases the effect of Lrp, and for others it reverses theeffect of Lrp (7). We reconfirmed by qPCR the changes in the

Fig. 2. Genome-wide distribution of Lrp-binding regions. (A) An overview of Lrp-binding profiles across the E. coli genome at exponential growth phase inthe presence (i) or absence (ii) of leucine and at stationary growth phase (iii). Enrichment fold on the y axis was calculated from Cy5 (IP-DNA) and Cy3 (mockIP-DNA) signal intensity of each probe and plotted against each location on the 4.64 Mb E. coli genome. (B) Overlaps between Lrp-binding regions of exponentialphase in the absence and presence of leucine, and stationary phase. (C) Comparison of the Lrp-binding regions obtained from this study (ChIP-chip) with theliterature information. (D) Sequence logo representation of the learned Lrp-DNA binding profile.

Fig. 3. Genome-wide rearrangement of RNAP occupancy. Upper and Lower show changes in RNAP- and Lrp-binding signals on the selected regions, (A) gcvTHPand serA, (B) oppABCDF, and (C) sdaA, respectively. Green (solid and dotted) lines and red (solid and dotted) lines in both Upper and Lower indicate theexponential growth condition in the absence and presence of the exogenous leucine, respectively. The dotted red and green lines in Upper indicate the RNAPbinding on the promoter regions obtained from the rifampicin-treated cells. The triangle indicates the Lrp and RNAP binding location.

19464 � www.pnas.org�cgi�doi�10.1073�pnas.0807227105 Cho et al.

Dow

nloa

ded

by g

uest

on

Aug

ust 1

3, 2

020

Page 4: Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli Byung-Kwan Cho, Christian L. Barrett,

mRNA transcript levels of the selected Lrp-binding regions (36regions) to use genome-wide expression profiling to define theregulatory modes of Lrp in response to exogenous leucine (Fig. S3).The results agreed well with the gene expression profiling data. Fig.4A illustrates examples of changes in the levels of Lrp associationat the promoters of gcvTHP, ftsQAZ, fimAICDFGH, livKHMGF,serA, and oppABCDF and changes in the levels of mRNA tran-scripts of the genes under the presence (L) and absence (E) ofexogenous leucine. The Lrp associations at the promoter regions ofgcvTHP and ftsQAZ were not strongly affected by the addition ofexogenous leucine (Fig. 4A i and ii), strongly supporting theobservation that exogenous leucine has no effect on the transcrip-tional regulatory action of Lrp at these promoters. To confirm theunique transcriptional regulation mediated by Lrp, we next mea-sured the level of mRNA transcripts of gcvTHP and ftsQAZ operonby qPCR. As expected, the addition of exogenous leucine had noeffect on the level of mRNA transcripts (Fig. 4A i and ii). Therefore,the first category of Lrp regulatory modes was denoted by theindependent mode as shown in Fig. 4Bi. In this category, there areftsQAZ and gcvTHP operons. In contrast, the Lrp associations at thepromoter regions of fimAICDFGH and livKHMGF, which areknown to be activated and repressed by Lrp, respectively, showedslight changes in the Lrp-binding levels in the presence of exogenousleucine (18, 19). However, the exogenous leucine strongly stimu-lated the changes in the levels of mRNA transcript levels of thoseoperons (Fig. 4A iii and iv). Because the exogenous leucinestimulates the effect of Lrp bindings on the promoters, the secondcategory was denoted by the concerted mode (Fig. 4Bii). Last, formost Lrp-regulated promoters identified (74 regions), the exoge-nous leucine relieved the effect of Lrp in vivo (Fig. 4Biii). Forexample, Lrp activates and represses the transcription of serA andoppABCDF operon in the absence of exogenous leucine, respec-tively (20, 21), and exogenous leucine caused the strong repressionof the Lrp-bindings on those promoter regions (Fig. 4A v and vi).This third category was denoted by the reciprocal mode.

Step 3: Lrp-Regulated Feedback Loop Motifs. We next functionallyclassified the �236 genes directly regulated by Lrp and found thatthe products of many of the genes are known or predicted tofunction in a wide range of cellular processes and molecular

functions, which included amino acid biosynthesis, amino aciddegradation, nutrient transport, and synthesis of fimbriae (Fig. S4).We noticed that the functions of 45% (�118 genes) of those genesare mainly localized to the small molecule transport and metabo-lism. Fig. 5A shows the integrative analysis of Lrp-binding profiles,RNAP occupancy profiles, and mRNA transcript levels of theselected transporters and metabolic enzymes for amino acids. In thecase of transporters, the concerted and reciprocal regulatory modesexist in response to the exogenous leucine. Specifically, the trans-porters for branched-chain amino acids (brnQ, ilvKHMGF, andilvJHMGF) are regulated by concerted mode (Fig. 4Biv), whereastransporters for arginine, serine, alanine, and proline (artMQIP,sdaC, cycA, and proY, respectively) regulated by reciprocal mode(Fig. 4vi). Interestingly, Lrp reciprocally regulates the aromaticamino acid transporters (tyrP and mtr) and indirectly regulates thegeneral aromatic amino acid transporter (aroP) via TyrR transcrip-tion factor (i.e., Lrp directly regulates the TyrR by reciprocal mode).However, Lrp regulates the metabolic enzymes such as ilvE, tdcB,sdaA, and dadA through only reciprocal mode.

Step 4: Elucidation of the Function of the LRP Regulon. From thisanalysis, we were able to connect transport (Uptake) and metabolic(Use) feedback loop pairs (Fig. 5Bi and Table S4) and characterizethem by 1 of 4 possible combinations of feedback loop motifs (1).In the left loop, Lrp regulates transcription of the transport protein(T), facilitating the uptake of the small molecule (Xout), whereas inthe right loop, Lrp controls transcription of metabolic enzymes (E)responsible for transforming Xin into Y (i.e., metabolites). The logicof the coupled loop motifs can be described by a notation that uses2 signs (Fig. 5Bii). For example, the AA(�/�) loop motif indicatesthat the transcription of both transport and metabolic genes areactivated, whereas the RR(�/�) motif demonstrates that thetranscription of both genes are repressed. The possible logicalstructures of the feedback loop motifs can be characterized de-pending on how Lrp (or Lrp-leucine complex) activates or repressesboth T and E in response to the exogenous leucine: homeostasis(�/�), nutrition (�/�), flow homeostasis (�/�), and accumulation(�/�) (1). Based on the feedback loop motifs, we analyzed thebehavior of logical structures of the transporters and metabolicenzymes in response to the exogenous leucine (Table S4). For

Fig. 4. Regulatory modes of individual ORFs governed by Lrp in response to exogenous leucine. (A) Examples of independent mode [(I) gcvTHP and (II) ftsQAZ],concerted mode [(III) fimAICDFGH and (IV) livKHMGF], and reciprocal mode [(V) serA and (VI) oppABCDF]. Multiple lines indicate the Lrp-binding signals fromthe biological replicates. Green and red lines indicate the exponential growth condition in the absence and presence of the exogenous leucine, respectively. (B)Classification of the Lrp regulatory modes based on the gene expression profiling and location analysis of Lrp and RNAP.

Cho et al. PNAS � December 9, 2008 � vol. 105 � no. 49 � 19465

MIC

ROBI

OLO

GY

Dow

nloa

ded

by g

uest

on

Aug

ust 1

3, 2

020

Page 5: Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli Byung-Kwan Cho, Christian L. Barrett,

example, there are 3 possible T-E combinations between branched-chain amino acids transporters (brnQ, livKHMGF, and livJHMGF)and 1 metabolic enzyme (ilvE) in E. coli. The combined feedbackloop motifs for the branched-chain amino acids indicate theRR(�/�) logical structure in the absence of the exogenous leucine,whereas those are switched to RA(�/�) in exposing to exogenousleucine (Fig. 5C). However, the combined feedback loop motifs inthe T-E combinations for the aromatic amino acids show thetransition between RA(�/�) and AR(�/�) logical structures. Inthe end, we classified the logical structures of the feedback loopmotifs into 3 categories based on the behavior of logical structuresin response to the exogenous leucine (Fig. 5C).

DiscussionWe reconstructed the Lrp regulon in E. coli by combining genome-scale location analysis, promoter-profiling using rifampicin-treatedcells, RNAP occupancy profiles, and gene expression data. Weidentified (i) regulatory modes for individual genes in the Lrpregulon, (ii) the behavior of the logical structures in the Lrp-regulatory feedback loop motifs composed of transporters andmetabolic enzymes in response to the exogenous leucine, and (iii)the overall structure of the Lrp regulon and how it regulates themetabolism of families of amino acids and other metabolites.

The genome-wide maps of Lrp-binding regions presented herenot only confirm the previously characterized Lrp-binding regions(�17 regions) but also increase the number of known sites (�138regions) by a factor of 5. From the genome-wide mapping results,we were also able to show that: (i) A total of 138 Lrp-binding regionswere identified, �84% of which were located within noncodingregions, whereas the remaining �16% were found within codingregions; (ii) 34 and 92 Lrp-binding regions were identified inexponentially growing cells in the presence and absence of exoge-nous leucine, respectively, indicating that Lrp bindings to the E. coligenome are dramatically sensitive to the addition of exogenousleucine; and (iii) the Lrp-binding sites on the E. coli genome understationary growth condition (134 Lrp-binding regions identified)indicated that Lrp plays pivotal roles in the transcriptional regula-tion under stationary growth conditions. The high number ofLrp-binding regions was not surprising, given that global transcrip-

tion factors such as Fnr, Crp, Ihf, Fis, and Hns specifically bind tobetween �63 and �224 target regions (22–24).

The genome-wide location analysis showed Lrp binding at 17intragenic regions within ORFs and at 2 intergenic regions adjacentto convergently transcribed genes where current genomic annota-tion did not indicate a possible promoter. These Lrp-bindingregions may indicate that genomic features have not been yetdiscovered, such as promoters of novel genes, or that transcriptionfactor–DNA interactions occur by chance and have not beenremoved by evolution (25). As more experiments of this type areperformed and as more functions are assigned to the gene productsof hypothetical ORFs, an even clearer picture of the Lrp-bindingregions in E. coli will emerge. It is indicative of the power of theChIP-chip approach that four-fifths of the Lrp-binding regionsobserved here were previously unknown. Furthermore, the physi-ological functions of nearly one-third of directly Lrp-regulatedgenes are currently unknown, most of which (�98%) are likely tobe regulated by Lrp under the stationary growth condition. There-fore, much of this regulation can be understood by the principle of‘‘feast or famine’’ adaptation for survival in nutrient-rich or de-pleted environments (16). Although the physiological functions ofthe genes are currently unknown, the results presented here supportprevious suggestions that the physiological role of Lrp is to monitorthe nutritional state of the cell to adjust its metabolism to changingnutritional conditions and, in cooperation with other regulatorynetworks such as alternative sigma factor �S, to coordinate thesechanges with the physical environment of the cell.

In general, the global transcription factors located at the higherlevel of hierarchy in the transcriptional regulatory network havemany direct regulatory targets that are transcription factors locatedin the lower levels of the transcriptional regulatory network hier-archy (26). We have identified 11 such transcription factors that areunder Lrp control. They are DhaR, CysB, TyrR, SlyA, EutR, TdcA,GadW, LrhA, DeoT, YkgK, and Yhe, the latter 2 of which arepredicted regulatory proteins. Most of these transcription factorsparticipate in the regulation of the amino acid metabolism andsmall molecule transport. These regulatory proteins altogether areknown to regulate the transcription of at least 34 genes. There arelikely to be additional genes indirectly regulated by Lrp, such as

Fig. 5. Regulatory network motif reconstruction and the behavior of feedback loop motifs in response to the exogenous leucine. (A) Data integration betweenLrp binding, RNAP occupancy, and gene expression profiles on the selected Lrp-regulated target genes (transporters and metabolic enzymes). (B) Schematicdiagram for the regulatory network motif reconstruction in feedback loop. (C) Behavior of the feedback loop motifs.

19466 � www.pnas.org�cgi�doi�10.1073�pnas.0807227105 Cho et al.

Dow

nloa

ded

by g

uest

on

Aug

ust 1

3, 2

020

Page 6: Genome-scale reconstruction of the Lrp regulatory network in … · Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli Byung-Kwan Cho, Christian L. Barrett,

genes regulated by metabolites produced by the metabolic enzymesin the regulatory interaction. Therefore, the experimental datapresented here support the previous suggestion that the size of Lrpregulon is �10% of all ORFs in E. coli (16, 27).

The data integration of Lrp-binding information with the geneexpression profiles, promoter profiling using rifampicin-treatedcells, and RNAP occupancy profiles has enabled us to elucidate afuller mechanistic understanding of the differential Lrp-bindingprofiles in response to exogenous leucine. We were able to showthat (i) the differential Lrp-binding profiles in response to theexogenous leucine described 3 unique regulatory modes (indepen-dent, concerted, and reciprocal); (ii) the functional classification ofgenes directly regulated by Lrp represents the diverse roles of Lrpin the E. coli metabolism and the 45% of genes in Lrp regulon tothe small molecule transport and metabolism; (iii) the feedbackloop motifs composed of transporters and metabolic enzymes canbe reconstructed based on the unique regulatory modes governedby Lrp and the functional localization of the genes; and (iv) havingdescribed the behavior of the feedback loop motifs in response toexogenous leucine, we finally show that the 3 regulatory circuits forcontrolling small molecules uptake and utilization by the globaltranscription factor Lrp.

In summary, we have described an integrative analysis of varioustypes of genome-scale data to comprehensively understand thedesign principles of a global transcription factor, Lrp in E. coli. Inthe future, this systems approach will enable us to derive a similarunderstanding for how broad-acting transcription factors coordi-nate their activities to arrive at a functional organism.

Materials and MethodsBacterial Strains, Media, and Growth Conditions. E. coli BOP508 cells harboringLrp-8�myc, BOP�508 deleted for lrp, and MG1655 were grown in glucose (2 g/L)minimal M9 medium supplemented with or without 10 mM leucine. Glycerolstocksof theE.coli strainswere inoculated intotheminimalmediumandculturedat 37 °C with constant agitation overnight. The cultures were diluted 1:100 into50 mL of the fresh minimal medium and then cultured at 37 °C to an appropriatecell density. For the rifampicin-treated cells, the rifampicin dissolved in methanolwas added to a final concentration of 150 �g/mL and stirred for 20 min. Cultureswere monitored by OD600 nm to verify the inhibitory effects of rifampicin.

ChIP-Chip. Culturesatmidlogorstationarygrowthphasewerecross-linkedby1%formaldehyde at room temperature for 25 min. After cell lysis and sonication, thecross-linked DNA-Lrp and DNA-RNAP complexes were immunoprecipitated byusing antibody against myc-tag and RNA polymerase � subunit (rpoB), respec-tively, and Dynabeads Pan Mouse IgG magnetic beads (Invitrogen) followed bystringent washings (see SI Text for the detailed ChIP-chip protocol). After reversalof the cross-links by incubation at 65 °C overnight, the samples were treated byRNaseA (Qiagen) and proteaseK (Invitrogen) and then purified with a PCRpurification kit (Qiagen). To verify the enrichment of the Lrp-binding regions inthe DNA samples, 1 �L of IP or mock-IP DNA was used to perform gene-specificreal-time qPCR with the specific primers to the promoter regions. The IP andmock-IP DNA were then amplified by random DNA amplification method (14).Then, the amplified DNA samples were labeled and hybridized onto whole-genome-tiled microarrays (NimbleGen). Detailed methods used for ChIP-chipanalysis is described in SI Text.

Transcriptional Analysis. Cultures were grown to midlog growth phase aerobi-cally (OD A600� 0.6). The cultures (3 mL) were then added to 2 volumes ofRNAprotect Bacteria Reagent (Qiagen) and total RNA was isolated by usingRNAeasy columns (Qiagen) with DNaseI treatment. Total RNA yields were mea-sured by using a spectrophotometer (A260) and quality was checked by visualiza-tion on agarose gels and by measuring the sample A260/A280 ratio (�1.8). cDNApreparation was performed as described in ref. 13. Each qPCR contained 0.5 �Mof each forward and reverse primer (SI Text for the detailed qPCR protocol), 150ng of cDNA, and 25 �L of SYBR Master Mix (Qiagen). Affymetrix GeneChip E. coliGenome 2.0 arrays were used for genome-scale transcriptional analyses. cDNAsynthesis, fragmentation, end-terminus biotin labeling, and array hybridizationwere performed as recommended by Affymetrix standard protocol.

Data Analysis. To identify enriched regions in the ChIP-chip data, we used thepreviously developed peak-finding algorithm (15). For the Lrp-binding site pro-file, thedetailsofthemanner inwhichweidentifiedhigh-probabilityLrp-bindingregions using ChIP-chip peaks and of the algorithm that we developed to learnthe Lrp DNA-binding signals are contained in SI Text.

Raw Experimental Data. SI Text and all raw data files can be downloaded fromhttp://systemsbiology.ucsd.edu/publications.

ACKNOWLEDGMENTS. Thisworkwas supportedbyNational InstitutesofHealthGrant GM062791 and the Office of Science (BER), U. S. Department of Energy,cooperative agreement DE-FC02-02ER63446.

1. Krishna S, Semsey S, Sneppen K (2007) Combinatorics of feedback in cellular uptakeand metabolism of small molecules. Proc Natl Acad Sci USA 104:20815–20819.

2. Calvo JM, Matthews RG (1994) The leucine-responsive regulatory protein, a globalregulator of metabolism in Escherichia coli. Microbiol Rev 58:466–490.

3. Yokoyama K, et al. (2006) Feast/famine regulatory proteins (FFRPs): Escherichia coli Lrp,AsnC and related archaeal transcription factors. FEMS Microbiol Rev 30:89–108.

4. Newman EB, Lin R (1995) Leucine-responsive regulatory protein: a global regulator ofgene expression in E. coli. Annu Rev Microbiol 49:747–775.

5. Landgraf JR, Wu J, Calvo JM (1996) Effects of nutrition and growth rate on Lrp levelsin Escherichia coli. J Bacteriol 178:6930–6936.

6. Faith JJ, et al. (2007) Large-scale mapping and validation of Escherichia coli transcrip-tional regulation from a compendium of expression profiles. PLoS Biol 5:e8.

7. Lin R, D’Ari R, Newman EB (1992) Lambda placMu insertions in genes of the leucineregulon: Extension of the regulon to genes not regulated by leucine. J Bacteriol174:1948–1955.

8. Cui Y, Wang Q, Stormo GD, Calvo JM (1995) A consensus sequence for binding of Lrpto DNA. J Bacteriol 177:4872–4880.

9. Marasco R, et al. (1994) In vivo footprinting analysis of Lrp binding to the ilvIHpromoter region of Escherichia coli. J Bacteriol 176:5197–5201.

10. Stauffer LT, Stauffer GV (1994) Characterization of the gcv control region fromEscherichia coli. J Bacteriol 176:6159–6164.

11. Stauffer LT, Fogarty SJ, Stauffer GV (1994) Characterization of the Escherichia coli gcvoperon. Gene 142:17–22.

12. Ernsting BR, Denninger JW, Blumenthal RM, Matthews RG (1993) Regulation of thegltBDF operon of Escherichia coli: How is a leucine-insensitive operon regulated by theleucine-responsive regulatory protein? J Bacteriol 175:7160–7169.

13. Cho BK, Knight EM, Palsson BO (2006) PCR-based tandem epitope tagging system forEscherichia coli genome engineering. BioTechniques 40:67–72.

14. Herring CD, et al. (2005) Immobilization of Escherichia coli RNA polymerase andlocation of binding sites by use of chromatin immunoprecipitation and microarrays. JBacteriol 187:6166–6174.

15. Zheng M, Barrera LO, Ren B, Wu YN (2007) ChIP-chip: Data, model, and analysis.Biometrics 63:787–796.

16. Tani TH, et al. (2002) Adaptation to famine: A family of stationary-phase genesrevealed by microarray analysis. Proc Natl Acad Sci USA 99:13471–13476.

17. Campbell EA, et al. (2001) Structural mechanism for rifampicin inhibition of bacterialRNA polymerase. Cell 104:901–912.

18. Kelly A, et al. (2006) DNA supercoiling and the Lrp protein determine the direc-tionality of fim switch DNA inversion in Escherichia coli K-12. J Bacteriol 188:5356 –5363.

19. Haney SA, Platko JV, Oxender DL, Calvo JM (1992) Lrp, a leucine-responsive protein,regulates branched-chain amino acid transport genes in Escherichia coli. J Bacteriol174:108–115.

20. Platko JV, Willins DA, Calvo JM (1990) The ilvIH operon of Escherichia coli is positivelyregulated. J Bacteriol 172:4563–4570.

21. Rex JH, Aronson BD, Somerville RL (1991) The tdh and serA operons of Escherichia coli:Mutational analysis of the regulatory elements of leucine-responsive genes. J Bacteriol173:5944–5953.

22. Grainger DC, et al. (2005) Studies of the distribution of Escherichia coli cAMP-receptorprotein and RNA polymerase along the E. coli chromosome. Proc Natl Acad Sci USA102:17693–17698.

23. Grainger DC, Hurd D, Goldberg MD, Busby SJ (2006) Association of nucleoid proteinswith coding and non-coding segments of the Escherichia coli genome. Nucleic AcidsRes 34:4642–4652.

24. Grainger DC, et al. (2007) Transcription factor distribution in Escherichia coli: Studieswith FNR protein. Nucleic Acids Res 35:269–278.

25. Shimada T, Ishihama A, Busby SJ, Grainger DC (2008) The Escherichia coli RutR tran-scription factor binds at targets within genes as well as intergenic regions. Nucleic AcidsRes 36:3950–3955.

26. Martinez-Antonio A, Collado-Vides J (2003) Identifying global regulators in transcrip-tional regulatory networks in bacteria. Curr Opin Microbiol 6:482–489.

27. Hung SP, Baldi P, Hatfield GW (2002) Global gene expression profiling in Escherichiacoli K12. The effects of leucine-responsive regulatory protein. J Biol Chem 277:40309–40323.

Cho et al. PNAS � December 9, 2008 � vol. 105 � no. 49 � 19467

MIC

ROBI

OLO

GY

Dow

nloa

ded

by g

uest

on

Aug

ust 1

3, 2

020