Comparative genomics analysis of NtcA regulons in cyanobacteria: Regulation of nitrogen assimilation and its coupling to photosynthesis
Wen-Ting Huang, Jau-Chi Huang
Zhengchang Su, Victor Olman, Fenglou Mao and Ying Xu
Outline
• Introduction
• Method
• Result
• Conclusion
Introduction
• DNA transcription
• mRNA
[Figure: double-stranded DNA with 5’ and 3’ ends, showing coding regions and an intron; upstream and downstream are marked relative to the transcription direction. Labels: regulatory region, regulatory elements, RNA polymerase binding site, RNA polymerase.]
• cis-regulation
– Regulatory regions of genes and the regulated genes are on the same chromosome
• Phylogenetic footprinting
– Identifies regulatory elements by finding conserved regions in a set of orthologous non-coding DNA sequences from multiple species.
• Cyanobacteria
– Bacteria
– Live in the water
– Gram-negative, oxygenic phototrophs
– Nitrogen control in cyanobacteria is mediated by NtcA
http://www.ucmp.berkeley.edu/bacteria/cyanointro.html
• NtcA
– A protein which regulates the assimilation of nitrogen.
• NtcA binding site
– Base motif “GTAN8TAC”
– ~14 bp
– Non-coding (intergenic) region
– Nitrogen fixation related genes
– The 31 bp downstream of the site often contain a –10 σ70-like box “TAN3T”
High false positive rate
• Too short to identify
• 3 methods:
– Coding region
– –10 like box
– Orthologous genes
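A rough estimate shows why such a short motif produces many chance hits. The sketch below assumes a uniform random genome (each base with probability 1/4) and a hypothetical genome length of 3 Mb; neither number comes from the slides. GTAN8TAC fixes only 6 of its 14 positions, and because the motif is its own reverse complement, one strand is enough to count.

```python
def expected_chance_matches(genome_length: int, fixed_bases: int, motif_length: int) -> float:
    """Expected number of exact motif matches in a uniform random sequence."""
    positions = genome_length - motif_length + 1   # possible start positions
    per_position = 0.25 ** fixed_bases             # chance that all fixed bases match
    return positions * per_position


# Hypothetical 3 Mb genome; GTAN8TAC has 6 fixed bases over 14 positions.
estimate = expected_chance_matches(3_000_000, fixed_bases=6, motif_length=14)
print(f"expected chance matches: {estimate:.0f}")
```

Under these assumptions the motif alone would match several hundred times by chance, which is why extra evidence is needed.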
Materials
• Nine sequenced cyanobacterial genomes were downloaded from GenBank.
• ftp.ncbi.nih.gov/genomes/Bacteria/
Method
• Step 1:
– Prepare training sets
– Get the profiles (GTAN8TAC, TAN3T)
• Step 2:
– Scan genomic sequences and score each motif.
• Step 3:
– Decide the cutoff.
Known
• Possible NtcA binding sites (GTAN8TAC)
– Appear in the upstream intergenic regions
– In many cases, there is a –10 like box (TAN3T) in the 31bp downstream regions of the NtcA binding site
[Figure: the upstream region of a transcription unit, with the 31 bp stretch downstream of the NtcA binding site marked.]
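The two known signatures can be turned into a naive pattern search. This is a minimal sketch using exact consensus matching, not the profile-based scoring the authors use later; the function name and the toy sequence are made up for illustration.

```python
import re

# A lookahead is used so that overlapping candidate sites are all reported.
NTCA_SITE = re.compile(r"(?=(GTA[ACGT]{8}TAC))")
MINUS10_BOX = re.compile(r"TA[ACGT]{3}T")


def find_candidate_sites(upstream: str, window: int = 31):
    """Return (start, site, box_or_None) for every GTAN8TAC match.

    box is the first TAN3T match inside the `window` bp that follow the site.
    """
    seq = upstream.upper()
    hits = []
    for match in NTCA_SITE.finditer(seq):
        start = match.start()
        site = match.group(1)
        site_end = start + len(site)
        downstream = seq[site_end:site_end + window]
        box = MINUS10_BOX.search(downstream)
        hits.append((start, site, box.group(0) if box else None))
    return hits


# Toy upstream sequence (invented, not from any genome).
toy = "CC" + "GTAACGTACGTTAC" + "G" * 10 + "TAGGCT" + "CCCC"
print(find_candidate_sites(toy))
```

Exact matching like this is what yields the high false positive rate, so it is only useful as a first filter.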
Prepare training sets
• They chose 11 genes which are known to be regulated by NtcA from the nine cyanobacterial genomes.
• They used phylogenetic footprinting and identified 51 putative NtcA binding sites.
• These 51 sites constitute the training set A1 for the NtcA binding site.
• The 31 bp regions downstream of these sites were then searched for a –10 like box; the boxes found form the training set B1.
A1 & B1
A2 & B2
• They collected 12 experimentally verified NtcA binding sites, together with their downstream regions, from seven other cyanobacteria.
• They also included the sites that phylogenetic footprinting failed to find.
Profiles
• They combined A1 and A2 to construct the profile of NtcA binding sites.
Profiles
• They combined B1 and B2 to construct the profile of –10 like boxes.
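A profile here can be read as a table of per-position base frequencies computed from the aligned training sites. The sketch below is one common way to build such a table; the pseudocount and the toy sites are assumptions, since the slides do not say how the authors handled unseen bases.

```python
from collections import Counter

BASES = "ACGT"


def build_profile(sites, pseudocount=1.0):
    """Column-wise base frequencies p[i][b] from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    profile = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sites)
        # The pseudocount keeps every frequency above zero (assumed, not from the slides).
        profile.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return profile


# Toy -10 like boxes (invented) standing in for the merged sets B1 and B2.
toy_boxes = ["TAAAAT", "TATTTT", "TACGCT"]
for column in build_profile(toy_boxes):
    print({b: round(p, 2) for b, p in column.items()})
```

Keeping all frequencies positive matters later, because the scoring step takes logarithms of them.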
Scan genomic sequences
[Figure: the upstream region between two transcription units, with example sequences:]
GTAAAGTTAAGTTCCTTCAAAGCATTCGTGG
TTAAAGTTAAGTTCTTTTAAAGCTTTCGTGG
$$S_{M_1}(t) = \max_{h \subset t} \sum_{i=1}^{l} I_i \ln \frac{p[i, h(i)]}{q[h(i)]}$$
Scan genomic sequences
[Figure: the upstream region between two transcription units, with example sequences:]
GTAAAGTTAAGTTCCTTCAAAGCATTCGTGG
TTAAAGTTAAGTTCTTTTAAAGCTTTCGTGG
$$S_{M_2}(t) = \max_{h \subset t} \sum_{i=1}^{l} I_i \ln \frac{p[i, h(i)]}{q[h(i)]}$$
The scoring functions
$$S_{M_1 \ldots M_z}(t) = \sum_{j=1}^{z} S_{M_j}(t)$$
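The scoring functions above can be sketched in a few lines. The slides do not define the symbols, so the reading used here is an assumption: l is the profile length, h is any length-l window of the sequence t, p[i, b] is the profile frequency of base b at position i, q[b] is the background frequency of b, and I_i is the information content of column i, taken here as its relative entropy against the background. All profile frequencies must be positive (for example via pseudocounts).

```python
import math


def information_content(column, background):
    """Relative entropy of one profile column against the background (assumed form of I_i)."""
    return sum(p * math.log(p / background[b]) for b, p in column.items() if p > 0)


def profile_score(profile, background, t):
    """S_M(t): best weighted log-odds score over all length-l windows h of t."""
    l = len(profile)
    if len(t) < l:
        raise ValueError("sequence is shorter than the profile")
    weights = [information_content(col, background) for col in profile]
    best = -math.inf
    for start in range(len(t) - l + 1):
        h = t[start:start + l]
        score = sum(
            weights[i] * math.log(profile[i][h[i]] / background[h[i]])
            for i in range(l)
        )
        best = max(best, score)
    return best


def combined_score(profiles, background, t):
    """S_{M_1...M_z}(t): sum of the individual profile scores."""
    return sum(profile_score(m, background, t) for m in profiles)


# Toy two-column profile favouring "AT" (invented numbers).
background = {b: 0.25 for b in "ACGT"}
toy_profile = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
print(profile_score(toy_profile, background, "CCATCC"))
```

Weighting each column by its information content lets the conserved positions (GTA and TAC, or TA and T) dominate the score while the N positions contribute little.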
Orthologous genes
• The presence of similar motifs in the regulatory regions of the orthologous genes can increase the prediction accuracy.
• They predicted two genes in two genomes to be orthologous to each other if they are a pair of reciprocal best hits in BLASTP searches.
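The reciprocal-best-hit rule can be sketched as follows. The input format is an assumption: each hit is a (query, subject, bit_score) tuple already parsed from BLASTP output, and the gene names are invented.

```python
def best_hits(hits):
    """Map each query gene to its highest-scoring subject gene."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy hits between genome A and genome B (invented scores).
a_vs_b = [("a1", "b1", 200), ("a1", "b2", 50), ("a2", "b2", 120), ("a3", "b1", 90)]
b_vs_a = [("b1", "a1", 198), ("b2", "a2", 118), ("b2", "a1", 40)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data a3 is not paired with b1, because b1's own best hit is a1.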
Orthologous genes
[Figure: the upstream region of a transcription unit.]
Cutoff
• For each genome, the cutoff is the largest score that still includes all the binding sites from that genome in the training sets.
• P-value
– p[S(CU) > sc] < 0.01 or 0.05
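Both cutoff criteria can be sketched briefly. The largest cutoff that keeps every training site is simply the lowest training score. For the P-value, the slides do not say how the null distribution was built, so the sketch assumes a list of scores from background sequences is already available; all numbers are invented.

```python
def training_cutoff(training_scores):
    """Largest cutoff that still retains every training site: the minimum score."""
    if not training_scores:
        raise ValueError("need at least one training score")
    return min(training_scores)


def empirical_p_value(null_scores, cutoff):
    """Fraction of null (background) scores strictly above the cutoff."""
    if not null_scores:
        raise ValueError("need at least one null score")
    return sum(score > cutoff for score in null_scores) / len(null_scores)


# Toy scores (invented) for the training sites of one genome and for background sequences.
sc = training_cutoff([7.5, 9.1, 8.2])
p = empirical_p_value([1, 2, 3, 8, 4, 5, 6, 7, 0, 9], sc)
print(sc, p)
```

A cutoff would then be accepted only if the estimated P-value falls below the chosen threshold (0.01 or 0.05).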
Analysis
“GTA________TAC” “TA___T”
Niche of NtcA in cyanobacteria … ?
• Some genes bearing NtcA promoters might coordinate photosynthesis and nitrogen fixation.
• The RNA polymerase σ-factor in cyanobacteria might bear an NtcA promoter and be regulated by NtcA.
Conclusion
• The false positive rate is reduced by 8.2- to 90.9-fold.
• Some binding sites might be missed due to the lack of orthologues in the other genomes.
• NtcA promoters are found for many genes involved in various stages of photosynthesis.
Thank You