Enhanced maps of transcription factor binding sites …...2019/07/25 · 85 note that not all...
Transcript of Enhanced maps of transcription factor binding sites …...2019/07/25 · 85 note that not all...
1
Short title: Comparing motif mappers for network inference 1
2
To whom correspondence should be addressed. Tel: +32 9 3313822; Fax: +32 9 3313809; Email: 3
5
Enhanced maps of transcription factor binding sites improve 6
regulatory networks learned from accessible chromatin data 7
8
Shubhada R. Kulkarni1,2,3, D. Marc Jones1,2,3 and Klaas Vandepoele1,2,3 9 10 1Ghent University, Department of Plant Biotechnology and Bioinformatics, Technologiepark 71, 9052 11
Ghent, Belgium 12 2VIB Center for Plant Systems Biology, Technologiepark 71, 9052 Ghent, Belgium 13 3Bioinformatics Institute Ghent, Ghent University, Technologiepark 71, 9052 Ghent, Belgium 14
15
One-sentence summary: Evaluation of transcription factor binding site mapping tools and a protocol to 16
map gene regulatory networks from open chromatin datasets 17
18
AUTHOR CONTRIBUTIONS 19
S.R.K., D.M.J., and K.V. designed the research, S.R.K., and K.V. performed the analyses and S.R.K., D.M.J., 20
and K.V. wrote the manuscript. 21
22
Plant Physiology Preview. Published on July 25, 2019, as DOI:10.1104/pp.19.00605
Copyright 2019 by the American Society of Plant Biologists
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
2
ABSTRACT 23
Determining where transcription factors (TFs) bind in genomes provides insight into which 24
transcriptional programs are active across organs, tissue types, and environmental conditions. Recent 25
advances in high-throughput profiling of regulatory DNA have yielded large amounts of information 26
about chromatin accessibility. Interpreting the functional significance of these datasets requires 27
knowledge of which regulators are likely to bind these regions. This can be achieved by using 28
information about TF binding preferences, or motifs, to identify TF binding events that are likely to be 29
functional. Although different approaches exist to map motifs to DNA sequences, a systematic 30
evaluation of these tools in plants is missing. Here we compare four motif mapping tools widely used in 31
the Arabidopsis (Arabidopsis thaliana) research community and evaluate their performance using 32
chromatin immunoprecipitation datasets for 40 TFs. Downstream gene regulatory network (GRN) 33
reconstruction was found to be sensitive to the motif mapper used. We further show that the low recall 34
of FIMO, one of the most frequently used motif mapping tools, can be overcome by using an Ensemble 35
approach, which combines results from different mapping tools. Several examples are provided 36
demonstrating how the Ensemble approach extends our view on transcriptional control for TFs active in 37
different biological processes. Finally, a protocol is presented to effectively derive more complete cell 38
type-specific GRNs through the integrative analysis of open chromatin regions, known binding site 39
information, and expression datasets. This approach will pave the way to increase our understanding of 40
GRNs in different cellular conditions. 41
42
43
44
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
3
INTRODUCTION 45
Plants are exposed to a wide variety of internal and external signals which need to be correctly 46
processed to facilitate growth and development and to trigger defense responses against environmental 47
stimuli. An important mechanism mediating these signal processing pathways is the control of gene 48
expression. Gene expression is regulated by transcription factors (TFs), proteins which often bind to 49
specific, short DNA sequences and influence gene expression. The identification of functional TF binding 50
is an important step in understanding the biological roles of these regulators. Regulatory links between 51
TFs and target genes together form a gene regulatory network (GRN), which can be used to understand 52
the dynamics of plant processes, such as diverse cellular functions, responses to various external stimuli, 53
and organ development (Song et al., 2016; Sparks et al., 2016; Varala et al., 2018). 54
55
An early and important step in the characterization of GRNs is understanding TF binding preferences, or 56
motifs, as determining potential binding locations of a TF within a genome assists identification of 57
putative target genes. Advancements in technologies that profile regulatory DNA have successfully 58
characterized the binding preferences of many plant TFs (reviewed in Franco-Zorrilla and Solano, 2017). 59
Protein binding microarrays (PBMs), a high throughput experimental technique, determine sequence 60
preferences of TFs by allowing fluorescently labeled proteins to bind to an array of oligonucleotides. 61
Using this technology, TF binding profiles were determined for 63 Arabidopsis (Arabidopsis thaliana) 62
TFs, representing 25 TF families, while Weirauch and co-workers identified motifs for >1000 TFs across 63
131 species (Franco-Zorrilla et al., 2014; Weirauch et al., 2014). Another in vitro assay, DNA affinity 64
purification sequencing (DAP-seq), combines in vitro expressed TFs with next-generation sequencing of a 65
genomic DNA library. Using this technique, binding profiles for 529 TFs in Arabidopsis have been 66
elucidated (O’Malley et al., 2016). In recent years, numerous TF chromatin immunoprecipitation (ChIP) 67
experiments have been performed, expanding our knowledge of TF binding in plants (Heyndrickx et al., 68
2014; Song et al., 2016). Collectively, these binding profiles offer an interesting resource to study TF 69
binding in the Arabidopsis genome for over 900 TFs (Kulkarni et al., 2018). 70
71
The simplest approach to delineate GRNs from these profiles is by naively mapping the TF motifs to the 72
nearest gene promoter. However, the high rate of false positives when mapping motifs to a DNA 73
sequence, especially if the motif is short and degenerate, results in low specificity to identify functional 74
regulatory events (Baxter et al., 2012). To overcome these issues, additional sources of evidence, such as 75
gene co-regulation or evolutionary sequence conservation, are frequently used to define functional 76
binding sites. Based on the hypothesis that a set of co-regulated genes are regulated by a similar cohort 77
of TFs, identification of over-represented sequences in the promoters of these genes can enrich for 78
functional true positives (Michael et al., 2008; Vandepoele et al., 2009; Hickman et al., 2017; Kulkarni et 79
al., 2018). An alternative approach involves filtering potential binding sites using conservation 80
information over large evolutionary distances. This method assumes that functionally important binding 81
sites will be under purifying selection, and as such, will be conserved between species. Filtering motif 82
matches using this metric substantially reduces the false positive rate (Vandepoele et al., 2006; Haudry 83
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
4
et al., 2013; Van de Velde et al., 2014; Burgess et al., 2015; Yu et al., 2015), although it is important to 84
note that not all functional binding events are evolutionary conserved (Muino et al., 2016). 85
86
Recent advances in the profiling of open chromatin have increased our understanding of regulatory DNA 87
in Arabidopsis (Zhang et al., 2012; Sullivan et al., 2014; Lu et al., 2017). Combined with cell type-specific 88
nuclear purification, methods such as Assay for Transposase‐Accessible Chromatin followed by DNA 89
sequencing (ATAC-Seq) offer unprecedented opportunities to identify cell type‐specific TF networks (Lu 90
et al., 2017; Maher et al., 2018; Sijacic et al., 2018). Nevertheless, elucidation of GRNs from chromatin 91
accessibility data requires detailed information about TF binding preferences in order to identify 92
potential binding sites within accessible regions of the genome, and therefore infer TF-target gene 93
regulatory interactions. 94
95
Based on the importance of motif mapping to find locations of potentially functional TF binding, in this 96
study we compared four frequently used motif mapping tools and performed a detailed evaluation of 97
their global performance for 40 TFs in Arabidopsis. We evaluated the similarities and differences 98
between these tools at a TF level and found that differences in tool sensitivity and specificity affect the 99
inference of GRNs. By combining the results from two tools into an Ensemble, we were able to improve 100
the identification of TF-target regulatory interactions in different experimental datasets. Using this 101
Ensemble approach we developed a protocol to elucidate cell type-specific GRNs from ATAC-Seq defined 102
accessible genomic regions. The results of this analysis, relative to the original study, offer a more 103
complete view on gene regulation in shoot apical meristem (SAM) stem and mesophyll cells in 104
Arabidopsis. 105
106
RESULTS 107
Performance of individual motif mapping tools to identify in vivo binding events 108
A wide variety of tools are used in the plant research community to map TF motifs (Supplemental Table 109
S1). We selected and evaluated four frequently used tools to map TF motifs in Arabidopsis: FIMO, 110
Cluster-Buster (CB), matrix-scan (MS), and MOODS (Frith et al., 2003; Turatsinze et al., 2008; Korhonen 111
et al., 2009; Grant et al., 2011). These tools were used to map a set of 66 motifs (corresponding to 40 112
TFs and 19 TF families; see Supplemental Table S2 for TF motif details) onto the Arabidopsis genome 113
(see "Materials and Methods"). The motifs, mainly derived from protein binding microarrays and DAP-114
Seq, were selected based on the availability of experimental ChIP-Seq datasets for the profiled TF. The 115
set of TFs included in this analysis have diverse roles in processes such as the cell cycle, flower 116
development, response to light or hormones, and defense responses. Motif matches (referred to as TF 117
binding sites; TFBSs) reported by the different tools were evaluated by counting the number of TFBSs 118
confirmed by ChIP-Seq datasets (precision). Recall for each tool was calculated as the fraction of regions 119
identified by ChIP-Seq that were covered by a motif match, that is, how many target genes are correctly 120
recovered by motif matches (median values of performance statistics in Table 1). FIMO produced the 121
lowest number of motif matches (2.4 million matches versus 19-34 million matches for the other tools) 122
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
5
and showed the highest precision among all tools. The median precision for FIMO is 5%, compared to 123
2.2-2.4% for the other tools (Figure 1A), indicating that FIMO reports a higher fraction of experimentally 124
supported matches. However, recall is low with the FIMO results as a consequence of the tool predicting 125
approximately 10-fold fewer matches relative to the other tools (22% median recall versus 36-48% for 126
the other tools; Figure 1B). Overall, these results suggest that FIMO misses some real TFBSs based on 127
the ChIP-Seq data, considering all matches. Due to the large variation in the total number of matches 128
predicted by each tool, we also evaluated the tool performance considering only the 7000 highest 129
scoring matches (top7000). The size of this subset was chosen to optimize the compromise between 130
precision and recall for CB (see "Materials and Methods"). Using this subset of matches, the median 131
precision and recall of all tools are similar (Figure 1A and 1B). In order to assess the false positive rate 132
(FPR) for each tool, TF motifs were mapped using shuffled promoter sequences of Arabidopsis genes 133
(see "Materials and Methods"). Due to its stringency, FIMO has the lowest FPR (Figure 1C), while 134
MOODS, which identifies the highest number of matches, has the highest FPR compared to the other 135
tools. Together with the above results, this suggests that many of the matches identified by MOODS are 136
false positives. Overall, the FPR for all tools was below 10%. 137
138
Following the evaluation of mapping tool performances, we next studied the effect of TF motif 139
complexity on the precision and recall values, using the information content (IC) of each motif. Given the 140
clustering pattern in Supplemental Figure S1, all matches predicted by FIMO and CB were considered for 141
this analysis. To examine the effect of motif complexity on the performance measures, 21 TFs were 142
selected for which more than one motif was available. For CB, for 15 TFs, the F1 score increased with 143
increasing motif complexity (Figure 2). For FIMO, however, this trend was observed for 8 TFs only. FIMO, 144
besides implementing a p-value threshold for calling motif matches, has an internal cut-off to restrict 145
spurious matches when used with low complexity motifs. This additional threshold is likely responsible 146
for the quality of the TFBSs found with FIMO being less dependent upon TF motif complexity. Of the TFs 147
selected for the above analysis, 13 had motifs from different sources, such as CisBP and DAP-Seq. We 148
next checked if the source of motifs had an impact on the performance measures. CisBP motifs, derived 149
from PBMs, were on an average shorter than motifs derived from DAP-Seq (average lengths for 150
CisBP=11.67 and DAP-Seq=14.55) and were less complex (average IC for CisBP=8 and DAP-Seq=10; 151
Supplemental Table S2). For ABI5, AGL15, ERF115, GFB3, HB7, and WRKY33, the F1 score was higher for 152
DAP-Seq motifs compared to CisBP. For the remainder of the TFs, where the complexity between the 153
two motifs did not vary, the F1 scores were similar. 154
155
Evaluation of unique motif matches reveals complementary between mapping tools 156
For the TFs included in our benchmark, the varying recovery of true positive matches suggests that each 157
tool performs differently depending on the complexity of the motif (Supplemental Figure S1). To 158
investigate the differences between tools further, we compared the motif matches confirmed by ChIP-159
Seq data for each tool. To account for the large differences in the number of matches reported by each 160
tool, only the top7000 matches per tool and per motif were used in this analysis. Conducting pairwise 161
comparisons between tools reveals that for 12 out of 66 motifs, the TFBSs identified uniquely by CB 162
have high recall rates (Supplemental Figure S2). This pattern is retained when matches found uniquely 163
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
6
by a particular tool, relative to all matches of the other tools combined, are used (Supplemental Figure 164
S3). For 65 motifs, the recall of the top7000 matches uniquely found with CB was larger than zero, 165
making it the only tool to identify functional matches for 98% of all TF motifs considered in this study 166
(Figure 3A). Moreover, for 21 of 65 motifs, the motif mappings from CB were able to achieve recall 167
values between 10% and 38%, considerably higher than the recall rates of other tools, which did not 168
exceed 10% (Figure 3B). This result highlights that CB is able to identify a unique set of functional 169
matches with high ChIP recovery for 32% of the studied motifs. 170
171
Another aspect in which the tools differ is in the length of TFBSs reported. The average motif match, 172
considering all matches, is 17.73 bp for CB, whereas for other tools, it is 13.72 bp (Table 1). In some 173
cases, the TFBSs identified by CB are longer than the motifs mapped. This difference is due to CB 174
merging TFBSs that are located closer than a specified gap parameter, with the default value of this 175
parameter set to 35 bp. To evaluate if the high recall rate of TFBSs unique to CB is due to the merging of 176
close TFBSs, the above analysis was repeated with the gap parameter set to 1 bp. As in the previous 177
results (Supplemental Figure S3), CB is distinct from the other tools by having high precision and recall 178
for a number of samples clustered at the bottom of the heatmap (Supplemental Figure S4). However, 179
relative to the findings when the default values were used, using a 1 bp gap parameter results in the 180
maximum recall reducing from 40% to 25%, potentially due to the unmerged matches of CB no longer 181
being unique to the tool. As a result of these findings, unless specified, all analyses performed with CB in 182
this study use the default gap parameter value. 183
184
Given the observation that the two clusters of tools in Supplemental Figure S3 capture complementary 185
sets of functional TFBS in their top-scoring matches, we next explored how these results can be unified 186
into an Ensemble approach. Comparing the global similarity of unique motif matches reveals that the 187
results from FIMO cluster with those of MOODS and MS, while the results from CB are distinct from the 188
other tools (Supplemental Figures S2 and S3). Due to the similarity of results from FIMO, MOODS, and 189
MS, only the results from FIMO were selected for the Ensemble. FIMO was selected as it achieved the 190
highest precision of the three tools, with the least number of motif matches. All matches found by FIMO 191
were combined with a set of quality matches from CB to overcome the recall problem of FIMO. The top 192
7000 matches, determined as the optimal number of matches to select based on considerations of 193
precision and recall (Supplemental Figure S5), were integrated into all matches of FIMO to form the 194
Ensemble set of matches (see "Materials and Methods"). 195
196
Ensemble motif mapping yields additional target genes when characterizing gene regulatory 197
networks from TF perturbation experiments 198
One of the fields in which motif mapping plays an important role is GRN inference. To validate the 199
applicability of the Ensemble approach to study GRNs in plants, we compared the regulatory links 200
predicted from the motif mapping to lists of genes that are differentially expressed after TF perturbation 201
(DE gene sets). TFs for which perturbation experiments have been conducted covered a wide range of 202
biological processes, such as AGL15 in embryogenesis, AP3 and PI in flower development, BES1 in plant 203
growth and development, FHY3 and PIFs in response to light, WRKY33 in defense response, and EIN3 in 204
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
7
response to ethylene (see "Materials and Methods"). To test the recall of the Ensemble, we investigated 205
whether the TFBSs corresponding to the perturbed TF (referred to as the correct TFBSs) were 206
significantly enriched in the promoters of the DE genes (hypergeometric test, FDR corrected p-207
value≤0.01). Furthermore, the subset of genes from the DE gene set that contained a correct TFBS were 208
compared with experimental ChIP-Seq data for the TF to identify bona fide target genes (see "Materials 209
and Methods"). For 9 of the 10 DE gene sets, a significant enrichment of the correct TFBSs were found 210
for the DE gene sets using the Ensemble. Out of these 9 sets of DE genes, the Ensemble showed better 211
recovery of ChIP-confirmed target genes for 5 sets (PIF4, WRKY33, EIN3, FHY3, and PI) compared to 212
FIMO. For the remaining sets, the rate of recovery was comparable to FIMO. In total, the Ensemble 213
method identified 41 target genes that were missed by FIMO for 10 DE sets (referred to as ‘extra 214
targets’), out of which 32 (78%) were confirmed using ChIP-Seq datasets for the respective TFs (Table 215
S3). For WRKY33 and PI, the Ensemble yielded the largest number of additional ChIP-confirmed target 216
genes: 16 for WRKY33 (Figure 4) and 8 for PI. Moreover, the FIMO matches lacked a significant 217
enrichment of the WRKY33 motif for WRKY33 perturbed genes. The target genes of WRKY33 which were 218
missed by FIMO included ZFAR1/CZF1 (AT2G40140) and ERF1 (AT3G23240), which are both involved in 219
defense response to biotic stimulus (Table 2). Other examples of target genes detected by Ensemble and 220
missed by FIMO included: HEC1 (AT5G67060) in the PI DE gene set, a well-known TF involved in 221
gynoecium development (Gremski et al., 2007), and BLH1 (AT2G35940) in the FHY3 DE set, a gene 222
known to be involved in the response to far-red light (Staneloni et al., 2009). A detailed intersection of 223
the Ensemble motif matches, the DE genes after TF perturbation, and the ChIP targets is shown in 224
Supplemental Figure S6. 225
226
An improved protocol to identify gene regulatory networks starting from accessible 227
chromatin regions 228
The identification of highly accessible open chromatin regions throughout the genome helps to 229
determine the location of potential regulatory elements. Recent advancements in experimental 230
technologies have allowed researchers to map open chromatin regions in specific plant cell types 231
(Maher et al., 2018; Sijacic et al., 2018). However, identifying which TFs are likely to bind these regions, 232
and how they affect gene expression, is still a major challenge. A recent study used ATAC-Seq to identify 233
transposase hypersensitive sites (THSs) specific to stem cells of the shoot apical meristem (SAM) and 234
leaf mesophyll cells (Sijacic et al., 2018). To identify potential TFs that bind to these THSs, Sijacic and co-235
workers used de novo motif discovery to identify over-represented motifs in these regions. Motifs found 236
de novo were compared to known TF motifs to identify potential regulators. The genomic locations of 237
these over-represented motifs, determined using FIMO, were then assigned to the closest gene to 238
identify potential cell type-specific target genes of the associated TFs. By selecting TFs showing cell type-239
specific expression, measured using high rank ratios (RR) in each cell (see "Materials and Methods"), and 240
that also had at least one de novo motif assigned to it, Sijacic et al. reported 23 and 128 TFs in SAM stem 241
and leaf mesophyll cells, respectively. 242
243
This traditional pipeline, besides having multiple steps to identify cell type-specific GRNs, is dependent 244
on de novo motif discovery tools and parameters. Furthermore, linking motifs found de novo with 245
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
8
known TFs can be challenging (Castro-Mondragon et al., 2017). In order to overcome some of these 246
problems, we developed a novel protocol in which the enrichment of TFBSs was directly compared 247
against a set of 2,132 SAM stem cell and 1,508 mesophyll-specific THSs, to identify putative regulators 248
and targets (see "Materials and Methods"). 249
250
Starting from 59 SAM stem cell-specific and 158 mesophyll cell-specific TFs having high RR, we 251
determined which motifs were significantly enriched in the corresponding cell type THSs using both 252
FIMO and Ensemble motif mappings (see "Materials and Methods"). The Ensemble approach reported a 253
larger number of significantly enriched TFBSs compared to FIMO in both cell types. Of 59 TF motifs 254
mapped, 29 were significantly enriched in SAM stem cell-specific THSs when Ensemble mappings were 255
used, whereas 25 motifs were enriched when FIMO motif mappings were used. Whereas 13 motifs 256
correspond to TFs also reported in the original study, 16 of the 29 significantly enriched motifs from the 257
Ensemble set corresponded to TFs that were not described by Sijacic et al. (2018). These TFs include 258
BRC2 from the TCP family, AGL24, AGL27, AGL31, and AGL70 belonging to the MADS family, IDD15 from 259
the C2H2 family, and additional TFs from the zf-HD and C2C2-Dof families. For mesophyll cells, 55% (87 260
out of 158) of the motifs were enriched for mesophyll THS regions using Ensemble motif mappings, 261
whereas for FIMO only 48% (77 out of 158) of the motifs showed a significant enrichment. Eleven of the 262
87 TFs found enriched using Ensemble motif mappings were not reported by Sijacic et al., and included 263
DREB2, AT1G33760, and RAP2.4 belonging to the AP2-EREBP family, CCA1 and LCL1 from a MYB-related 264
family, AT5G50915 from the bHLH family, bZIP60 from the bZIP family, SPL13 from the SBP family, 265
AT1G14580 from the C2H2 family, and WRKY30 from the WRKY family. 266
267
The TFs binding to all enriched motifs identified using the novel protocol in both SAM stem and 268
mesophyll cells were compared with the previously reported 23 and 128 TFs in the respective cell types 269
(Sijacic et al., 2018). Results from the Ensemble method showed enrichment for 13 of 23 motifs, 270
whereas FIMO TFBSs were enriched in 11 of 23 cases. Similarly, for mesophyll cells, 76 and 69 of 128 TF 271
motifs were found to be enriched using the Ensemble and FIMO, respectively (Supplemental Table S4). 272
Overall, our one-step protocol identified 116 regulators showing both significant TFBS enrichment for 273
THSs and cell type-specific expression, of which 23% (n=27) were not reported in the original study. 274
Conversely, for 62 TFs reported by Sijacic and co-workers no significant TFBS enrichment was found 275
using our protocol, suggesting the corresponding motifs do not occur more in the THSs than expected by 276
chance. 277
278
To understand how the choice of motif mapping tool affects GRN construction, we investigated the 279
differences between the Ensemble and FIMO motif mappings based on the putative target genes they 280
identify. In total, the Ensemble identified 6,917 targets for 29 significantly enriched motifs in SAM stem 281
cells, whereas FIMO identified 6,428 targets. To determine whether the extra targets identified using 282
the Ensemble are potentially functional, we evaluated their gene expression in each cell type. We 283
initially counted how many of the targets exhibit a two-fold expression difference (|log2(RR)| > 1) in 284
either of the cell types. Out of 489 extra targets identified by the Ensemble approach in SAM stem cells, 285
171 genes (35%) were expressed and 93 genes (19%) showed cell type-specific expression (-log2(RR) > 1; 286
Supplemental Table S4). The fraction of Ensemble-unique target genes which are expressed in a cell 287
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
9
type-specific manner are consistent across the different TFs (Figure 5A, TFs labelled blue indicate new 288
regulators). Nine of the cell type-specific genes show more than 6-fold higher expression in SAM stem 289
cells and are therefore likely to be bona fide target genes within the SAM stem cell-specific GRN (Table 290
3). Three of these nine genes (AT4G11211, AT5G02450, and AT5G13340) lack experimental evidence 291
about their biological function. The remaining 6 genes are known to be involved in a number of 292
processes based on experimental Gene Ontology annotations: primary root development (ATHB13), 293
xylem development (KNAT1), response to cold (DIN10), salt stress and abscisic acid (GASA14), and 294
defense response to bacterium (ERD5, TGA4). These genes are regulated by a diverse array of TFs, such 295
as IDD7, TCP7, ALC, AGL70, DAG2, AGL27, BRC2, JKD, KNAT1, AGL31, and AGL24, that were both 296
described in the original study or identified here. Interestingly, KNAT1 and TGA4, being TF themselves, 297
are regulated by multiple TFs (IDD7 and JKD regulate KNAT1, ALC and KNAT1 regulate TGA4), suggesting 298
some new transcriptional cascades in the SAM stem cell-specific GRN (Figure 5B). 299
300
For mesophyll cells, 574 of 1,660 new target genes (35%) were expressed in either of the cell types, 301
which is a similar fraction as that reported for SAM stem cells. The percentage of cell type-specific 302
targets identified using the Ensemble motif mappings was 24% for mesophyll cells, corresponding to 402 303
identified targets with higher expression (-log2(RR) < -1) only in mesophyll cells. 29 of these genes had 304
more than 6-fold expression in mesophyll cells (Table 3). The mesophyll cell-specific GRN of highly 305
expressed genes contained 77 regulatory interactions between 29 targets and 42 TFs, with many of 306
these new target genes being regulated by multiple TFs (Figure 5C). Several of the new target genes 307
have a role in hormone signaling, such as AOS, LOX3, and LOX4, reported to be jasmonic acid responsive, 308
and RRTF1, involved in ethylene biosynthesis. Examples of new unknown target genes are AT5G54165 309
(regulated by BIM1, BIM2, BIM3, PIF7, UNE10, and bHLH105), AT3G51660 (regulated by TCP2, TCP17, 310
and SPL1), and AT4G12005 (regulated by ARF7, LRL2, SPL14, CBF1, and CBF4). A complete set of 311
interactions between TFs and target genes in SAM stem and mesophyll cells predicted using the 312
Ensemble TFBS enrichment protocol are available as a Cytoscape network session file (Supplemental 313
Dataset S1). 314
315
DISCUSSION 316
Recent technological developments have made it possible to profile the chromatin state of particular 317
cell types with high specificity (Maher et al., 2018; Sijacic et al., 2018). This specificity has extended to 318
the level of single cells, allowing cell-to-cell variability in chromatin accessibility to be assessed. 319
However, the impact of these studies is dependent on determining the biological relevance of the 320
accessible regions, particularly if those regions are not located within genes. In addition, as the cost of 321
sequencing decreases and as long read sequencing technologies improve, the number of available 322
genome sequences will increase. While methods to annotate genes are relatively mature, methods to 323
annotate non-coding, regulatory regions are less so. One method of understanding the relevance of 324
accessible chromatin regions, and of annotating potential regulatory sequences, is to map known TF 325
binding preferences onto DNA sequences to identify likely locations bound by TFs. While many tools 326
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
10
exist to perform this mapping, each makes certain biological assumptions, and consequently it can be 327
unclear which tool leads to more reliable results in a particular situation. 328
329
In order to address this problem, we performed a detailed evaluation of motif mapping tools to 330
determine regulatory relationships in Arabidopsis. Precision and recall were determined for each tool 331
using ChIP-Seq data to assess true positives, revealing that although vastly different numbers of matches 332
were found for each tool, the ability to identify sites that are supported experimentally was similar when 333
similarly-sized subsets of top scoring matches were taken for each tool. FIMO, which is widely used in 334
the plant science community, gave the best precision within its predicted motif matches, but fails to 335
recover some true motif matches due to its stringent settings. Using a benchmark dataset consisting of 336
40 TFs, we observed that FIMO and CB offer a complementary view on functional TFBSs. We found that 337
when focusing on top7000 matches, despite having a higher false positive rate than FIMO using default 338
settings, CB identified a set of unique motif matches, up to 38% of which were confirmed by ChIP-Seq 339
data. Combining the results of FIMO and CB into an Ensemble set of motif mappings resulted in 340
improved recall relative to FIMO, when motif enrichment of TF perturbation DE gene sets was 341
performed. Overlap of enriched motifs with the ChIP-Seq datasets revealed that, for 5 of the DE gene 342
sets, the Ensemble identified 32 extra functional targets missed by FIMO. Several of these additional TF-343
target regulatory interactions identified using the Ensemble approach are supported by literature. In 344
independent WRKY33 perturbation experiments (Birkenbihl et al., 2012; Sham et al., 2017), half of the 345
extra targets identified by the Ensemble approach were also found to be DE between wrky33 mutants 346
and wild-type Arabidopsis plants (ERF1, FLP / MYB124, DOF1, WRKY15, WRKY30, SZF2, AT2G43140, and 347
AT3G55950). In addition to the known role of WRKY33 in defense responses (Birkenbihl et al., 2012), 348
expression of this TF is also associated with broad stress conditions such as cold, salinity, wounding, and 349
biotic stress (Ma and Bohnert, 2007). Of the 16 extra WRKY33 targets, MYB124, WRKY15, WRKY22, 350
WRKY30, SZF1, and SZF2 have all been found to play roles in a range of stress responses (Sun et al., 351
2007; Xie et al., 2010; Vanderauwera et al., 2012; Scarpeci et al., 2013; Kloth et al., 2016), supporting the 352
proposed function of WRKY33 as a central stress response factor. The identification of BLH1 as an 353
additional target of FHY3, which integrates responses to far-red light and abscisic acid signaling (Wang 354
and Deng, 2002; Tang et al., 2013), suggests a role for FHY3 during germination and early seedling 355
development, as BLH1 is known to be involved in ABA-mediated signaling pathway acting during early 356
plant development (Kim et al., 2013). The finding of FLOWERING BHLH 1 (FBH1) / bHLH80 as an 357
additional target of FHY3 is also consistent with the role the gene has in light signaling, as FBH1 has been 358
found to control CONSTANS, a key photoperiod gene, and influence the response of the circadian clock 359
to temperature (Ito et al., 2012; Nagel et al., 2014). Finally, the function of PI as a floral homeotic gene 360
in the shoot apical meristem to ensure correct floral organ determination (Goto and Meyerowitz, 1994) 361
is in line with the regulation of HEC1, a bHLH transcription factor which also acts downstream of 362
WUSCHEL to control stem cell proliferation (Schuster et al., 2014). The supporting literature for these 363
interactions strongly suggests that the additional targets identified by the Ensemble motif mappings are 364
functional. 365
366
Next, we introduced a novel protocol to learn GRNs from accessible open chromatin regions profiled 367
using cell type-specific ATAC-Seq. Starting from all TFs for which motif information was available, TFBS 368
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
11
enrichment was combined with information about cell type-specific expression to infer GRNs. Both 369
traditional and our new protocol inherently depend on the availability of TF motifs, which is a limitation. 370
However, the protocol employed in this study is independent of both de novo motif discovery and 371
similarity searches of de novo found motifs against motif databases, which is an important step in 372
traditional pipelines to learn regulatory interactions from open chromatin regions (Sullivan et al., 2014; 373
Maher et al., 2018; Sijacic et al., 2018). Moreover, the protocol not only reduces the number of steps to 374
go from cell type-specific THSs to GRNs, but also identifies TFs missed in the previous study by Sijacic et 375
al. (29 and 87 additional significantly enriched TF motifs in SAM stem cells and leaf mesophyll cells, 376
respectively). Conversely, 62 TFs described in the original study were not found to be enriched using our 377
protocol, suggesting there is still room for improvement to learn complete GRNs starting from cell type-378
specific accessible regions. Apart from identifying additional regulators, we observed that the 379
performance of the Ensemble approach surpassed that of FIMO when used to map motifs as part of the 380
protocol reported here. Additional enriched TF motifs were identified using the Ensemble, with 4 381
additional regulators out of 29 total TFs in SAM stem cells and 10 additional TFs out of 87 in mesophyll 382
cells. A striking addition to the set of TF motifs enriched in the SAM stem cell THSs are those of the 383
MADS-box containing genes MAF1/FLM, MAF2, MAF3, and AGL24. All of these genes have been found 384
to influence flowering time, and have positions within a TF network in the SAM that integrates 385
environmental and developmental signals to control flowering (Yu et al., 2002; de Folter et al., 2005; 386
Werner et al., 2005; Rosloski et al., 2010; Capovilla et al., 2017). The motifs of these TFs were found 387
enriched in THSs specific to the SAM stem cells, suggesting that signal integration is occurring in the 388
stem cells at the apex. In addition to these motifs, the motifs corresponding to KNAT1 and AtCSP2 were 389
also enriched. Correspondingly, the expression of both genes has previously been found to be localized 390
to the SAM, with KNAT1 being a homeodomain important for leaf morphogenesis and AtCSP2 involved 391
in the transition to flowering and silique development (Lincoln et al., 1994; Nakaminami et al., 2009). In 392
contrast to the SAM, the additional mesophyll cell-specific enriched motifs contain TFs known to be 393
involved with stress responses, the circadian clock, and growth. DREB2 is involved in controlling 394
drought-responsive genes (Sakuma et al., 2006), while WRKY30 has been found to be important for both 395
biotic and abiotic stress responses (Scarpeci et al., 2013). In addition to stress responses, motifs from 396
TFs involved in the age-related flowering time pathway (SPL13) and the circadian clock (AtCCA1) are 397
enriched (Wang and Tobin, 1998; Xu et al., 2016), consistent with the leaf playing a key role in 398
environmental sensing. Finally, ARF7 is an auxin regulated TF which promotes leaf expansion (Wilmoth 399
et al., 2005). 400
401
Taken together, the additional enriched motifs identified in the SAM stem cell and leaf mesophyll 402
specific THSs are consistent with the central role of the SAM in flowering time control, and of the leaf 403
responding to stress elicitors and circadian clock entrainment. This demonstrates that the Ensemble-404
based approach leads to biologically relevant results that contribute towards a more complete picture of 405
the GRNs active in these two tissues, and that might otherwise be missed when using de novo motif 406
based methods. In addition, the extra target genes identified by the Ensemble, comprising 93 and 402 407
target genes for SAM stem and mesophyll cells, respectively, were found to be highly expressed in the 408
corresponding cell types, suggesting that the unique regulators as well as their targets identified by the 409
Ensemble are biologically relevant. 410
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
12
411
In conclusion, we have shown that an integrative approach, utilizing two complementary motif mapping 412
tools, results in improved power to detect functional TFBSs relative to FIMO, the most frequently used 413
tool. This approach facilitates more accurate inference of GRNs and will be especially important as 414
chromatin accessibility data continues to be collected. While motif mapping alone is insufficient to 415
accurately map functional regulatory interactions, determining likely positions can help direct future 416
experimental work. A supplemental website offering the Ensemble TFBS mapping results for 1,793 TF 417
motifs corresponding to 916 Arabidopsis TFs is available at 418
http://bioinformatics.psb.ugent.be/cig_data/motifmappings_ath/ as a file in BED format. 419
420
MATERIALS AND METHODS 421
Collection of TFBSs 422
The motif collection used for this analysis consisted of 66 Arabidopsis (Arabidopsis thaliana) position 423
weight matrices (PWMs) representing 40 TFs from different sources including CisBP (Weirauch et al., 424
2014), Franco-Zorilla and co-workers (Franco-Zorrilla et al., 2014), Plant Cistrome Database (O’Malley et 425
al., 2016), and JASPAR 2016 (Mathelier et al., 2015). IC of PWMs was calculated using the ‘convert-426
matrix’ command from rsa-tools version 2012-05-25 with –return option set to ‘info’ (Turatsinze et al., 427
2008). TFs were assigned to gene families based on the PlnTFDB 3.0 database (Pérez-Rodríguez et al., 428
2009). 429
430
PWM mapping using different tools 431
Four mapping tools that are widely used in the plant science community were evaluated in this study. 432
Cluster-Buster (version ‘Compiled on Sep 22 2017’; (Frith et al., 2003)) was run with the '–c' parameter 433
set to zero, as the other tools do not offer prediction of motif clusters. For FIMO, default parameters 434
were used (meme version 4.11.4; (Grant et al., 2011)). For MOODS (version 1.9.3; (Korhonen et al., 435
2009)), a p-value threshold of <0.0001 was used to enable comparison to FIMO. This threshold was also 436
used for matrix-scan while all other parameters were set to default (rsa-tools version 2012-05-25; 437
(Turatsinze et al., 2008)). The command lines for the different tools are: 438
439
cbust-linux $PWMfile $seqFile -c 0 -f 1 440
fimo -o $output $PWMfile $seqFile 441
moods_dna.py -m $PWMfile -s $seqFile -p $threshold --batch -o $output 442
matrix-scan -v 1 -matrix_format cb -m $PWMfile -i $seqFile -2str -return limits -return sites -seq_format 443
fasta -o $output 444
445
$threshold was set to 0.0001 (default value for FIMO and matrix-scan) 446
447
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
13
Extraction of promoter regions 448
In addition to the TAIR10 Arabidopsis genome annotation, a set of 5,711 non-coding RNAs (ncRNAs) 449
described by (Liu et al., 2012) were added, resulting in a dataset covering 38,966 genes (Lamesch et al., 450
2012). For all genes a promoter region 5000bp upstream of the translation start site and 1000bp 451
downstream of the translation end site, including introns, were used. If another gene was present 452
upstream of the gene, the region is cut where this upstream gene starts or ends. 453
454
Estimation of recall, precision, and false positive rate 455
For each TF, all PWM matches from each mapping tool were overlapped with publicly available TF ChIP-456
Seq data (Table S5). BEDTools was used to intersect the BED files, using the ‘-f’ option set to 1 for 457
complete overlap (Quinlan, 2014). Precision was calculated as the number of TFBS matches confirmed 458
by ChIP-Seq divided by the total number of matches. Recall was calculated as the number of ChIP-Seq 459
peaks for the studied TF that were covered by motif matches, divided by the total number of ChIP-Seq 460
peaks. 461
462
To calculate the false positive rate (FPR) of the motif mappers, shuffled promoters (n=38,966) were 463
generated by shuffling the sequences of the real promoters. The 66 TFBSs were mapped to these 464
shuffled promoters. Following (Jayaram et al., 2016), actual negatives (ANs) were calculated for every 465
promoter and every motif as the length of the promoter divided by the length of motif. The FPR was 466
then calculated as the number of matches predicted by a specific tool divided by AN. The FPR value for a 467
TFBS is the average over all promoters. 468
469
Selection of optimal number of top scoring matches 470
To define the set of matches of CB to combine with FIMO, we took progressively larger sets of CB 471
matches and evaluated which set size resulted in the highest F1 score, a metric which combines 472
precision and recall (Supplemental Figure S5). The F1 score is the harmonic average of the precision and 473
recall, where an F1 score reaches its best value at 1 (perfect precision and recall) and worst at 0. An 474
optimal F1 score was observed between 7000 and 9000 matches (Supplemental Figure S5). Based on 475
this observation, the top scoring 7000 matches of were selected to keep an optimal balance between 476
precision and recall for the CB matches. The same number was also used to identify performance of 477
individual mapping tool by considering equal number of top scoring matches for Figure 1. 478
479
Enrichment on DE genes after TF perturbation 480
10 publicly available DE gene sets after TF perturbation were used to determine motif enrichment (Table 481
S6). We determined, for each TF, the number of DE genes with a proximal TFBS. The significance of this 482
overlap was determined using the hypergeometric distribution. For each enriched motif, the multiple 483
testing corrected p-value (or q-value) of enrichment is determined using the Benjamini-Hochberg 484
correction. Only q-values ≤0.01 were considered significant. For the motifs that are both enriched in the 485
DE and correspond to the perturbed TF, the subset of genes having that motif were retrieved and 486
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
14
compared with TF ChIP-Seq binding data (denoted ‘ChIP confirmed hits’ in Table 2). The ChIP-Seq data 487
sets used are the same as discussed in the section ‘Estimation of recall, precision, and false positive 488
rate’, Table S5. 489
490
Case study on cell type-specific transposase hypersensitive sites (THSs) 491
Based on the ATAC-Seq datasets from (Sijacic et al., 2018), we defined a set of THSs for stem cell and 492
mesophyll cells. Candidate regulators were predicted using the TFBS information present in the mapping 493
file. We identified a set of specific THSs for both cell types, based on a 2-fold (or higher) difference in the 494
ratio between the stem cell and mesophyll counts, yielding two region files with 2,132 stem cell THSs 495
and 1,508 mesophyll THSs (Table S2). Using the TFBS mappings from FIMO and Ensemble, the 496
significance of the overlap between a specific TFBS and a THS region file was assessed. To select the TF 497
motifs for enrichment analysis, RR for each gene was computed by considering expression ranks from 498
Sijacic et al. (2018). RR was calculated as the ratio of expression rank in stem and mesophyll cells. Genes 499
with -log2(RR) > 1 were called SAM stem cell-specific and -log2(RR) < -1 were called mesophyll specific 500
genes. After this selection, 59 and 158 TFs for the SAM stem cell and mesophyll cell respectively were 501
considered for the analysis. These TFs included the TFs reported in Sijacic et al. (2018). 502
503
The THS region file and the mapped TFBSs for a given tool (after running BEDTools merge per TFBS) 504
were formatted as BED files and the overlap between both files was determined using the BEDTools 505
function intersectBed with ‘-u’ parameter and the ‘–f’ parameter set to 0.5. As such, we obtained for 506
each THS region file and each TFBS an observed number of mapped TFBSs overlapping with THSs 507
(Supplemental Figure S7). To determine the significance of this observed overlap, the expected amount 508
of overlapping TFBS with the same THS region file was determined by shuffling the TFBS mapping bed 509
file 1000 times, using shuffleBed with the ‘-noOverlapping’ option enabled across the pre-defined 510
promoter regions (described in “Extraction of promoter regions” section). The overlap with the THS 511
region file was determined for each shuffled file and the median number of TFBS over all shuffled files 512
was used as a measure for the expected overlap. This estimation was used to calculate the fold 513
enrichment, defined as the ratio between observed overlap and expected overlap by chance. An 514
empirical p-value was determined by counting how many times the expected overlap was bigger than or 515
equal to the observed overlap. Only cases where p-value ≤0.01 were considered as significant. 516
517
Command line for the pipeline 518
# find how many TFBSs ($motiffile) overlap with HS sites ($regionBed) using Bedtools 519
realNumber=`bedtools intersect -a $motiffile -b $regionBed -u -f 0.5 | wc -l` 520
# for nShuffling times generate the shuffled TFBSs, check their overlap with HS sites and save the 521
numbers in “shufflednumbers” file. 522
for i in `seq 1 $nShuffling`; 523
do 524
shuffledFile="shuffled_"$motifid"_"$i".out" 525
bedtools shuffle -i $motiffile -g $chromLength -noOverlapping -excl $motiffile -incl $promoterBed 526
> $shuffledFile 527
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
15
number=`bedtools intersect -a $shuffledFile -b $regionBed -u -f 0.5 | wc -l` 528
echo -e "$i\t$number" >> $shufflednumbers 529
done 530
# calculate the p-value of enrichment 531
countBigger=0 532
for eachNumber in`cat $shufflednumbers`; 533
do 534
if [ $eachNumber -ge realNumber]; 535
then 536
countBigger=$($countBigger+1) 537
done 538
pvalue=$countBigger/$nShuffling 539
540
Statistical analyses 541
542
The significance of the overlap between DE genes and the presence of a proximal TFBS was determined 543
by performing a hypergeometric test using a custom script. Benjamini-Hochberg multiple testing 544
correction was performed on the calculated p-values using the p.adjust function in the statistical 545
programming language R. To determine whether the overlap between TFBSs and THSs was significant, 546
an empirical p-value was calculated by shuffling the promoter sequences as detained in “Materials and 547
Methods”. 548
549
550
SUPPLEMENTAL DATA 551
Supplemental Figure S1. TF level performance of TFBS mapping tools. 552
Supplemental Figure S2. TF level performance of unique matches considering pairwise combinations of 553
tools for top-scoring 7000 matches. 554
Supplemental Figure S3. TF level performance of unique matches considering one tool against all other 555
tools for top-scoring 7000 matches. 556
Supplemental Figure S4. TF level performance of unique matches considering one tool against all other 557
tools for top-scoring 7000 matches and CB gap parameter set to 1. 558
Supplemental Figure S5. The relationship between F1 score and subset size suggests the top 7000 559
highest scoring matches of Cluster-Buster should be used in the Ensemble. 560
Supplemental Figure S6. Overlap analysis for perturbation experiments. 561
Supplemental Figure S7. Cartoon for an improved protocol to identify GRNs starting from accessible 562
chromatin regions. 563
564
Supplemental Table S1. List of publications in plant science community using different mapping tools. 565
Supplemental Table S2. Overview of 66 TF motifs selected to evaluate the performance of motif 566
mapping tools. 567
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
16
Supplemental Table S3. TFBS enrichment results for DE gene sets. 568
Supplemental Table S4. List of TFs considered for ATAC-seq case-study with the distribution of their 569
target genes in stem and mesophyll cells. 570
Supplemental Table S5. Overview of TF ChIP-Seq datasets used for estimation of precision and recall. 571
Supplemental Table S6. Overview of DE gene sets after TF perturbation used for the case-study. 572
573
Supplemental Dataset S1. Cystoscope session file with GRNs in SAM stem and mesophyll cells described 574
in the case-study. 575
576
FUNDING 577
S.R.K. is funded by Research Foundation–Flanders [G001015N]. 578
579
ACKNOWLEDGMENTS 580
We thank Francois Bucchini for the technical assistance to set up the supplemental website. 581
582
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
17
TABLES 583
Table 1. Performance statistics of mapping 66 TF motifs using different motif mapping tools and an 584
Ensemble approach. 585
586
Mapping tool
Total Matches
Total Matches Confirmed
#Bases Average Base Length
Median Precision
Median Recall
Median F1 Score
CB 26,930,509 1,605,815 477,611,367 17.73 2.26% 36.14% 4.36%
FIMO 2,447,772 232,549 33,849,397 13.83 4.91% 22.09% 8.38%
MOODS 34,338,371 1,956,294 467,805,766 13.62 2.37% 48.47% 4.49%
MS 19,970,225 1,030,288 273,845,141 13.71 2.43% 39.32% 4.67%
Ensemble1 2,837,772 291,794 58,252,718 20.53 5.04% 23.72% 8.17% 1 Ensemble = All matches of FIMO + top 7000 scoring matches of CB 587
588
589
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
18
Table 2. List of ChIP confirmed targets only identified by the Ensemble approach when comparing 590
enriched TFBSs with perturbed DE gene sets. 591
592
TF #Extra ChIP confirmed
targets
ChIP confirmed targets
WRKY33 16 AT3G23240(ERF1), AT1G14350(MYB124), AT1G51700(DOF1), AT2G23320(WRKY15), AT2G43140, AT2G01940(IDD15), AT2G40140(SZF2), AT3G55950 (CCR3),
AT2G36960(TKI1), AT3G55980(SZF1), AT3G27785(MYB118), AT4G11070(WRKY41), AT4G01250(WRKY22), AT5G56960, AT5G24110(WRKY30), AT5G56550
FHY3 6 AT2G35940(BLH1), AT1G13020(EIF4B2), AT1G35460(bHLH80), AT2G33860(ARF3), AT2G39130, AT5G01780
PI 8 AT5G67060(HEC1), AT1G08570(ACHT4), AT1G12240(VI2), AT2G19110(HMA4), AT3G56360, AT3G56370, AT4G01120(bZIP54), AT5G07350(Tudor1)
593
594
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
19
Table 3. List of bona fide targets identified in SAM stem and mesophyll cells using the novel TFBS 595
enrichment protocol. 596
597
Cell type # targets showing cell type-specific expression
Target genes
SAM stem 9 AT1G69780 (ATHB13), AT3G30775 (ERD5), AT4G08150 (KNAT1), AT4G11211, AT5G20250 (DIN10), AT5G13340, AT5G10030 (TGA4), AT5G02450, AT5G14920 (GASA14)
Mesophyll 29 AT1G14580, AT1G17420 (LOX3), AT1G19450, AT1G51090, AT1G59870 (PDR8), AT1G61890, AT1G72520 (LOX4), AT1G77760 (NR1), AT2G15020, AT2G26530 (AR781),
AT2G27310, AT2G36990 (SIG6), AT2G39200 (MLO12), AT3G21670 (NPF6.4), AT3G22060, AT3G24190, AT3G51660,
AT3G51895 (AST12), AT3G54020 (AtIPCS1), AT4G02970 (AT7SL-1), AT4G12005, AT4G21570, AT4G34410 (RRTF1),
AT5G41740, AT5G42650 (AOS), AT5G44070 (ARA8), AT5G49520 (WRKY48), AT5G49730 (FRO6), AT5G54165
598
599
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
20
FIGURE LEGENDS 600
Figure 1. Global performances of motif mapping tools in Arabidopsis. (A) and (B) Precision and recall of 601
motif matches considering all matches (in red) and top-scoring 7000 matches (in cyan). (C) Boxplot 602
showing the false positive rate (FPR) for every tool. Boxes indicate the interquartile range of the data, 603
with the median indicated as a horizontal line within the box. The whiskers show the range of the data. 604
The precision, recall, and FPR values were calculated for each of the 66 motifs. 605
606
Figure 2. Variation of motif mapping accuracy in function of TF motif complexity. Scatterplot showing 607
the effect of motif complexity, quantified using the information content of a motif, on F1 scores for 608
Cluster-Buster (CB) and FIMO for 21 TFs. Each motif is visualized through a specific shape indicating the 609
TF it belongs to and colored based on the source of that motif. An increase and decrease in F1 score in 610
function of motif complexity is marked with blue and red, respectively. 611
612
Figure 3. Evaluation of unique motif matches predicted by each tool. (A) Barplot showing the fraction 613
of transcription factor binding sites (TFBSs) with recall>0 considering unique matches in top7000 of one 614
tool compared to the matches of all other tools combined. Whereas green series indicates motifs for 615
which the unique matches reported by that tool do not show a recall above 10% compared to the ChIP-616
Seq data, orange, purple and pink series depict unique motif matches with a recall higher than 10%, 20% 617
and 30%, respectively. (B) Heatmap showing the recall for each tool for TFBSs where CB outperforms the 618
other tools. Only motifs part of the orange, purple, and pink series from panel A where the recall of CB 619
was >10% are shown. 620
621
Figure 4. Overlap analysis for WRKY33 perturbation experiment. UpSetR plot showing the overlap 622
between the FIMO targets, the Ensemble, the perturbed differentially expressed (DE) genes, and the 623
ChIP targets for WRKY33. Overlap of target genes predicted by FIMO and Ensemble with ChIP targets 624
show a better recovery of ChIP-confirmed targets using the Ensemble. 625
626
Figure 5. Results of the new protocol to identify potential regulators in SAM stem and mesophyll cell-627
specific ATAC-seq regions. (A) Barplot showing extra target genes obtained using the Ensemble 628
approach for shoot apical meristem (SAM) stem cell. Grey sections show how many of the extra targets 629
have an expression in either of the two cell types. Shaded regions show the target genes that are 630
specific to SAM stem cells. TFs labelled in brown are the TFs reported in Sijacic et al., 2018 and TFs 631
labelled in blue are the TFs only identified using the new protocol. Only TFs that have ≥ 10 extra targets 632
are shown. (B) and (C) A gene regulatory network (GRN) showing SAM stem and mesophyll cell-specific 633
targets identified by enriched TFs, respectively. Round green/white circles refer to the genes that have 634
higher expression in SAM stem/mesophyll cell, while diamonds represent TFs. All nodes that have an 635
incoming edge are target genes having high SAM stem/mesophyll cell-specific expression. The size of 636
each node corresponds to the expression specificity, determined using the ratio of expression rank (RR), 637
of the gene in the respective cell type. 638
639
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
21
LITERATURE CITED 640
641
Baxter L, Jironkin A, Hickman R, Moore J, Barrington C, Krusche P, Dyer NP, Buchanan-Wollaston V, 642
Tiskin A, Beynon J, Denby K, Ott S (2012) Conserved noncoding sequences highlight shared 643
components of regulatory networks in dicotyledonous plants. Plant Cell 24: 3949-3965 644
Birkenbihl RP, Diezel C, Somssich IE (2012) Arabidopsis WRKY33 is a key transcriptional regulator of 645
hormonal and metabolic responses toward Botrytis cinerea infection. Plant Physiol 159: 266-285 646
Burgess DG, Xu J, Freeling M (2015) Advances in understanding cis regulation of the plant gene with an 647
emphasis on comparative genomics. Curr Opin Plant Biol 27: 141-147 648
Capovilla G, Symeonidi E, Wu R, Schmid M (2017) Contribution of major FLM isoforms to temperature-649
dependent flowering in Arabidopsis thaliana. J Exp Bot 68: 5117-5127 650
Castro-Mondragon JA, Jaeger S, Thieffry D, Thomas-Chollier M, van Helden J (2017) RSAT matrix-651
clustering: dynamic exploration and redundancy reduction of transcription factor binding motif 652
collections. Nucleic Acids Res 45: e119 653
de Folter S, Immink RG, Kieffer M, Parenicova L, Henz SR, Weigel D, Busscher M, Kooiker M, Colombo 654
L, Kater MM, Davies B, Angenent GC (2005) Comprehensive interaction map of the Arabidopsis 655
MADS Box transcription factors. Plant Cell 17: 1424-1433 656
Franco-Zorrilla JM, López-Vidriero I, Carrasco JL, Godoy M, Vera P, Solano R (2014) DNA-binding 657
specificities of plant transcription factors and their potential to define target genes. Proceedings 658
of the National Academy of Sciences 111: 2367-2372 659
Franco-Zorrilla JM, Solano R (2017) Identification of plant transcription factor target sequences. Biochim 660
Biophys Acta 1860: 21-30 661
Frith MC, Li MC, Weng Z (2003) Cluster-Buster: Finding dense clusters of motifs in DNA sequences. 662
Nucleic Acids Res 31: 3666-3668 663
Goto K, Meyerowitz EM (1994) Function and regulation of the Arabidopsis floral homeotic gene 664
PISTILLATA. Genes Dev 8: 1548-1560 665
Grant CE, Bailey TL, Noble WS (2011) FIMO: scanning for occurrences of a given motif. Bioinformatics 666
27: 1017-1018 667
Gremski K, Ditta G, Yanofsky MF (2007) The HECATE genes regulate female reproductive tract 668
development in Arabidopsis thaliana. Development 134: 3593-3601 669
Haudry A, Platts AE, Vello E, Hoen DR, Leclercq M, Williamson RJ, Forczek E, Joly-Lopez Z, Steffen JG, 670
Hazzouri KM (2013) An atlas of over 90,000 conserved noncoding sequences provides insight 671
into crucifer regulatory regions. Nature genetics 45: 891-898 672
Heyndrickx KS, Van de Velde J, Wang C, Weigel D, Vandepoele K (2014) A functional and evolutionary 673
perspective on transcription factor binding in Arabidopsis thaliana. Plant Cell 26: 3894-3910 674
Hickman R, Van Verk MC, Van Dijken AJH, Mendes MP, Vroegop-Vos IA, Caarls L, Steenbergen M, Van 675
der Nagel I, Wesselink GJ, Jironkin A, Talbot A, Rhodes J, De Vries M, Schuurink RC, Denby K, 676
Pieterse CMJ, Van Wees SCM (2017) Architecture and Dynamics of the Jasmonic Acid Gene 677
Regulatory Network. Plant Cell 29: 2086-2105 678
Ito S, Song YH, Josephson-Day AR, Miller RJ, Breton G, Olmstead RG, Imaizumi T (2012) FLOWERING 679
BHLH transcriptional activators control expression of the photoperiodic flowering regulator 680
CONSTANS in Arabidopsis. Proc Natl Acad Sci U S A 109: 3582-3587 681
Jayaram N, Usvyat D, AC RM (2016) Evaluating tools for transcription factor binding site prediction. 682
BMC Bioinformatics 683
Kim D, Cho YH, Ryu H, Kim Y, Kim TH, Hwang I (2013) BLH1 and KNAT3 modulate ABA responses during 684
germination and early seedling development in Arabidopsis. Plant J 75: 755-766 685
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
22
Kloth KJ, Wiegers GL, Busscher-Lange J, van Haarst JC, Kruijer W, Bouwmeester HJ, Dicke M, Jongsma 686
MA (2016) AtWRKY22 promotes susceptibility to aphids and modulates salicylic acid and 687
jasmonic acid signalling. J Exp Bot 67: 3383-3396 688
Korhonen J, Martinmaki P, Pizzi C, Rastas P, Ukkonen E (2009) MOODS: fast search for position weight 689
matrix matches in DNA sequences. Bioinformatics 25: 3181-3182 690
Kulkarni SR, Vaneechoutte D, Van de Velde J, Vandepoele K (2018) TF2Network: predicting 691
transcription factor regulators and gene regulatory networks in Arabidopsis using publicly 692
available binding site information. Nucleic Acids Res 46: e31 693
Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, 694
Garcia-Hernandez M, Karthikeyan AS, Lee CH, Nelson WD, Ploetz L, Singh S, Wensel A, Huala E 695
(2012) The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. 696
Nucleic Acids Res 40: D1202-1210 697
Lincoln C, Long J, Yamaguchi J, Serikawa K, Hake S (1994) A knotted1-like homeobox gene in 698
Arabidopsis is expressed in the vegetative meristem and dramatically alters leaf morphology 699
when overexpressed in transgenic plants. Plant Cell 6: 1859-1876 700
Liu J, Jung C, Xu J, Wang H, Deng S, Bernad L, Arenas-Huertero C, Chua NH (2012) Genome-wide 701
analysis uncovers regulation of long intergenic noncoding RNAs in Arabidopsis. Plant Cell 24: 702
4333-4345 703
Lu Z, Hofmeister BT, Vollmers C, DuBois RM, Schmitz RJ (2017) Combining ATAC-seq with nuclei sorting 704
for discovery of cis-regulatory regions in plant genomes. Nucleic Acids Res 45: e41 705
Ma S, Bohnert HJ (2007) Integration of Arabidopsis thaliana stress-related transcript profiles, promoter 706
structures, and cell-specific expression. Genome Biol 8: R49 707
Maher KA, Bajic M, Kajala K, Reynoso M, Pauluzzi G, West DA, Zumstein K, Woodhouse M, Bubb K, 708
Dorrity MW, Queitsch C, Bailey-Serres J, Sinha N, Brady SM, Deal RB (2018) Profiling of 709
Accessible Chromatin Regions across Multiple Plant Species and Cell Types Reveals Common 710
Gene Regulatory Principles and New Control Modules. Plant Cell 30: 15-36 711
Mathelier A, Fornes O, Arenillas DJ, Chen C-y, Denay G, Lee J, Shi W, Shyr C, Tan G, Worsley-Hunt R 712
(2015) JASPAR 2016: a major expansion and update of the open-access database of transcription 713
factor binding profiles. Nucleic acids research: gkv1176 714
Michael TP, Mockler TC, Breton G, McEntee C, Byer A, Trout JD, Hazen SP, Shen R, Priest HD, Sullivan 715
CM, Givan SA, Yanovsky M, Hong F, Kay SA, Chory J (2008) Network discovery pipeline 716
elucidates conserved time-of-day-specific cis-regulatory modules. PLoS Genet 4: e14 717
Muino JM, de Bruijn S, Pajoro A, Geuten K, Vingron M, Angenent GC, Kaufmann K (2016) Evolution of 718
DNA-Binding Sites of a Floral Master Regulatory Transcription Factor. Mol Biol Evol 33: 185-200 719
Nagel DH, Pruneda-Paz JL, Kay SA (2014) FBH1 affects warm temperature responses in the Arabidopsis 720
circadian clock. Proc Natl Acad Sci U S A 111: 14595-14600 721
Nakaminami K, Hill K, Perry SE, Sentoku N, Long JA, Karlson DT (2009) Arabidopsis cold shock domain 722
proteins: relationships to floral and silique development. J Exp Bot 60: 1047-1062 723
O’Malley RC, Huang S-sC, Song L, Lewsey MG, Bartlett A, Nery JR, Galli M, Gallavotti A, Ecker JR (2016) 724
Cistrome and Epicistrome Features Shape the Regulatory DNA Landscape. Cell 165: 1280-1292 725
Pérez-Rodríguez P, Riano-Pachon DM, Corrêa LGG, Rensing SA, Kersten B, Mueller-Roeber B (2009) 726
PlnTFDB: updated content and new features of the plant transcription factor database. Nucleic 727
acids research: gkp805 728
Quinlan AR (2014) BEDTools: The Swiss-Army Tool for Genome Feature Analysis. Curr Protoc 729
Bioinformatics 47: 11 12 11-34 730
Rosloski SM, Jali SS, Balasubramanian S, Weigel D, Grbic V (2010) Natural diversity in flowering 731
responses of Arabidopsis thaliana caused by variation in a tandem gene array. Genetics 186: 732
263-276 733
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
23
Sakuma Y, Maruyama K, Osakabe Y, Qin F, Seki M, Shinozaki K, Yamaguchi-Shinozaki K (2006) 734
Functional analysis of an Arabidopsis transcription factor, DREB2A, involved in drought-735
responsive gene expression. Plant Cell 18: 1292-1309 736
Scarpeci TE, Zanor MI, Mueller-Roeber B, Valle EM (2013) Overexpression of AtWRKY30 enhances 737
abiotic stress tolerance during early growth stages in Arabidopsis thaliana. Plant Mol Biol 83: 738
265-277 739
Schuster C, Gaillochet C, Medzihradszky A, Busch W, Daum G, Krebs M, Kehle A, Lohmann JU (2014) A 740
regulatory framework for shoot stem cell control integrating metabolic, transcriptional, and 741
phytohormone signals. Dev Cell 28: 438-449 742
Sham A, Moustafa K, Al-Shamisi S, Alyan S, Iratni R, AbuQamar S (2017) Microarray analysis of 743
Arabidopsis WRKY33 mutants in response to the necrotrophic fungus Botrytis cinerea. PLoS One 744
12: e0172343 745
Sijacic P, Bajic M, McKinney EC, Meagher RB, Deal RB (2018) Changes in chromatin accessibility 746
between Arabidopsis stem cells and mesophyll cells illuminate cell type-specific transcription 747
factor networks. Plant J 94: 215-231 748
Song L, Huang S-sC, Wise A, Castanon R, Nery JR, Chen H, Watanabe M, Thomas J, Bar-Joseph Z, Ecker 749
JR (2016) A transcription factor hierarchy defines an environmental stress response network. 750
Science 354: aag1550 751
Sparks EE, Drapek C, Gaudinier A, Li S, Ansariola M, Shen N, Hennacy JH, Zhang J, Turco G, Petricka JJ, 752
Foret J, Hartemink AJ, Gordan R, Megraw M, Brady SM, Benfey PN (2016) Establishment of 753
Expression in the SHORTROOT-SCARECROW Transcriptional Cascade through Opposing Activities 754
of Both Activators and Repressors. Dev Cell 39: 585-596 755
Staneloni RJ, Rodriguez-Batiller MJ, Legisa D, Scarpin MR, Agalou A, Cerdan PD, Meijer AH, Ouwerkerk 756
PB, Casal JJ (2009) Bell-like homeodomain selectively regulates the high-irradiance response of 757
phytochrome A. Proc Natl Acad Sci U S A 106: 13624-13629 758
Sullivan AM, Arsovski AA, Lempe J, Bubb KL, Weirauch MT, Sabo PJ, Sandstrom R, Thurman RE, Neph 759
S, Reynolds AP (2014) Mapping and dynamics of regulatory DNA and transcription factor 760
networks in A. thaliana. Cell reports 8: 2015-2030 761
Sun J, Jiang H, Xu Y, Li H, Wu X, Xie Q, Li C (2007) The CCCH-type zinc finger proteins AtSZF1 and AtSZF2 762
regulate salt stress responses in Arabidopsis. Plant Cell Physiol 48: 1148-1158 763
Tang W, Ji Q, Huang Y, Jiang Z, Bao M, Wang H, Lin R (2013) FAR-RED ELONGATED HYPOCOTYL3 and 764
FAR-RED IMPAIRED RESPONSE1 transcription factors integrate light and abscisic acid signaling in 765
Arabidopsis. Plant Physiol 163: 857-866 766
Turatsinze JV, Thomas-Chollier M, Defrance M, van Helden J (2008) Using RSAT to scan genome 767
sequences for transcription factor binding sites and cis-regulatory modules. Nat Protoc 3: 1578-768
1588 769
Van de Velde J, Heyndrickx KS, Vandepoele K (2014) Inference of transcriptional networks in 770
Arabidopsis through conserved noncoding sequence analysis. The Plant Cell 26: 2729-2745 771
Vandepoele K, Casneuf T, Van de Peer Y (2006) Identification of novel regulatory modules in 772
dicotyledonous plants using expression data and comparative genomics. Genome Biol 7: R103 773
Vandepoele K, Quimbaya M, Casneuf T, De Veylder L, Van de Peer Y (2009) Unraveling transcriptional 774
control in Arabidopsis using cis-regulatory elements and coexpression networks. Plant Physiol 775
150: 535-546 776
Vanderauwera S, Vandenbroucke K, Inze A, van de Cotte B, Muhlenbock P, De Rycke R, Naouar N, Van 777
Gaever T, Van Montagu MC, Van Breusegem F (2012) AtWRKY15 perturbation abolishes the 778
mitochondrial stress response that steers osmotic stress tolerance in Arabidopsis. Proc Natl 779
Acad Sci U S A 109: 20113-20118 780
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
24
Varala K, Marshall-Colon A, Cirrone J, Brooks MD, Pasquino AV, Leran S, Mittal S, Rock TM, Edwards 781
MB, Kim GJ, Ruffel S, McCombie WR, Shasha D, Coruzzi GM (2018) Temporal transcriptional 782
logic of dynamic regulatory networks underlying nitrogen signaling and use in plants. Proc Natl 783
Acad Sci U S A 115: 6494-6499 784
Wang H, Deng XW (2002) Arabidopsis FHY3 defines a key phytochrome A signaling component directly 785
interacting with its homologous partner FAR1. EMBO J 21: 1339-1349 786
Wang ZY, Tobin EM (1998) Constitutive expression of the CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) gene 787
disrupts circadian rhythms and suppresses its own expression. Cell 93: 1207-1217 788
Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert 789
SA, Mann I, Cook K, Zheng H, Goity A, van Bakel H, Lozano JC, Galli M, Lewsey MG, Huang E, 790
Mukherjee T, Chen X, Reece-Hoyes JS, Govindarajan S, Shaulsky G, Walhout AJM, Bouget FY, 791
Ratsch G, Larrondo LF, Ecker JR, Hughes TR (2014) Determination and inference of eukaryotic 792
transcription factor sequence specificity. Cell 158: 1431-1443 793
Werner JD, Borevitz JO, Warthmann N, Trainer GT, Ecker JR, Chory J, Weigel D (2005) Quantitative trait 794
locus mapping and DNA array hybridization identify an FLM deletion as a cause for natural 795
flowering-time variation. Proc Natl Acad Sci U S A 102: 2460-2465 796
Wilmoth JC, Wang S, Tiwari SB, Joshi AD, Hagen G, Guilfoyle TJ, Alonso JM, Ecker JR, Reed JW (2005) 797
NPH4/ARF7 and ARF19 promote leaf expansion and auxin-induced lateral root formation. Plant J 798
43: 118-130 799
Xie Z, Li D, Wang L, Sack FD, Grotewold E (2010) Role of the stomatal development regulators 800
FLP/MYB88 in abiotic stress responses. Plant J 64: 731-739 801
Xu M, Hu T, Zhao J, Park MY, Earley KW, Wu G, Yang L, Poethig RS (2016) Developmental Functions of 802
miR156-Regulated SQUAMOSA PROMOTER BINDING PROTEIN-LIKE (SPL) Genes in Arabidopsis 803
thaliana. PLoS Genet 12: e1006263 804
Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, Lu CH, Shih AC, Ku MS, 805
Shiu SH, Wu SH, Li WH (2015) Transcriptome dynamics of developing maize leaves and 806
genomewide prediction of cis elements and their cognate transcription factors. Proc Natl Acad 807
Sci U S A 112: E2477-2486 808
Yu H, Xu Y, Tan EL, Kumar PP (2002) AGAMOUS-LIKE 24, a dosage-dependent mediator of the flowering 809
signals. Proc Natl Acad Sci U S A 99: 16336-16341 810
Zhang W, Zhang T, Wu Y, Jiang J (2012) Genome-wide identification of regulatory DNA elements and 811
protein-binding footprints using signatures of open chromatin in Arabidopsis. The Plant Cell 24: 812
2719-2731 813
814
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Cluster-Buster FIMO matrix-scan MOODS
A B C
Cluster-Buster FIMO matrix-scanMOODS
Pre
cisi
on
Rec
all
FP
R
Cluster-Buster FIMO matrix-scan MOODS
Figure 1. Global performances of motif mapping tools in Arabidopsis.(A) and (B) Precision and recall of motif matches considering all matches (in red) and top-scoring 7000 matches (in cyan). (C) Boxplot showing the false positive rate (FPR) for every tool. Boxes indicate the interquartile range of the data, with the medianindicated as a horizontal line within the box. The whiskers show the range of the data. The precision, recall, and FPR values were calculated for each of the 66 motifs.
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Cluster-Buster
FIMO
Motif source
Increase in F1 score with complexity
Information content of motifs
F1
scor
eF
1 sc
ore
Decrease in F1 score with complexity
Figure 2. Variation of motif mapping accuracy in function of TF motif complexity. Scatterplot showing
the effect of motif complexity, quantified using the information content of a motif, on F1 scores for Cluster-
Buster (CB) and FIMO for 21 TFs. Each motif is visualized through a specific shape indicating the TF it
belongs to and colored based on the source of that motif. An increase and decrease in F1 score in function
of motif complexity is marked with blue and red, respectively
Information content of motifs
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Recall
Cluster-Buster FIMO matrix-scan MOODS
Cluster-BusterFIMO matrix-scanMOODS
A B
Figure 3. Evaluation of unique motif matches predicted by each tool. (A) Barplot showing the fraction of TFBS with recall>0
considering unique matches in top7000 of one tool compared to the matches of all other tools combined. Whereas green series
indicates motifs for which the unique matches reported by that tool do not show a recall above 10% compared to the ChIP-Seq data,
orange, purple and pink series depict unique motif matches with a recall higher than 10%, 20% and 30%, respectively. (B) Heatmap
showing the recall for each tool for TFBSs where CB outperforms the other tools. Only motifs part of orange, purple and pink series
from panel A where the recall of CB was >10% are shown.
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
17,86216,432
Number of genes
Figure 4. Overlap analysis for WRKY33 perturbation experiment. UpSetR plot showing the overlap between the FIMO targets, the Ensemble, the perturbed DE genes and the ChIP targets for WRKY33. Overlap of target genes predicted by FIMO and Ensemble with ChIP targets show a better recovery of ChIP-confirmed targets using Ensemble.
1311,558
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
A B
Targets genes with SAM stem cell-specific expression
TFs from Sijacic et al. TFs from advanced protocol
KN
AT
1
Targets genes with mesophyll cell-specific expression
AGL27
AGL31
ALC
TGA4
AT4G11211
IDD7
BRC2
KNAT1
ERD5
DIN10
AGL70 TCP7
GASA14
AGL24ATHB13
AT5G02450
JKD
AT5G13340
DAG2
AOS
ARA8
DREB2
AT1G76580
LOX4
TCP2
AT1G14580
RVE1
At2g45680
AT7SL-1
ATCBF2
AtIPCS1
ATWRKY28
AtWIND1
AT2G15020
LOX3ATWRKY27
AT1G75490
AT1G19210
AT4G28140
ANAC029
RAP2.4
CBF1
NLP7
AT5G41740
AST12
ATMLO12
ARF1-BPATSPL14
AT3G24190
ARF7
AT1G61890
AT1G51090
AT3G51660
SPL1
CAMTA5
NR1
RRTF1
CAMTA3AT4G21570
AtNPF6.4
CBF4LRL2
AT4G12005
AT1G19450
BIM3
PDR8ATFRO6
AT3G22060
ATSIG6
ATWRKY48
AT5G50915 PTF1
CAMTA2
TCP17
AT2G27310
AtbZIP16
AT5G54165
RAP21
AGL20
BIM1
UNE10
ANAC055BIM2
bHLH105
ATBZIP54
PIF7
ATWRKY33 ATWRKY40
AR781
C
Figure 5. Results of the new protocol to identify potential regulators in SAM stem and mesophyll cell-specific ATAC-seq regions. (A) Barplot showing extra target genes obtained
using the Ensemble approach for shoot apical meristem (SAM) stem cell. Grey sections show how many of the extra targets have expression in either of the two cell types. Shaded regions
show the target genes that are specific to SAM stem cells. TFs labelled in brown are the TFs reported in Sijacic et al., 2018 and TFs labelled in blue are the TFs only identified using the new
protocol. Only the TFs that have ≥ 10 extra targets are shown. (B) and (C) A gene regulatory network (GRN) showing SAM stem and mesophyll cell-specific targets identified by enriched TFs
respectively. Round green/white circles refer to the genes that have higher expression in SAM stem/mesophyll cell, while diamonds represent TFs. All nodes that have an incoming edge are
target genes having high SAM stem/mesophyll cell-specific expression. The size of each node corresponds to the expression specificity, determined using the ratio of expression rank (RR),
of the gene in the respective cell type.
Cell-specific targets
Expressed targets
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Parsed CitationsBaxter L, Jironkin A, Hickman R, Moore J, Barrington C, Krusche P, Dyer NP, Buchanan-Wollaston V, Tiskin A, Beynon J, Denby K, OttS (2012) Conserved noncoding sequences highlight shared components of regulatory networks in dicotyledonous plants. Plant Cell24: 3949-3965
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Birkenbihl RP, Diezel C, Somssich IE (2012) Arabidopsis WRKY33 is a key transcriptional regulator of hormonal and metabolicresponses toward Botrytis cinerea infection. Plant Physiol 159: 266-285
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Burgess DG, Xu J, Freeling M (2015) Advances in understanding cis regulation of the plant gene with an emphasis on comparativegenomics. Curr Opin Plant Biol 27: 141-147
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Capovilla G, Symeonidi E, Wu R, Schmid M (2017) Contribution of major FLM isoforms to temperature-dependent flowering inArabidopsis thaliana. J Exp Bot 68: 5117-5127
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Castro-Mondragon JA, Jaeger S, Thieffry D, Thomas-Chollier M, van Helden J (2017) RSAT matrix-clustering: dynamic exploration andredundancy reduction of transcription factor binding motif collections. Nucleic Acids Res 45: e119
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
de Folter S, Immink RG, Kieffer M, Parenicova L, Henz SR, Weigel D, Busscher M, Kooiker M, Colombo L, Kater MM, Davies B,Angenent GC (2005) Comprehensive interaction map of the Arabidopsis MADS Box transcription factors. Plant Cell 17: 1424-1433
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Franco-Zorrilla JM, López-Vidriero I, Carrasco JL, Godoy M, Vera P, Solano R (2014) DNA-binding specificities of plant transcriptionfactors and their potential to define target genes. Proceedings of the National Academy of Sciences 111: 2367-2372
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Franco-Zorrilla JM, Solano R (2017) Identification of plant transcription factor target sequences. Biochim Biophys Acta 1860: 21-30Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Frith MC, Li MC, Weng Z (2003) Cluster-Buster: Finding dense clusters of motifs in DNA sequences. Nucleic Acids Res 31: 3666-3668Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Goto K, Meyerowitz EM (1994) Function and regulation of the Arabidopsis floral homeotic gene PISTILLATA. Genes Dev 8: 1548-1560Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Grant CE, Bailey TL, Noble WS (2011) FIMO: scanning for occurrences of a given motif. Bioinformatics 27: 1017-1018Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Gremski K, Ditta G, Yanofsky MF (2007) The HECATE genes regulate female reproductive tract development in Arabidopsis thaliana.Development 134: 3593-3601
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Haudry A, Platts AE, Vello E, Hoen DR, Leclercq M, Williamson RJ, Forczek E, Joly-Lopez Z, Steffen JG, Hazzouri KM (2013) An atlas ofover 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nature genetics 45: 891-898
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Heyndrickx KS, Van de Velde J, Wang C, Weigel D, Vandepoele K (2014) A functional and evolutionary perspective on transcriptionfactor binding in Arabidopsis thaliana. Plant Cell 26: 3894-3910
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Hickman R, Van Verk MC, Van Dijken AJH, Mendes MP, Vroegop-Vos IA, Caarls L, Steenbergen M, Van der Nagel I, Wesselink GJ,Jironkin A, Talbot A, Rhodes J, De Vries M, Schuurink RC, Denby K, Pieterse CMJ, Van Wees SCM (2017) Architecture and Dynamics ofthe Jasmonic Acid Gene Regulatory Network. Plant Cell 29: 2086-2105
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from
Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Ito S, Song YH, Josephson-Day AR, Miller RJ, Breton G, Olmstead RG, Imaizumi T (2012) FLOWERING BHLH transcriptional activatorscontrol expression of the photoperiodic flowering regulator CONSTANS in Arabidopsis. Proc Natl Acad Sci U S A 109: 3582-3587
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Jayaram N, Usvyat D, AC RM (2016) Evaluating tools for transcription factor binding site prediction. BMC BioinformaticsPubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Kim D, Cho YH, Ryu H, Kim Y, Kim TH, Hwang I (2013) BLH1 and KNAT3 modulate ABA responses during germination and earlyseedling development in Arabidopsis. Plant J 75: 755-766
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Kloth KJ, Wiegers GL, Busscher-Lange J, van Haarst JC, Kruijer W, Bouwmeester HJ, Dicke M, Jongsma MA (2016) AtWRKY22promotes susceptibility to aphids and modulates salicylic acid and jasmonic acid signalling. J Exp Bot 67: 3383-3396
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Korhonen J, Martinmaki P, Pizzi C, Rastas P, Ukkonen E (2009) MOODS: fast search for position weight matrix matches in DNAsequences. Bioinformatics 25: 3181-3182
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Kulkarni SR, Vaneechoutte D, Van de Velde J, Vandepoele K (2018) TF2Network: predicting transcription factor regulators and generegulatory networks in Arabidopsis using publicly available binding site information. Nucleic Acids Res 46: e31
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M,Karthikeyan AS, Lee CH, Nelson WD, Ploetz L, Singh S, Wensel A, Huala E (2012) The Arabidopsis Information Resource (TAIR):improved gene annotation and new tools. Nucleic Acids Res 40: D1202-1210
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Lincoln C, Long J, Yamaguchi J, Serikawa K, Hake S (1994) A knotted1-like homeobox gene in Arabidopsis is expressed in thevegetative meristem and dramatically alters leaf morphology when overexpressed in transgenic plants. Plant Cell 6: 1859-1876
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Liu J, Jung C, Xu J, Wang H, Deng S, Bernad L, Arenas-Huertero C, Chua NH (2012) Genome-wide analysis uncovers regulation of longintergenic noncoding RNAs in Arabidopsis. Plant Cell 24: 4333-4345
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Lu Z, Hofmeister BT, Vollmers C, DuBois RM, Schmitz RJ (2017) Combining ATAC-seq with nuclei sorting for discovery of cis-regulatory regions in plant genomes. Nucleic Acids Res 45: e41
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Ma S, Bohnert HJ (2007) Integration of Arabidopsis thaliana stress-related transcript profiles, promoter structures, and cell-specificexpression. Genome Biol 8: R49
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Maher KA, Bajic M, Kajala K, Reynoso M, Pauluzzi G, West DA, Zumstein K, Woodhouse M, Bubb K, Dorrity MW, Queitsch C, Bailey-Serres J, Sinha N, Brady SM, Deal RB (2018) Profiling of Accessible Chromatin Regions across Multiple Plant Species and Cell TypesReveals Common Gene Regulatory Principles and New Control Modules. Plant Cell 30: 15-36
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Mathelier A, Fornes O, Arenillas DJ, Chen C-y, Denay G, Lee J, Shi W, Shyr C, Tan G, Worsley-Hunt R (2015) JASPAR 2016: a majorexpansion and update of the open-access database of transcription factor binding profiles. Nucleic acids research: gkv1176
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Michael TP, Mockler TC, Breton G, McEntee C, Byer A, Trout JD, Hazen SP, Shen R, Priest HD, Sullivan CM, Givan SA, Yanovsky M,Hong F, Kay SA, Chory J (2008) Network discovery pipeline elucidates conserved time-of-day-specific cis-regulatory modules. PLoSGenet 4: e14
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Muino JM, de Bruijn S, Pajoro A, Geuten K, Vingron M, Angenent GC, Kaufmann K (2016) Evolution of DNA-Binding Sites of a FloralMaster Regulatory Transcription Factor. Mol Biol Evol 33: 185-200 www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from
Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Nagel DH, Pruneda-Paz JL, Kay SA (2014) FBH1 affects warm temperature responses in the Arabidopsis circadian clock. Proc Natl AcadSci U S A 111: 14595-14600
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Nakaminami K, Hill K, Perry SE, Sentoku N, Long JA, Karlson DT (2009) Arabidopsis cold shock domain proteins: relationships to floraland silique development. J Exp Bot 60: 1047-1062
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
O'Malley RC, Huang S-sC, Song L, Lewsey MG, Bartlett A, Nery JR, Galli M, Gallavotti A, Ecker JR (2016) Cistrome and EpicistromeFeatures Shape the Regulatory DNA Landscape. Cell 165: 1280-1292
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Pérez-Rodríguez P, Riano-Pachon DM, Corrêa LGG, Rensing SA, Kersten B, Mueller-Roeber B (2009) PlnTFDB: updated content andnew features of the plant transcription factor database. Nucleic acids research: gkp805
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Quinlan AR (2014) BEDTools: The Swiss-Army Tool for Genome Feature Analysis. Curr Protoc Bioinformatics 47: 11 12 11-34Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Rosloski SM, Jali SS, Balasubramanian S, Weigel D, Grbic V (2010) Natural diversity in flowering responses of Arabidopsis thalianacaused by variation in a tandem gene array. Genetics 186: 263-276
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Sakuma Y, Maruyama K, Osakabe Y, Qin F, Seki M, Shinozaki K, Yamaguchi-Shinozaki K (2006) Functional analysis of an Arabidopsistranscription factor, DREB2A, involved in drought-responsive gene expression. Plant Cell 18: 1292-1309
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Scarpeci TE, Zanor MI, Mueller-Roeber B, Valle EM (2013) Overexpression of AtWRKY30 enhances abiotic stress tolerance duringearly growth stages in Arabidopsis thaliana. Plant Mol Biol 83: 265-277
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Schuster C, Gaillochet C, Medzihradszky A, Busch W, Daum G, Krebs M, Kehle A, Lohmann JU (2014) A regulatory framework for shootstem cell control integrating metabolic, transcriptional, and phytohormone signals. Dev Cell 28: 438-449
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Sham A, Moustafa K, Al-Shamisi S, Alyan S, Iratni R, AbuQamar S (2017) Microarray analysis of Arabidopsis WRKY33 mutants inresponse to the necrotrophic fungus Botrytis cinerea. PLoS One 12: e0172343
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Sijacic P, Bajic M, McKinney EC, Meagher RB, Deal RB (2018) Changes in chromatin accessibility between Arabidopsis stem cells andmesophyll cells illuminate cell type-specific transcription factor networks. Plant J 94: 215-231
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Song L, Huang S-sC, Wise A, Castanon R, Nery JR, Chen H, Watanabe M, Thomas J, Bar-Joseph Z, Ecker JR (2016) A transcriptionfactor hierarchy defines an environmental stress response network. Science 354: aag1550
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Sparks EE, Drapek C, Gaudinier A, Li S, Ansariola M, Shen N, Hennacy JH, Zhang J, Turco G, Petricka JJ, Foret J, Hartemink AJ,Gordan R, Megraw M, Brady SM, Benfey PN (2016) Establishment of Expression in the SHORTROOT-SCARECROW TranscriptionalCascade through Opposing Activities of Both Activators and Repressors. Dev Cell 39: 585-596
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Staneloni RJ, Rodriguez-Batiller MJ, Legisa D, Scarpin MR, Agalou A, Cerdan PD, Meijer AH, Ouwerkerk PB, Casal JJ (2009) Bell-likehomeodomain selectively regulates the high-irradiance response of phytochrome A. Proc Natl Acad Sci U S A 106: 13624-13629
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Sullivan AM, Arsovski AA, Lempe J, Bubb KL, Weirauch MT, Sabo PJ, Sandstrom R, Thurman RE, Neph S, Reynolds AP (2014) Mappingand dynamics of regulatory DNA and transcription factor networks in A. thaliana. Cell reports 8: 2015-2030 www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from
Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Sun J, Jiang H, Xu Y, Li H, Wu X, Xie Q, Li C (2007) The CCCH-type zinc finger proteins AtSZF1 and AtSZF2 regulate salt stressresponses in Arabidopsis. Plant Cell Physiol 48: 1148-1158
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Tang W, Ji Q, Huang Y, Jiang Z, Bao M, Wang H, Lin R (2013) FAR-RED ELONGATED HYPOCOTYL3 and FAR-RED IMPAIREDRESPONSE1 transcription factors integrate light and abscisic acid signaling in Arabidopsis. Plant Physiol 163: 857-866
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Turatsinze JV, Thomas-Chollier M, Defrance M, van Helden J (2008) Using RSAT to scan genome sequences for transcription factorbinding sites and cis-regulatory modules. Nat Protoc 3: 1578-1588
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Van de Velde J, Heyndrickx KS, Vandepoele K (2014) Inference of transcriptional networks in Arabidopsis through conservednoncoding sequence analysis. The Plant Cell 26: 2729-2745
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Vandepoele K, Casneuf T, Van de Peer Y (2006) Identification of novel regulatory modules in dicotyledonous plants using expressiondata and comparative genomics. Genome Biol 7: R103
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Vandepoele K, Quimbaya M, Casneuf T, De Veylder L, Van de Peer Y (2009) Unraveling transcriptional control in Arabidopsis using cis-regulatory elements and coexpression networks. Plant Physiol 150: 535-546
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Vanderauwera S, Vandenbroucke K, Inze A, van de Cotte B, Muhlenbock P, De Rycke R, Naouar N, Van Gaever T, Van Montagu MC,Van Breusegem F (2012) AtWRKY15 perturbation abolishes the mitochondrial stress response that steers osmotic stress tolerance inArabidopsis. Proc Natl Acad Sci U S A 109: 20113-20118
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Varala K, Marshall-Colon A, Cirrone J, Brooks MD, Pasquino AV, Leran S, Mittal S, Rock TM, Edwards MB, Kim GJ, Ruffel S, McCombieWR, Shasha D, Coruzzi GM (2018) Temporal transcriptional logic of dynamic regulatory networks underlying nitrogen signaling and usein plants. Proc Natl Acad Sci U S A 115: 6494-6499
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Wang H, Deng XW (2002) Arabidopsis FHY3 defines a key phytochrome A signaling component directly interacting with its homologouspartner FAR1. EMBO J 21: 1339-1349
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Wang ZY, Tobin EM (1998) Constitutive expression of the CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) gene disrupts circadian rhythmsand suppresses its own expression. Cell 93: 1207-1217
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Weirauch MT, Yang A, Albu M, Cote AG, Montenegro-Montero A, Drewe P, Najafabadi HS, Lambert SA, Mann I, Cook K, Zheng H, GoityA, van Bakel H, Lozano JC, Galli M, Lewsey MG, Huang E, Mukherjee T, Chen X, Reece-Hoyes JS, Govindarajan S, Shaulsky G,Walhout AJM, Bouget FY, Ratsch G, Larrondo LF, Ecker JR, Hughes TR (2014) Determination and inference of eukaryotic transcriptionfactor sequence specificity. Cell 158: 1431-1443
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Werner JD, Borevitz JO, Warthmann N, Trainer GT, Ecker JR, Chory J, Weigel D (2005) Quantitative trait locus mapping and DNA arrayhybridization identify an FLM deletion as a cause for natural flowering-time variation. Proc Natl Acad Sci U S A 102: 2460-2465
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Wilmoth JC, Wang S, Tiwari SB, Joshi AD, Hagen G, Guilfoyle TJ, Alonso JM, Ecker JR, Reed JW (2005) NPH4/ARF7 and ARF19 promoteleaf expansion and auxin-induced lateral root formation. Plant J 43: 118-130
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Xie Z, Li D, Wang L, Sack FD, Grotewold E (2010) Role of the stomatal development regulators FLP/MYB88 in abiotic stress responses.Plant J 64: 731-739
Pubmed: Author and Title www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Xu M, Hu T, Zhao J, Park MY, Earley KW, Wu G, Yang L, Poethig RS (2016) Developmental Functions of miR156-Regulated SQUAMOSAPROMOTER BINDING PROTEIN-LIKE (SPL) Genes in Arabidopsis thaliana. PLoS Genet 12: e1006263
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Yu CP, Chen SC, Chang YM, Liu WY, Lin HH, Lin JJ, Chen HJ, Lu YJ, Wu YH, Lu MY, Lu CH, Shih AC, Ku MS, Shiu SH, Wu SH, Li WH(2015) Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognatetranscription factors. Proc Natl Acad Sci U S A 112: E2477-2486
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Yu H, Xu Y, Tan EL, Kumar PP (2002) AGAMOUS-LIKE 24, a dosage-dependent mediator of the flowering signals. Proc Natl Acad Sci US A 99: 16336-16341
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
Zhang W, Zhang T, Wu Y, Jiang J (2012) Genome-wide identification of regulatory DNA elements and protein-binding footprints usingsignatures of open chromatin in Arabidopsis. The Plant Cell 24: 2719-2731
Pubmed: Author and TitleGoogle Scholar: Author Only Title Only Author and Title
www.plantphysiol.orgon July 9, 2020 - Published by Downloaded from Copyright © 2019 American Society of Plant Biologists. All rights reserved.