Article
Analysis of Body-wide Unfractionated Tissue Data to Identify a Core Human Endothelial Transcriptome
Graphical Abstract
Highlights
• RNA-seq of 124 unfractionated tissue samples from 32 different organs was analyzed
• Human pan-endothelial enriched transcripts across vascular beds were identified
• Relative expression profile was maintained in early passage cultured cells
• Analysis method is applicable to profile other body-wide expressed cell types
Butler et al., 2016, Cell Systems 3, 1–15
September 28, 2016 © 2016 Elsevier Inc.
http://dx.doi.org/10.1016/j.cels.2016.08.001
Authors
Lynn Marie Butler,
Bjorn Mikael Hallstrom,
Linn Fagerberg, Fredrik Ponten,
Mathias Uhlen, Thomas Renne,
Jacob Odeberg
In Brief
Butler et al. use RNA-seq data from 124
unfractionated tissue samples from 32
human organs to identify known and
previously unknown endothelial-specific
transcripts and provide a searchable
resource that can be used to determine
the "endothelial enrichment score" of any
human protein coding gene. In addition to
identifying potential vascular drug targets
or endothelial biomarkers, this study
provides a framework to determine the
specific transcriptome profiles of other
cell types distributed across multiple
organs.
Please cite this article in press as: Butler et al., Analysis of Body-wide Unfractionated Tissue Data to Identify a Core Human Endothelial Transcriptome, Cell Systems (2016), http://dx.doi.org/10.1016/j.cels.2016.08.001
Cell Systems
Article
Analysis of Body-wide Unfractionated Tissue Data to Identify a Core Human Endothelial Transcriptome
Lynn Marie Butler,1,2,6,* Bjorn Mikael Hallstrom,3 Linn Fagerberg,3 Fredrik Ponten,4 Mathias Uhlen,3 Thomas Renne,1,2 and Jacob Odeberg3,5
1Institute for Clinical Chemistry and Laboratory Medicine, University Medical Centre Hamburg-Eppendorf, 20246 Hamburg, Germany
2Clinical Chemistry and Blood Coagulation, Department of Molecular Medicine and Surgery, Karolinska Institute, 171 76 Stockholm, Sweden
3Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), 171 21 Stockholm, Sweden
4Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
5Coagulation Unit, Centre for Hematology, Karolinska University Hospital, 171 76 Stockholm, Sweden
6Lead Contact
*Correspondence: [email protected]
http://dx.doi.org/10.1016/j.cels.2016.08.001
SUMMARY
Endothelial cells line blood vessels and regulate hemostasis, inflammation, and blood pressure. Proteins critical for these specialized functions tend to be predominantly expressed in endothelial cells across vascular beds. Here, we present a systems approach to identify a panel of human endothelial-enriched genes using global, body-wide transcriptomics data from 124 tissue samples from 32 organs. We identified known and unknown endothelial-enriched gene transcripts and used antibody-based profiling to confirm expression across vascular beds. The majority of identified transcripts could be detected in cultured endothelial cells from various vascular beds, and we observed maintenance of relative expression in early passage cells. In summary, we describe a widely applicable method to determine cell-type-specific transcriptome profiles in a whole-organism context, based on differential abundance across tissues. We identify potential vascular drug targets or endothelial biomarkers and highlight candidates for functional studies to increase understanding of the endothelium in health and disease.
INTRODUCTION
Endothelial cells (ECs) line the inside of all vessels and have a
critical role in the regulation of hemostasis, inflammation, de-
fense against blood borne pathogens, vascular tone, angiogen-
esis, and the transport of molecules and nutrients to and from the
blood stream (Pober and Sessa, 2007; Vita, 2011). The involve-
ment of ECs in multiple disease states, such as coronary artery
disease, venous thromboembolism, edema, and vasculitis is
well recognized (Ganz and Hsue, 2013; Mackman, 2012; Steyers
and Miller, 2014; Tabas et al., 2015). ECs from different vascular
beds can vary in their gene expression profile, reflecting organ-
specific functions and even morphologically similar ECs can
show differences in gene expression (Aird, 2012; Civelek et al.,
2011; Nolan et al., 2013; Seaman et al., 2007). Known genes
with largely EC-restricted expression across tissue beds are
important for vascular stability (Du Toit, 2015) or cell-specific
functions, for example, in inflammatory processes (Ley, 2003)
or hemostasis (Lenting et al., 2015).
Recently there have been significant technological advance-
ments in large-scale analysis of cellular gene expression profiles
(Spies and Ciaudo, 2015). As ECs are a minority cell type in a
given organ it is challenging to determine EC gene expression
profiles from averaged transcriptome analysis of whole-tissue
samples. Methodological advances, such as laser cell capture
(Cheng et al., 2013), enzymatic, or manual dissection and cell
sorting (Berger et al., 2012; Nolan et al., 2013) and immuno-pu-
rification (Wang and Navin, 2015) have allowed the isolation of
ECs from tissue prior to analysis. However, such processing
and/or subsequent in vitro culture can trigger changes in gene
expression, due to the loss of the organ-specific microenviron-
ment (Amaya et al., 2015; Balda and Matter, 2009; Durr et al.,
2004).
Here, as an illustrative application of the Human Protein
Atlas resource (Uhlen et al., 2015), we used a systems-based
approach to define the physiological human in vivo pan
endothelial-enriched gene expression profile using whole-
transcriptome analysis of unfractionated tissue samples. We
identify a panel of human pan EC-enriched transcripts and
replicate our findings using the same analysis protocol on
Genotype-Tissue Expression (GTEx) datasets. 118 of the
identified transcripts encode for novel or uncharacterized
EC proteins. We also provide a searchable resource that
can be used to determine the extent of pan endothelial spec-
ificity of any gene.
The identification of previously unknown EC-enriched genes
provides new vascular drug targets or biomarker candidates
and presents candidates for future studies to further increase
our understanding of EC function in health and disease.
RESULTS
We performed RNA sequencing (RNA-seq) tissue transcript
profiling of 124 samples collected from 32 human organs (n =
2–7 samples/organ) as part of the Human Protein Atlas Project
Figure 1. CLEC14A, vWF, and CD34 Transcript Quantities In Vivo Reflect the Degree of Tissue Vascularization
(A) Mean FPKM values for c-type lectin domain family 14, member A (CLEC14A), von Willebrand factor (vWF), and CD34 (CD34) transcripts in bone marrow, pancreas, ovary, tonsil, salivary gland, appendix, spleen, thyroid gland, gallbladder, urinary bladder, heart muscle, and lung; n = 2–5 individual samples/organ (see Table S1). Data are mean ± SEM. Corresponding IHC images stained with primary antibodies against CLEC14A, vWF, and CD34 protein are shown on tissue sections from ovary, appendix, gall bladder, and lung.
(B) Scatterplots showing correlations between mean CLEC14A, vWF, and CD34 FPKM values and the estimated mean EC percentage in the sequenced sample, determined by histological examination prior to processing. Tissue type represented by each symbol corresponds to that indicated on the x axis of (A). Pearson correlation and corresponding p values are shown in the top left of each scatterplot.
See also Figure S1A. Scale bars, 100 μm.
(HPA; http://www.proteinatlas.org/) (Uhlen et al., 2015). Tissue
cryosections from selected organs (bone marrow, pancreas,
ovary, tonsil, salivary gland, appendix, spleen, thyroid gland,
gall bladder, urinary bladder, heart muscle, and lung) were
H&E stained to morphologically determine the percentage of
ECs, prior to RNA processing and sequencing from identical
samples (see Table S1 and Experimental Procedures for further
details). Fragments per kilobase of exon model per million map-
ped reads (FPKMs) values were calculated for 20,073 mapped
protein-coding genes in all 124 samples.
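The FPKM unit used throughout can be illustrated with a minimal sketch that follows only the textbook definition (fragments, scaled per kilobase of exon model and per million mapped fragments). The function name and the input numbers are hypothetical, and the quantification software actually used for the HPA data is not described in this section and may estimate these quantities differently.

```python
def fpkm(fragment_count: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon model per million mapped fragments.

    FPKM = fragments / (exon length in kb * total mapped fragments in millions)
         = fragments * 1e9 / (exon length in bp * total mapped fragments)
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and total mapped fragments must be positive")
    return fragment_count * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical numbers for illustration only (not taken from the study).
value = fpkm(fragment_count=500, exon_length_bp=2_000, total_mapped_fragments=25_000_000)
print(f"FPKM = {value:.1f}")
```

Because the value is normalized for both gene length and sequencing depth, it can be compared across genes and across samples, which is what the cross-tissue comparisons in the following sections rely on.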
CLEC14A, vWF, and CD34 Transcript Quantities Reflect Levels of Tissue Vascularization
We selected three transcripts that encode for proteins that are
predominantly expressed in ECs across different vascular
beds; c-type lectin domain family 14, member A (CLEC14A)
(Rho et al., 2011), von Willebrand factor (vWF) (Zanetta et al.,
2000), and CD34 (CD34) (Muller et al.,
2002; Pusztaszeri et al., 2006). vWF has
long been acknowledged as an EC
marker in vivo (Zanetta et al., 2000), as
has CD34, although both reportedly
show some variation between tissue
beds and vessel types (Muller et al.,
2002; Pusztaszeri et al., 2006). CLEC14A
was originally described as an EC protein
in murine models (Rho et al., 2011), and it
was later described as a tumor angiogen-
esis marker with limited expression in
selected normal human tissues (Mura
et al., 2012; Noy et al., 2015). However,
immunohistochemistry (IHC) confirmed
enriched EC expression of all three
across vascular beds (examples shown
in Figure 1A). Mean FPKM values of
CLEC14A, vWF, and CD34 varied from <1–56, <1–110, and
4–166, respectively, across the 32 organs (Figure 1A). Although
absolute FPKM values differed, the relative expression of the
EC reference transcripts was strikingly similar, with highest
levels detected in highly vascularized organs, such as the heart,
lung, placenta, and adipose tissue and lowest in less vascular-
ized organs, such as pancreas and ovary (organs with accompa-
nying percentage EC data in Figure 1A, those without in Fig-
ure S1A.i). CLEC14A, vWF, and CD34 FPKM values were
strongly correlated with each other across individual samples
(correlation >0.74, p values <0.001) (Figure S1A.ii), supporting
the concept that combined CLEC14A, vWF, and CD34 expres-
sion provides a surrogate measurement for the relative degree
of tissue vascularization in vivo. Consistent with the expression
data, IHC revealed a high vascular content in tissues with high
CLEC14A, vWF, and CD34 FPKM values (Figure 1A). CLEC14A,
vWF, and CD34 expression correlated with percentage of EC in
Figure 2. Correlation Values between the Reference Endothelial Cell Transcripts CLEC14A, vWF, CD34 and Proteins Described as EC Enriched in the Literature
(A) RNA-seq data from 124 individual samples from 32 different human tissue types were used to generate Spearman pairwise correlation values between the EC reference transcripts CLEC14A, vWF, and CD34 and transcripts reported in the literature as EC enriched.
(B) IHC images of salivary gland, gallbladder, and lung tissue stained for proteins encoded by HSPA12B, PECAM1, ENG, ESM1, LIPG, and EDF1. Corresponding scatterplots (right) show the correlation between mean FPKM values and mean EC percentage in selected sequenced tissue samples. Tissue type represented by each symbol corresponds to that indicated on the x axis of Figure 1A. Pearson correlations and corresponding p values are shown for each scatterplot. Scale bars, 50 μm.
the corresponding sequenced tissue samples (correlation 0.82,
p value 0.001; correlation 0.90, p value <0.0001 and correlation
0.80 p value 0.002, respectively) (Figure 1B).
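The comparison between mean FPKM and histologically estimated EC percentage is a Pearson correlation across organs (Figure 1B legend). A minimal sketch of that calculation follows; the data values are hypothetical and are not taken from Table S1, and the p value calculation is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, one entry per organ (illustration only).
ec_percentage = [1.0, 2.0, 4.0, 6.0, 10.0, 15.0]
mean_fpkm_vwf = [3.0, 9.0, 20.0, 28.0, 55.0, 80.0]

r = pearson(ec_percentage, mean_fpkm_vwf)
print(f"Pearson r = {r:.2f}")
```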
Cross-Tissue Correlation Analysis Can Be Used to Identify EC-Enriched Transcripts
We performed a bioinformatics analysis
of the RNA-seq tissue transcript profiling
data across the 32 organ types to pro-
duce correlation coefficient values be-
tween CLEC14A, vWF, and CD34 FPKM
values and those of the other 20,073
mapped protein-coding genes. A high
correlation value with all three EC refer-
ence genes should indicate EC-enriched
expression of the gene(s) in question
across tissue types. To test this method
for identification of EC-enriched tran-
scripts, we generated a list of 26 genes
widely considered as EC enriched, based
on published data (Ballabio et al., 2004;
Bernat et al., 2006; Ho et al., 2003; Humi-
niecki and Bicknell, 2000; Jaye et al.,
1999; Korhonen et al., 1995; Steagall
et al., 2006) and analyzed the correlation
between the FPKM values for these tran-
scripts and CLEC14A, vWF, and CD34
(Figure 2A). 20/26 selected genes had a
mean correlation coefficient >0.5 with
CLEC14A, vWF, and CD34 (15/26 had a
correlation >0.6). However, FPKM values
for endothelial-specific molecule (ESM1),
endothelial lipase (LIPG), and endothelial
differentiation-related factor 1 (EDF1)
failed to correlate with CLEC14A, vWF,
or CD34 FPKM values (correlation 0.04,
−0.02, and −0.09, respectively), suggesting
misclassification of these genes
as pan EC enriched. Consistent with this hypothesis, IHC for
transcripts with high-correlation coefficients, e.g., HSPA12B,
PECAM1, and ENG (mean correlation 0.73, p value <0.001,
0.67 p value <0.001 and 0.59, p value <0.001, respectively)
confirmed EC-enriched expression, while IHC for ESM1, LIPG,
and EDF1 did not (selected organs representing low [salivary
gland], medium [gallbladder], and high [lung] EC percentage by
histology are shown as representative examples in Figure 2B).
Furthermore, mean FPKM values for HSPA12B, PECAM1, and
ENG showed a correlation with the estimated mean percentage
of ECs in bone marrow, pancreas, ovary, tonsil, salivary gland,
appendix, spleen, thyroid gland, gall bladder, urinary bladder,
heart muscle, and lung (correlation 0.60, p value 0.04, 0.53 p
value 0.07, 0.75, p value 0.005, respectively), while such correla-
tion was absent for ESM1, LIPG, and EDF1 (correlation 0.43, p
value 0.16; correlation 0.23, p value 0.47 and 0.05, p value
0.86, respectively) (Figure 2B, right). Based on this correlation
data, we defined "EC-enriched genes" as those that had significant
mean correlation coefficients >0.5 with the EC reference transcripts
CLEC14A, vWF, and CD34.
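The core of the screen, a mean correlation between each candidate transcript and the three EC reference transcripts across samples, can be sketched as follows. Spearman correlation is used because the Figure 2A legend names it; the gene names GENE_X and GENE_Y and all FPKM values are hypothetical, and the significance testing that the definition also requires is not shown.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_reference_correlation(expression, gene, references=("CLEC14A", "VWF", "CD34")):
    """Mean Spearman correlation of one gene with the reference transcripts."""
    return sum(spearman(expression[gene], expression[ref]) for ref in references) / len(references)


# Hypothetical FPKM values across six samples (illustration only).
expression = {
    "CLEC14A": [2, 10, 25, 40, 5, 18],
    "VWF": [5, 20, 60, 90, 8, 35],
    "CD34": [8, 30, 70, 120, 12, 50],
    "GENE_X": [1, 6, 14, 22, 3, 9],    # tracks the reference transcripts
    "GENE_Y": [50, 3, 9, 12, 40, 7],   # does not track them
}

for gene in ("GENE_X", "GENE_Y"):
    score = mean_reference_correlation(expression, gene)
    print(gene, round(score, 2), "candidate" if score > 0.5 else "not a candidate")
```

Averaging over three reference transcripts rather than relying on one is what makes the score less sensitive to the non-endothelial expression of any single reference, a point the following section examines.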
Exclusion of Possible False Positives Due to the Presence of Other Cell Types
The EC reference transcripts CD34 and vWF are also expressed
in hematopoietic stem cells (Satterthwaite et al., 1992) and
platelets (Kanaji et al., 2012; Schick et al., 1997), respectively,
raising the concern that transcripts from vasculature associated
hematopoietic cells could be incorrectly classified as EC en-
riched. Protein tyrosine phosphatase, receptor type, C (PTPRC)
(commonly known as CD45), a differentiated hematopoietic cell
marker, was predominantly expressed in the lymph node, tonsil,
appendix, spleen, and bone marrow (Figure S1B.i) and the
platelet protein integrin, alpha 2b (platelet glycoprotein IIb of
IIb/IIIa complex, antigen CD41) (ITGA2B) was expressed mainly
in bone marrow (Figure S1C.i). PTPRC and ITGA2B showed no
significant correlation with the EC reference seeds (correlation
with CLEC14A, vWF, and CD34: 0.01, 0.11, −0.11 and 0.11,
0.19, 0.25, respectively) (Figures S2A and S2B), arguing against
potential misclassification of transcripts expressed by circu-
lating blood cells as EC enriched. As vascular smooth muscle
cells (SMCs) surround vessels, we assessed whether transcripts
from this cell type could be incorrectly classified as EC enriched.
The SMC marker protein myosin, heavy chain 11, smooth mus-
cle (MYH11) was most highly expressed in smooth muscle tissue
and esophagus (Figure S1D.i). MYH11 expression did show a
significant, albeit weak, correlation with the three EC reference
transcripts (correlation CLEC14A, vWF, and CD34: 0.41, 0.36,
0.41, respectively) (Figure S2C), indicating further analysis
was required to determine whether any SMC transcripts were
falsely annotated as EC-enriched transcripts. We found no asso-
ciation between the mean percentage of ECs in bone marrow,
pancreas, ovary, tonsil, salivary gland, appendix, spleen, thyroid
gland, gall bladder, urinary bladder, heart muscle, and lung
and the mean FPKM value for PTPRC, ITGA2B, or MYH11 (cor-
relation −0.30, −0.29, and −0.05, respectively; Figures S1B.ii,
S1C.ii, and S1D.ii).
Sensitivity and Specificity Analysis
To test the sensitivity and specificity of our method for identifica-
tion of EC-enriched transcripts and to determine optimal anal-
ysis criteria, we compared correlation coefficients between the
EC reference genes, CLEC14A, vWF, and CD34, and four sets
of transcripts categorized as: (1) "previously known EC enriched" (Ballabio et al., 2004; Bernat et al., 2006; Ho et al., 2003; Huminiecki and Bicknell, 2000; Jaye et al., 1999; Korhonen et al., 1995) (as featured in Figure 2A, with the exclusion of ESM1, LIPG, and EDF1 due to lack of evidence of EC expression), (2) "non-EC expressed" (no expression in cultured EC, no evidence of EC staining in vivo by IHC, and expression in at least 20 of the 32 organs sequenced; see Table S2, tab 2 for details), (3) "smooth muscle cell (SMC) enriched" (Conley, 2001; Dreiza et al., 2010; Long et al., 2009; Miwa et al., 1991; Rensen et al., 2007; Wang et al., 2003; Yamawaki et al., 2001), or (4) "macrophage enriched" (East and Isacke, 2002; Fabriek et al., 2009; Kaufmann et al., 2001; Kunjathoor et al., 2002; Liang and Tedder, 2001; Murray and Wynn, 2011; Varchetta et al., 2012) (Table S2, tabs 1–4: column A). 15/23 (65%) of the previously known EC-enriched transcripts had mean correlation values >0.6 with our EC reference transcripts, which increased to 20/23 (87%) when the cutoff point was lowered to ≥0.5 (Table S2, tab 1, section A; Figure S3A.i). In contrast, all 50 "non-EC transcripts" had mean correlation values <0.3 with the EC reference transcripts (mean −0.01, SD 0.19) (Table S2, tab 2, section A; Figure S3A.ii). 9/12 (75%) "SMC-enriched" transcripts had correlation values <0.5 with the EC reference transcripts (mean 0.40, SD 0.06), but 3/12 (25%) had correlation values >0.5 (mean 0.52, SD 0.01), indicating a 25% rate of false classification of SMC genes as EC enriched (Table S2, tab 3, section A; Figure S3A.iii). All "macrophage-enriched" transcripts had correlation values <0.38 with the EC reference transcripts (mean 0.22, SD 0.11) (Table S2, tab 4, section A; Figure S3A.iv). Overall 67/70 (96%) of the non EC-enriched transcripts had a correlation coefficient with the EC reference transcripts of <0.5, and the three others were all from the SMC-enriched category. To determine whether such false positives could be identified, we performed an additional analysis to measure mean correlation coefficient values between three selected SMC reference transcripts, Myosin, Heavy Chain 11, Smooth Muscle (MYH11), Myosin Light Chain Kinase (MYLK), and Actin, Alpha 2, Smooth Muscle, Aorta (ACTA2), and those in the "previously known EC-enriched" and the SMC-enriched test set (Table S2, tabs 1 and 3, section C). The previously known EC-enriched transcripts had higher mean correlation values with the EC reference transcripts, CLEC14A, vWF, and CD34, than with the SMC reference transcripts, MYH11, MYLK, and ACTA2 (mean 0.64 versus 0.33, respectively) (Table S2, tab 1, section A versus section C), while the reverse was true for the SMC-enriched transcripts (mean 0.76 with the SMC reference transcripts versus 0.43 with the EC reference transcripts) (Table S2, tab 3, section A versus section C). Based on these analyses, we defined EC-enriched genes as those that had statistically significant mean correlation coefficients ≥0.5 with the EC reference transcripts CLEC14A, vWF, and CD34, with, on a transcript-to-transcript basis, lower correlation values with the SMC reference transcripts, MYH11, MYLK, and ACTA2. These criteria minimized the risk of false negatives and positives. Correlation values of each identified gene with SMC reference transcripts are provided in Table S3, tab 1, columns AC–AE.
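The two-part selection rule can be written as a small predicate. This is a sketch under stated assumptions: the "transcript-to-transcript" comparison is interpreted here as comparing the mean correlation with the EC references against the mean correlation with the SMC references, the statistical significance requirement is left out, and the gene names and correlation values are hypothetical.

```python
from statistics import mean

EC_CUTOFF = 0.5  # mean correlation threshold stated in the text


def is_ec_enriched(ec_corrs, smc_corrs, cutoff=EC_CUTOFF):
    """Apply the two-part rule to one transcript.

    ec_corrs:  correlations with CLEC14A, vWF, and CD34
    smc_corrs: correlations with MYH11, MYLK, and ACTA2
    Assumption: means are compared; the paper may compare values differently.
    """
    ec_mean = mean(ec_corrs)
    smc_mean = mean(smc_corrs)
    return ec_mean >= cutoff and ec_mean > smc_mean


# Hypothetical correlation values (illustration only).
candidates = {
    "GENE_A": ([0.72, 0.70, 0.74], [0.30, 0.35, 0.28]),  # EC-like
    "GENE_B": ([0.52, 0.51, 0.53], [0.75, 0.80, 0.73]),  # passes cutoff, but SMC-like
    "GENE_C": ([0.20, 0.25, 0.15], [0.10, 0.05, 0.12]),  # below cutoff
}

calls = {gene: is_ec_enriched(ec, smc) for gene, (ec, smc) in candidates.items()}
print(calls)
```

GENE_B illustrates the false-positive case discussed above: it clears the 0.5 cutoff against the EC references but is removed because it correlates more strongly with the SMC references.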
Identification of EC-Enriched Genes
From 20,073 mapped protein-coding genes, 481 transcripts
had mean correlation coefficients ≥0.5 with the EC reference
Figure 3. Summary of Endothelial Cell Reference Transcript Correlation Analysis Data
RNA-seq data from 124 individual samples from 32 different human tissue types were used to generate pairwise correlation values between the EC reference
transcripts CLEC14A, vWF, and CD34 and the other 20,073 detectable protein-coding genes.
(A) 234 transcripts were identified as EC enriched and categorized as known (previously reported as EC expressed), unknown (not reported as EC expressed), or
uncharacterized. The ten most highly correlated in each category are displayed (p < 0.001 in all cases).
(B) Scatterplots showing the correlation between mean FPKM values for selected genes from each category and the mean EC percentage in the sequenced
tissue sample, determined by histological examination prior to processing. Tissue type represented by each symbol corresponds to that indicated on the x axis of
Figure 1A. Pearson correlation and corresponding p values are shown in the top left of each scatterplot.
See also Table S3, tab 1.
transcripts CLEC14A, vWF, and CD34. 373/481 (77.5%) had
higher individual correlation coefficients with the EC reference
transcripts compared to the SMC transcripts MYH11, MYLK,
and ACTA2, and the remainder (23%) were excluded (corresponding
approximately to the predicted false positive rate,
25%). Of the 373 retained transcripts, 332 (89.1%) had a Bonferroni corrected p
value <0.05 and FDR <0.0001. Individual and mean correlation
values between CLEC14A, vWF, and CD34 and all detected
protein-coding genes are provided in Table S3, tab 2.
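The Bonferroni step can be illustrated with a short sketch. The number of tests is an assumption: it is set here to the 20,073 mapped protein-coding genes, but the text does not state which count the authors used, and the raw p values are hypothetical. The FDR filter is not shown.

```python
def bonferroni_adjust(p_values, n_tests):
    """Bonferroni correction: multiply each raw p value by the number of tests, cap at 1."""
    if n_tests < len(p_values):
        raise ValueError("n_tests cannot be smaller than the number of p values supplied")
    return [min(1.0, p * n_tests) for p in p_values]


def passes_bonferroni(p_values, n_tests, alpha=0.05):
    """True for each p value whose corrected value is below alpha."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values, n_tests)]


N_TESTS = 20_073  # assumption: one test per mapped protein-coding gene

# Hypothetical raw p values (illustration only).
raw_p = [1e-9, 2e-6, 1e-4]
print(passes_bonferroni(raw_p, N_TESTS))
```

Under this assumption, a raw p value has to fall below roughly 2.5 × 10⁻⁶ (0.05 divided by 20,073) to survive the correction.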
Replication of Results Using GTEx Datasets
To confirm our results with an independent approach, we used
RNA-seq data from the Genotype-Tissue Expression (GTEx) project portal
(http://www.gtexportal.org/home/) (Ardlie et al., 2015) from 24
human organs (4470 samples) (Table S2, tab 6) to replicate
our analysis. We repeated the sensitivity and specificity anal-
ysis, as for the HPA material. 20/22 (91%) of previously known
EC-enriched transcripts had mean correlation values >0.5 with
the EC reference transcripts (mean correlation 0.71, SD 0.16)
(Table S2, tab 1, section B; Figure S3B.i). All 50 non-EC tran-
scripts had mean correlation values <0.5 with the EC reference
transcripts (mean correlation 0.08, SD 0.27) (Table S2, tab 2,
section B; Figure S3B.ii). 10/12 (83%) SMC-enriched tran-
scripts had correlation values <0.5 (mean correlation 0.39, SD
0.07), but 2/12 (17%) had correlation values ≥0.5 (mean correlation
0.51, SD 0.01) (Table S2, tab 3, section B; Figure S3B.iii),
indicating a lower false positive rate than in the HPA material
(17% versus 25%). 1/8 (12.5%) macrophage-enriched tran-
scripts had a correlation value >0.5 (mean correlation 0.43,
SD 0.06) (Table S2, tab 4, section B; Figure S3B.iv), revealing
a higher false positive rate in this category than for the HPA ma-
terial. GTEx and HPA values for all test transcripts highly corre-
lated with each other (Figure S3C) (correlation 0.84 p value
<0.0001). Based on this analysis, we defined the requirement
for replication as a mean correlation coefficient ≥0.5 with the
EC reference transcripts CLEC14A, vWF, and CD34 in the
GTEx material for each HPA-identified EC-enriched transcript.
234/332 (70%) of the EC-enriched genes determined from the
HPA material were replicated in the GTEx material. This final
list contained 82 of the 100 transcripts most highly correlated
with the EC reference seeds in the GTEx material (Table S3,
tab 3). A summary of the selection protocol is shown in
Figure S3D.
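The replication requirement amounts to keeping only those discovery-set genes whose mean correlation with the EC reference transcripts also reaches the cutoff in the second dataset. A minimal sketch, with hypothetical gene names and values, and with genes missing from the replication dataset treated as not replicated (an assumption):

```python
def replicated_genes(discovery_genes, replication_mean_corr, cutoff=0.5):
    """Keep discovery-set genes whose replication-set mean correlation is >= cutoff.

    Genes absent from the replication data are dropped (assumption).
    """
    return [
        gene
        for gene in discovery_genes
        if replication_mean_corr.get(gene, float("-inf")) >= cutoff
    ]


# Hypothetical inputs (illustration only).
hpa_ec_enriched = ["GENE_A", "GENE_D", "GENE_E", "GENE_F"]
gtex_mean_corr = {"GENE_A": 0.81, "GENE_D": 0.46, "GENE_E": 0.50}

final_list = replicated_genes(hpa_ec_enriched, gtex_mean_corr)
print(final_list)
```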
The 234 transcripts included 116 previously described in an
EC context, 88 that had not been previously been associated
with EC and 30 protein-coding genes on which knowledge is
sparse or entirely absent (Table S3, tab 1, column Q). The
ten most highly correlated genes in each category are detailed
in Figure 3A. Example correlation plots of the transcript FPKM
values versus the mean estimated percentage of EC in seq-
uenced samples are shown for three genes from each category
(correlation range 0.91–0.54, p value range <0.001–0.06) (Figure 3B). Gene ontology (GO) analysis (Ashburner et al., 2000) was performed on the final list of EC-enriched transcripts. The most significant biological process groupings were all related to EC function (vasculature development, blood vessel development, angiogenesis, circulatory system development, cardiovascular system development [corrected p value for all <2.2 × 10⁻³³]), with numerous other endothelial related groupings also identified, e.g., endothelial development, regulation of endothelial cell migration, positive regulation of endothelial cell migration, endothelial cell differentiation, vascular endothelial growth factor signaling pathway, etc. (Table S3, tab 4).
Protein Profiling of Novel EC-Enriched Genes In Vivo
We selected three genes identified as EC enriched from the unknown or uncharacterized category that had varying levels of expression in primary cultured EC in vitro (see following section and Table S3, column AI) for antibody-based protein profiling. EH-domain containing 2 (EHD2; mean correlation 0.72, p value <0.001) and LIM and senescent cell antigen-like domains 2 (LIMS2; mean correlation 0.69, p value <0.001) (Table S3, tab 1 and Figures S2D and S2E, respectively), both from the unknown category, and family with sequence similarity 110, member D (FAM110D; mean correlation 0.65, p value <0.001) (Table S3, tab 1 and Figure S2F), from the uncharacterized category, are predominantly expressed in ECs, e.g., in stomach, skin, cerebral cortex, esophagus, gallbladder, urinary bladder, placenta, breast, nasopharynx, heart, and lung (Figures 4A, 5A, and 6A). EHD2 and LIMS2 FPKM expression levels both paralleled that of vWF (Figures 4B and 5B) (variation by ANOVA for organ type = 72.9%, gene = 0.4% and organ type = 62.5%, gene = 0.18%, respectively). However, higher relative levels of EHD2 versus vWF were observed in liver, kidney, adrenal gland, and ovary (Figure 4B.i, dotted boxes). IHC staining showed that EHD2 was expressed in liver, kidney, and adrenal gland EC, while vWF was largely absent. Ovarian ECs were positive for EHD2 and vWF, with some EHD2 positivity in ovarian stromal cells. Higher relative levels of LIMS2 versus vWF expression were observed in the liver, kidney, small intestine, and prostate (Figure 5B.i, dotted boxes). LIMS2 staining was stronger than vWF in liver, kidney, and small intestine ECs (Figure 5B.i). Prostate ECs were positive for both, with some weak LIMS2 staining in prostate smooth muscle cells (Figure 5B.i). Unlike EHD2 and LIMS2, mean FPKM values for FAM110D were significantly lower than vWF (Figure 6B) (variation by ANOVA for organ type = 32.4%, gene = 19.8%). FAM110D FPKM values, relative to vWF, were highest in the kidney (second dotted box, Figure 6B.i). IHC confirmed FAM110D expression in kidney ECs, but some positive staining was observed outside the EC compartment (Figure 6B.i). EHD2, LIMS2, and FAM110D protein staining was observed in different vessel types (illustrated in heart muscle; Figures 4B.ii, 5B.ii, and 6B.ii, respectively).
Figure 4. EH-Domain Containing 2 Is a Pan Endothelial-Enriched Protein In Vivo
(A) IHC staining of multiple tissue types using a primary antibody targeting EHD2. Scale bars, 100 μm.
(B.i.) Plotted mean FPKM values for von Willebrand factor (vWF) and EHD2 transcripts in 124 individual samples from 32 different human tissue types. Data are represented as mean ± SEM. Corresponding IHC images from liver, kidney, adrenal gland, and ovary (denoted by dotted boxes) are displayed above.
(B.ii.) Staining for EHD2 in (1) veins, (2) venules, and (3) capillaries of the heart muscle. Scale bars, 50 μm.
Data are represented as mean ± SEM. See also Figure S2D.
Expression of In Vivo Pan Endothelial-Enriched Genes in In Vitro Cell-Culture Systems
We performed RNA-seq on first passage human umbilical vein
ECs (HUVECs), freshly isolated from four independent individ-
uals, to analyze the expression of our identified EC-enriched
transcripts on a cellular level. 56% of all mapped protein-coding
genes were expressed (11,292 transcripts with FPKM
≥1). 80% of these transcripts were housekeeping genes that
we have previously reported as widely expressed in all tissues
(Uhlen et al., 2015). Of the 234 EC-enriched transcripts identified
in vivo, 196 were detectable with a cutoff of FPKM ≥1 (Table S3,
tab 1, column AI; Figure 7A). We also analyzed expression of the
234 genes in human ECs isolated from other vascular beds;
bladder microvasculature (HBMEC), iliac artery (HIAEC), saphe-
nous vein (HSVEC), umbilical artery (HUAEC), uterine microvas-
cular (HUtMEC), using microarray expression data (http://www.
ncbi.nlm.nih.gov/geo/). The normalized expression values are
not strictly comparable to the RNA-seq FPKM values from
HUVEC, as they were generated using an alternative technology
platform. However, 161 of the pan EC-enriched transcripts iden-
tified by our analysis were detected in one or more of these EC
types (Table S3, tab 1, columns AM–AQ), while only 20 measured
transcripts were not detectable in any of the cultured cells. 16 of
these belonged to the unknown or uncharacterized category,
which could explain the lack of acknowledgment of these tran-
scripts as endothelial, due to the heavy historical reliance on
in vitro work to characterize EC gene expression profiles. To
determine whether the relative in vivo expression levels of the
234 EC-enriched genes (to each other) were maintained in
cultured ECs, we calculated the mean FPKM for each transcript
across the 124 sequenced human tissues to produce a relative
expression score (Table S3, tab 1, column AH) and analyzed
the correlation between these scores and the respective tran-
script expression level in cultured ECs. We observed a positive
relationship (correlation 0.50, p value <0.0001) between the rela-
tive in vivo expression scores of the known, unknown, and un-
characterized transcripts and the corresponding transcript levels
in cultured HUVEC (Figure 7A.ii). In contrast, correlations were
absent in HBMEC, HIAEC, HSVEC, HUAEC, and HUtMEC (cor-
relation 0.03, 0.01, 0.04, 0.03, and 0.02, respectively). Taken
together, these results show that most pan EC-enriched genes
expressed in vivo can be detected in cultured ECs, and the rela-
tive expression levels are maintained in first passage HUVEC.
GIPC3 and KANK3, representative transcripts from the unknown
and uncharacterized EC-enriched categories that were not detected
in vitro, were confirmed as EC expressed in vivo, e.g., in
stomach, liver, cerebral cortex, and kidney (Figure 7B). This analysis
provides information regarding the suitability of in vitro systems
to study EC-enriched gene expression and function.
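As an illustration of the two quantities used above, the detection cutoff (FPKM ≥1) and the relative expression score (mean FPKM across tissue samples), the following Python sketch shows one way they could be computed. It is a minimal sketch and not the pipeline used in this study: the function names, gene labels, category assignments, and FPKM values are hypothetical and chosen only for illustration.

```python
FPKM_CUTOFF = 1.0  # detection threshold quoted in the text (FPKM >= 1)


def relative_expression_score(tissue_fpkm):
    """Mean FPKM of one transcript across all tissue samples."""
    if not tissue_fpkm:
        raise ValueError("need at least one tissue sample")
    return sum(tissue_fpkm) / len(tissue_fpkm)


def detection_summary(culture_fpkm, category, cutoff=FPKM_CUTOFF):
    """Count (detected, total) transcripts per category in cultured cells."""
    summary = {}
    for gene, fpkm in culture_fpkm.items():
        detected, total = summary.get(category[gene], (0, 0))
        summary[category[gene]] = (detected + int(fpkm >= cutoff), total + 1)
    return summary


# Hypothetical toy data: gene labels and FPKM values are invented.
toy_culture_fpkm = {"GENE_A": 12.0, "GENE_B": 0.4, "GENE_C": 1.0, "GENE_D": 0.0}
toy_category = {
    "GENE_A": "known",
    "GENE_B": "known",
    "GENE_C": "unknown",
    "GENE_D": "uncharacterized",
}

for name, (detected, total) in detection_summary(toy_culture_fpkm, toy_category).items():
    print(f"{name}: {detected}/{total} detected")
```

A transcript with FPKM exactly equal to the cutoff counts as detected here, matching the "≥1" wording; a strict ">" would change the counts for borderline transcripts.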
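Figure 5. LIM and Senescent Cell Antigen-like Domains 2 Is Pan Endothelial-Enriched Protein In Vivo
(A) IHC staining of multiple tissue types using a primary antibody targeting LIMS2. Scale bars, 100 μm.
(B.i.) Plotted mean FPKM values for von Willebrand factor (vWF) and LIMS2 transcripts in 124 individual samples from 32 different human tissue types. Data are represented as mean ± SEM. Corresponding IHC images from liver, kidney, small intestine, and prostate (denoted by dotted boxes) are displayed above.
(B.ii.) Staining for LIMS2 expression in (1) veins, (2) venules, and (3) capillaries of the heart muscle. Scale bars, 50 μm.
Data are represented as mean ± SEM. See also Figure S2E.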
DISCUSSION
Here, we use an integrative transcriptomics and antibody-based
profiling approach to identify human EC-enriched proteins. We
provide a searchable resource (Table S3), which can be used
to determine the extent of pan EC specificity of any mapped
gene. For example, there are three types of NOS, an enzyme
involved in the synthesis of nitric oxide from L-arginine
(Forstermann and Sessa, 2012): neuronal nNOS (NOS1) (Zhou and Zhu,
2009), cytokine-inducible iNOS (NOS2) (Bogdan, 2015), and
endothelial eNOS (NOS3) (Oliveira-Paula et al., 2016). Surprisingly,
there is no description of the extent of pan EC specificity
of eNOS in humans, but our data suggest NOS3 is predominantly
EC enriched (correlation 0.50), in contrast to NOS1 and NOS2
(correlation −0.04 and 0.12, respectively). This resource can
be used together with our HPA tissue protein profiling data
(http://www.proteinatlas.org/) (Uhlen et al., 2015).
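A lookup of this kind can be sketched as follows. The sketch assumes that the reported correlation is a Pearson coefficient between a gene's FPKM values and those of EC reference transcripts across the same tissue samples, and that the values obtained for several reference transcripts are averaged; the exact definition is given in the online methods. The function names, reference labels, and FPKM values below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def enrichment_score(gene_fpkm, reference_fpkm):
    """Mean correlation of one gene with each reference transcript.

    gene_fpkm: FPKM values of the gene, one per tissue sample.
    reference_fpkm: dict mapping reference label -> FPKM values in the
    same sample order (an assumed data layout).
    """
    correlations = [pearson(gene_fpkm, ref) for ref in reference_fpkm.values()]
    return sum(correlations) / len(correlations)


# Hypothetical values for five tissue samples.
toy_references = {"REF_1": [1.0, 2.0, 3.0, 4.0, 5.0], "REF_2": [2.0, 4.0, 6.0, 8.0, 10.0]}
co_varying_gene = [10.0, 20.0, 30.0, 40.0, 50.0]
inverse_gene = [5.0, 4.0, 3.0, 2.0, 1.0]

print(enrichment_score(co_varying_gene, toy_references))
print(enrichment_score(inverse_gene, toy_references))
```

A gene that rises and falls with the reference transcripts across samples scores near 1, whereas a gene whose expression is unrelated to EC content scores near 0.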
Our study has some limitations; we observe EC expression for
the majority of EC-enriched genes identified but could not
confirm all, due to lack of antibody specificity or other technical
issues, e.g., staining for secreted proteins. HPA antibody reli-
ability guidelines are available, with assessment of concordance
between RNA-seq and IHC staining, detection by western blot
and protein array specificity analysis. We minimized incorrect
classification of SMCs or macrophage genes as EC enriched,
but it was not possible to do such analysis for pericytes, due
to the lack of specific markers (Armulik et al., 2011). However,
pericytes are present in different relative quantities to ECs across
vascular beds (Díaz-Flores et al., 2009), so pericyte-specific
genes, if they exist, should not correlate with our EC reference
transcripts. Because we analyzed samples from a large number of organs,
but few from each, the term “pan EC-transcriptome” needs to be
interpreted with caution; a lack of EC expression in some tissue
beds is not incompatible with a high correlation value. Examination
of correlation plots to identify tissue-specific outliers, analysis
of larger datasets (as we did using GTEx material), or IHC
confirmation could address this. EC reference transcripts are
not uniformly expressed across all vascular beds, e.g., low
vWF in liver ECs; however, identified EC-enriched genes were
detectable in ECs of such “outlier” organs.
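One way to screen a correlation plot for tissue-specific outliers is to fit a straight line through candidate versus reference expression and flag tissues whose residual is unusually large. The sketch below is an assumption about how such a screen could be done, not a procedure described in this study; the tissue labels, FPKM values, and the cutoff of two standard deviations are hypothetical.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("reference expression is constant")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def outlier_tissues(reference, candidate, z_cutoff=2.0):
    """Tissues whose residual from the fitted line exceeds z_cutoff SDs.

    reference, candidate: dicts mapping tissue label -> FPKM.
    """
    tissues = sorted(set(reference) & set(candidate))
    xs = [reference[t] for t in tissues]
    ys = [candidate[t] for t in tissues]
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    sd = pstdev(residuals)
    if sd < 1e-12:  # points lie on the line; nothing to flag
        return []
    return [t for t, r in zip(tissues, residuals) if abs(r) > z_cutoff * sd]


# Hypothetical data: the candidate tracks the reference except in tissue_4.
toy_reference = {f"tissue_{i}": float(i) for i in range(1, 9)}
toy_candidate = {t: 2.0 * v for t, v in toy_reference.items()}
toy_candidate["tissue_4"] = 0.0

print(outlier_tissues(toy_reference, toy_candidate))
```

With few samples per tissue, a screen like this can only flag candidates for follow-up, for example by IHC; a single extreme tissue also pulls the fitted line toward itself, so the cutoff should be treated as a rough guide.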
Previous efforts to determine human EC-enriched transcripts
have used isolated/cultured cells (Chi et al., 2003; Ho et al.,
2003), and those confirmed as pan EC-enriched are often critical
for EC function, e.g., CDH5 (Carmeliet et al., 1999), FLT1 (Li et al.,
2015), and vWF (Lenting et al., 2015). There are many studies of
EC gene expression during normal and pathological angiogen-
esis (Seaman et al., 2007; Seano et al., 2014; van Beijnum
et al., 2006). Many genes we identified have a role in angiogenesis
induction and vessel stability, e.g., RAMP2 (Ichikawa-Shindo et al., 2008),
RRAS (Sawada et al., 2012), ADGRL4 (Masiero et al., 2013), and RHOJ
(Kim et al., 2014). EC turnover in adult tissues occurs over years,
rather than months or weeks (Hobson
Figure 7. Pan EC-Enriched Transcript Expression in Cultured ECs
(A.i) RNA-seq data from first passage primary umbilical vein endothelial cells (HUVECs) were used to identify the proportion of detectable known, unknown, and uncharacterized pan EC-enriched transcripts (FPKM ≥1).
(A.ii) RNA-seq data from 124 individual samples from 32 different human tissue types were used to calculate a mean FPKM expression value for each pan EC-enriched transcript, which was plotted against the corresponding mean transcript expression in HUVEC (n = 4). Green, red, and black points represent known, unknown, and uncharacterized transcripts, respectively. Pearson correlations and corresponding p values are shown in the lower right of each plot.
(B) Stomach, liver, cerebral cortex, and kidney tissue sections stained for proteins encoded by GIPC3 and KANK3, pan EC-enriched transcripts that could not be detected in first passage HUVEC (unknown and uncharacterized category, respectively).
and Denekamp, 1984), so constitutive expression of these pro-angiogenic
genes is unlikely to indicate active angiogenesis. We
also identified angiogenesis inhibitory genes, e.g., NOTCH4
(Leong et al., 2002), FGD5 (Cheng et al., 2012), MMRN2 (Loren-
zon et al., 2012), and DLL4 (Liu et al., 2014). Thus, we speculate
that pro- and anti-angiogenic gene expression maintains a ho-
meostatic balance in the absence of external cues. Indeed, neo-
vascularization is associated with baseline gene expression
modulation (Benedito et al., 2012; Shih et al., 2002), rather than
absolute induction or suppression of specific transcripts.
We identified 88 EC-enriched genes encoding previously
unknown or uncharacterized EC proteins; nine were expressed at very
low transcript levels (mean expression across tissues FPKM <2), e.g.,
FAM110D. Previous studies of human EC transcript expression
have involved EC isolation (Bhasin et al., 2010; Chu and Peters,
Figure 6. Family with Sequence Similarity 110, Member D Is a Pan EC-Enriched Protein In Vivo
(A) IHC staining of multiple tissue types using a primary antibody targeting FAM110D. Scale bars, 100 μm.
(B.i.) Mean FPKM values for von Willebrand factor (vWF) and FAM110D transcripts in 124 individual samples from 32 different human tissue types. Data are represented as mean ± SEM. Corresponding IHC images from liver, kidney, skeletal muscle, and ovary (denoted by dotted boxes) are displayed above.
(B.ii.) Staining for FAM110D in (1) veins, (2) arterioles, and (3) capillaries of the heart muscle. Scale bars, 50 μm.
Data are represented as mean ± SEM. See also Figure S2F.
2008; Seaman et al., 2007; Urich et al.,
2012), meaning results lack whole-organ-
ism context. Thus, lowly expressed (but
highly enriched) EC genes may have
been overlooked. Recent studies high-
light the complex relationship between
mRNA transcription and protein pro-
duction, finding that the two correlate
but not always strongly (Vogel and Mar-
cotte, 2012). Protein levels are primarily
determined by translation rates, followed
by transcription rates (Schwanhausser
et al., 2011), with gene-specific dy-
namics; e.g., metabolic genes can have
high protein to mRNA ratios (Vogel et al., 2010), while those
involved in transcriptional regulation can exhibit opposite traits
(Schwanhausser et al., 2011). Although FAM110D FPKM values
were <1 in most organs, IHC confirmed EC expression,
endorsing an integrative transcriptomics and antibody-based
profiling approach. Pan EC-enriched transcripts expressed at
low levels in the “baseline” state could be increased during
inflammation or angiogenesis, as for other EC genes, e.g.,
VEGFC (Semenza, 2001) and SELE (Collins et al., 1995).
We identified a number of EC-enriched proteins with unde-
fined in vivo function. Ectopic expression of EHD2, an ATPase
(Stoeber et al., 2012), revealed a role in the regulation of caveolin
1 (CAV1) carrier state via actin filament interaction (Moren et al.,
2012; Stoeber et al., 2012). CAV1 was also identified as pan
EC enriched; thus, one could speculate that EHD2 has a role in
EC-specific caveolin function in vivo. In cell lines, LIMS2 local-
ized to focal adhesions and complexed with integrin-linked ki-
nase (ILK) (Zhang et al., 2002), where it negatively regulated
the interaction with LIMS1, a regulator of actin cytoskeletal
arrangement (Zhang et al., 2002). LIMS2 could be involved in
EC-specific regulation of LIMS1, and have a possible role in
angiogenesis, where shape changes and migration processes
are critical (Seano et al., 2014). There are currently no studies
on FAM110D (Figure 6), but its paralog genes, FAM110A-C,
localize to centrosomes and spindle poles, with a possible role
in cell-cycle regulation (Hauge et al., 2007).
Most identified pan EC-enriched genes were detected in
cultured ECs. Some transcripts encoding for well-described
EC proteins, e.g., PDGFRB (Beitz et al., 1991), GPIHBP1 (Pei-
Ling Chiu et al., 2014), HSPA12B (Hu et al., 2006), and PEAR1
(Nanda et al., 2005), were lost in vitro. Other undetectable genes
were unknown or uncharacterized, e.g., GIPC3 and KANK3, both of which
were confirmed as EC enriched in vivo. Although cultured
venous and arterial ECs can retain distinct gene expression dif-
ferences, even after multiple passages (Chi et al., 2003), the
absence of microenvironment cues, e.g., shear stress, can
modulate mRNA levels (Amaya et al., 2015) and stability (Wu
et al., 2011), inducing a rapid gene expression drift (Durr et al.,
2004; Lacorre et al., 2004). Thus, the loss of pan EC-enriched
gene expression in vitro could be due to culture and/or repeated
passage, as for CD34 (Delia et al., 1993).
Bioinformatic algorithms have been proposed for deconvo-
luting gene expression data from heterogeneous tissues into
cell type-specific gene expression profiles, i.e., in silico micro
dissection (Gaujoux and Seoighe, 2013; Ju et al., 2013; Shen-
Orr et al., 2010). Here, we present a direct method to identify
the transcriptome profile of a low abundance cell type, EC,
from heterogeneous tissue samples. The transcriptome of other
cell types could be analyzed using the same principle, if present
across multiple tissue beds and expressing specific markers,
e.g., mast cells or resident macrophages. The pan EC-enriched
transcriptome generated here provides a basis for in-depth func-
tional studies to expand our knowledge of the vascular system in
health and disease.
STAR★METHODS
Detailed methods are provided in the online version of this paper
and include the following:
• KEY RESOURCES TABLE
• CONTACT FOR REAGENT AND RESOURCE SHARING
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
• METHOD DETAILS
  ◦ Human Tissue Preparation and Transcript Profiling
  ◦ Estimation of Endothelial Cell Percentage in Selected Tissues
  ◦ Tissue Profiling: Human Tissue Sections
  ◦ Transcript Profiling: Isolated Human Endothelial Cells
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ Analysis of RNA-Seq Data to Determine Pan EC-Enriched Transcripts
  ◦ Analysis of GTEx RNA-Seq Data as Replication
  ◦ Gene Ontology (GO) Enrichment Analysis
• DATA AND SOFTWARE AVAILABILITY
• ADDITIONAL RESOURCES
SUPPLEMENTAL INFORMATION
Supplemental Information includes three figures and three tables and can be
found with this article online at http://dx.doi.org/10.1016/j.cels.2016.08.001.
AUTHOR CONTRIBUTIONS
Conceptualization, L.M.B. and J.O.; Methodology, L.M.B. and J.O.; Formal
Analysis, L.F. and B.M.H.; Investigation, L.M.B. and J.O.; Resources, F.P.
and M.U.; Writing – Original Draft, L.M.B. and J.O.; Writing – Review & Editing,
all; Visualization, L.M.B., T.R., and J.O.; Funding Acquisition, L.M.B., M.U.,
F.P., T.R., and J.O.
ACKNOWLEDGMENTS
We acknowledge the staff of the Human Protein Atlas (HPA) program, the Sci-
ence for Life Laboratory, and the pathology team in Mumbai, India. We thank
the Department of Pathology at the Uppsala Akademiska Hospital, Uppsala,
Sweden and Uppsala Biobank for kindly providing specimens used in this
study. The HPA was funded by Knut & Alice Wallenberg Foundation. We
also acknowledge funding from Hjart Lungfonden (20140691 and 20150623)
and Vetenskapsradet (2013-42608-102305-28) to L.M.B., Stockholm Council
(LS 1302-0311) to J.O. and Vetenskapsradet (K2013-65X-21462-04-5),
German Research Society (SFB841, SFB877), and a European Research
Council grant (ERC-StG-2012-311575_F-12) to T.R.
Received: December 1, 2015
Revised: May 23, 2016
Accepted: August 3, 2016
Published: September 15, 2016
REFERENCES
Aird, W.C. (2012). Endothelial cell heterogeneity. Cold Spring Harb Perspect
Med 2, a006429.
Amaya, R., Pierides, A., and Tarbell, J.M. (2015). The interaction between fluid
wall shear stress and solid circumferential strain affects endothelial gene
expression. PLoS ONE 10, e0129952.
Ardlie, K.G., DeLuca, D.S., Segre, A.V., Sullivan, T.J., Young, T.R., Gelfand,
E.T., Trowbridge, C.A., Maller, J.B., Tukiainen, T., Lek, M., et al.; GTEx
Consortium (2015). Human genomics. The Genotype-Tissue Expression
(GTEx) pilot analysis: Multitissue gene regulation in humans. Science 348,
648–660.
Armulik, A., Genove, G., and Betsholtz, C. (2011). Pericytes: Developmental,
physiological, and pathological perspectives, problems, and promises. Dev.
Cell 21, 193–215.
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M.,
Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., et al.; The Gene Ontology
Consortium (2000). Gene ontology: Tool for the unification of biology. Nat.
Genet. 25, 25–29.
Balda, M.S., and Matter, K. (2009). Tight junctions and the regulation of gene
expression. Biochim. Biophys. Acta 1788, 761–767.
Ballabio, E., Mariotti, M., De Benedictis, L., and Maier, J.A.M. (2004). The dual
role of endothelial differentiation-related factor-1 in the cytosol and nucleus:
Modulation by protein kinase A. Cell. Mol. Life Sci. 61, 1069–1074.
Beitz, J.G., Kim, I.S., Calabresi, P., and Frackelton, A.R., Jr. (1991). Human
microvascular endothelial cells express receptors for platelet-derived growth
factor. Proc. Natl. Acad. Sci. USA 88, 2021–2025.
Benedito, R., Rocha, S.F., Woeste, M., Zamykal, M., Radtke, F., Casanovas,
O., Duarte, A., Pytowski, B., and Adams, R.H. (2012). Notch-dependent
VEGFR3 upregulation allows angiogenesis without VEGF-VEGFR2 signalling.
Nature 484, 110–114.
Berger, C., Harzer, H., Burkard, T.R., Steinmann, J., van der Horst, S.,
Laurenson, A.S., Novatchkova, M., Reichert, H., and Knoblich, J.A. (2012).
FACS purification and transcriptome analysis of Drosophila neural stem cells
reveals a role for Klumpfuss in self-renewal. Cell Rep. 2, 407–418.
Bernat, J.A., Crawford, G.E., Ogurtsov, A.Y., Collins, F.S., Ginsburg, D., and
Kondrashov, A.S. (2006). Distant conserved sequences flanking endothelial-
specific promoters contain tissue-specific DNase-hypersensitive sites and
over-represented motifs. Hum. Mol. Genet. 15, 2098–2105.
Bhasin, M., Yuan, L., Keskin, D.B., Otu, H.H., Libermann, T.A., and Oettgen, P.
(2010). Bioinformatic identification and characterization of human endothelial
cell-restricted genes. BMC Genomics 11, 342.
Bogdan, C. (2015). Nitric oxide synthase in innate and adaptive immunity: An
update. Trends Immunol. 36, 161–178.
Carmeliet, P., Lampugnani, M.G., Moons, L., Breviario, F., Compernolle, V.,
Bono, F., Balconi, G., Spagnuolo, R., Oosthuyse, B., Dewerchin, M., et al.
(1999). Targeted deficiency or cytosolic truncation of the VE-cadherin gene
in mice impairs VEGF-mediated endothelial survival and angiogenesis. Cell
98, 147–157.
Cheng, C., Haasdijk, R., Tempel, D., van de Kamp, E.H., Herpers, R., Bos, F.,
Den Dekker, W.K., Blonden, L.A., de Jong, R., Burgisser, P.E., et al. (2012).
Endothelial cell-specific FGD5 involvement in vascular pruning defines neo-
vessel fate in mice. Circulation 125, 3142–3158.
Cheng, L., Zhang, S., MacLennan, G.T., Williamson, S.R., Davidson, D.D.,
Wang, M., Jones, T.D., Lopez-Beltran, A., and Montironi, R. (2013). Laser-as-
sisted microdissection in translational research: Theory, technical consider-
ations, and future applications. Appl. Immunohistochem. Mol. Morphol. 21,
31–47.
Chi, J.T., Chang, H.Y., Haraldsen, G., Jahnsen, F.L., Troyanskaya, O.G.,
Chang, D.S., Wang, Z., Rockson, S.G., van de Rijn, M., Botstein, D., and
Brown, P.O. (2003). Endothelial cell diversity revealed by global expression
profiling. Proc. Natl. Acad. Sci. USA 100, 10623–10628.
Chu, T.J., and Peters, D.G. (2008). Serial analysis of the vascular endothelial
transcriptome under static and shear stress conditions. Physiol. Genomics
34, 185–192.
Civelek, M., Manduchi, E., Riley, R.J., Stoeckert, C.J., Jr., and Davies, P.F.
(2011). Coronary artery endothelial transcriptome in vivo: Identification of
endoplasmic reticulum stress and enhanced reactive oxygen species by
gene connectivity network analysis. Circ Cardiovasc Genet 4, 243–252.
Collins, T., Read, M.A., Neish, A.S., Whitley, M.Z., Thanos, D., and Maniatis, T.
(1995). Transcriptional regulation of endothelial cell adhesion molecules: NF-
kappa B and cytokine-inducible enhancers. FASEB J. 9, 899–909.
Conley, C.A. (2001). Leiomodin and tropomodulin in smooth muscle. Am. J.
Physiol. Cell Physiol. 280, C1645–C1656.
Cooke, B.M., Usami, S., Perry, I., and Nash, G.B. (1993). A simplified method
for culture of endothelial cells and analysis of adhesion of blood cells under
conditions of flow. Microvasc. Res. 45, 33–45.
Delia, D., Lampugnani, M.G., Resnati, M., Dejana, E., Aiello, A., Fontanella, E.,
Soligo, D., Pierotti, M.A., and Greaves, M.F. (1993). CD34 expression is regulated
reciprocally with adhesion molecules in vascular endothelial cells in vitro.
Blood 81, 1001–1008.
Díaz-Flores, L., Gutierrez, R., Madrid, J.F., Varela, H., Valladares, F., Acosta, E.,
Martín-Vasallo, P., and Díaz-Flores, L., Jr. (2009). Pericytes. Morphofunction,
interactions and pathology in a quiescent and activated mesenchymal cell niche.
Histol. Histopathol. 24, 909–969.
Dreiza, C.M., Komalavilas, P., Furnish, E.J., Flynn, C.R., Sheller, M.R., Smoke,
C.C., Lopes, L.B., and Brophy, C.M. (2010). The small heat shock protein,
HSPB6, in muscle function and disease. Cell Stress Chaperones 15, 1–11.
Du Toit, A. (2015). Mechanotransduction: VE-cadherin lets it flow. Nat. Rev.
Mol. Cell Biol. 16, 268–268.
Durr, E., Yu, J., Krasinska, K.M., Carver, L.A., Yates, J.R., Testa, J.E., Oh, P.,
and Schnitzer, J.E. (2004). Direct proteomic mapping of the lung microvascular
endothelial cell surface in vivo and in cell culture. Nat. Biotechnol. 22, 985–992.
East, L., and Isacke, C.M. (2002). The mannose receptor family. Biochim.
Biophys. Acta 1572, 364–386.
Fabriek, B.O., van Bruggen, R., Deng, D.M., Ligtenberg, A.J., Nazmi, K.,
Schornagel, K., Vloet, R.P., Dijkstra, C.D., and van den Berg, T.K. (2009).
The macrophage scavenger receptor CD163 functions as an innate immune
sensor for bacteria. Blood 113, 887–892.
Flicek, P., Amode, M.R., Barrell, D., Beal, K., Billis, K., Brent, S., Carvalho-
Silva, D., Clapham, P., Coates, G., Fitzgerald, S., et al. (2014). Ensembl
2014. Nucleic Acids Res. 42, D749–D755.
Forstermann, U., and Sessa, W.C. (2012). Nitric oxide synthases: Regulation
and function. Eur. Heart J. 33, 829–837, 837a–837d.
Ganz, P., and Hsue, P.Y. (2013). Endothelial dysfunction in coronary heart dis-
ease is more than a systemic process. Eur. Heart J. 34, 2025–2027.
Gaujoux, R., and Seoighe, C. (2013). CellMix: A comprehensive toolbox for
gene expression deconvolution. Bioinformatics 29, 2211–2212.
Hauge, H., Patzke, S., and Aasheim, H.C. (2007). Characterization of the
FAM110 gene family. Genomics 90, 14–27.
Ho, M., Yang, E., Matcuk, G., Deng, D., Sampas, N., Tsalenko, A., Tabibiazar,
R., Zhang, Y., Chen, M., Talbi, S., et al. (2003). Identification of endothelial cell
genes by combined database mining and microarray analysis. Physiol.
Genomics 13, 249–262.
Hobson, B., and Denekamp, J. (1984). Endothelial proliferation in tumours and
normal tissues: continuous labelling studies. Br J Cancer 49, 405–413.
Hu, G., Tang, J., Zhang, B., Lin, Y., Hanai, J., Galloway, J., Bedell, V., Bahary,
N., Han, Z., Ramchandran, R., et al. (2006). A novel endothelial-specific heat
shock protein HspA12B is required in both zebrafish development and endo-
thelial functions in vitro. J. Cell Sci. 119, 4117–4126.
Huminiecki, L., and Bicknell, R. (2000). In silico cloning of novel endothelial-
specific genes. Genome Res. 10, 1796–1806.
Ichikawa-Shindo, Y., Sakurai, T., Kamiyoshi, A., Kawate, H., Iinuma, N.,
Yoshizawa, T., Koyama, T., Fukuchi, J., Iimuro, S., Moriyama, N., et al.
(2008). The GPCR modulator protein RAMP2 is essential for angiogenesis
and vascular integrity. J Clin Invest. 118, 29–39.
Jaye, M., Lynch, K.J., Krawiec, J., Marchadier, D., Maugeais, C., Doan, K.,
South, V., Amin, D., Perrone, M., and Rader, D.J. (1999). A novel endothe-
lial-derived lipase that modulates HDL metabolism. Nat. Genet. 21, 424–428.
Ju, W., Greene, C.S., Eichinger, F., Nair, V., Hodgin, J.B., Bitzer, M., Lee, Y.S.,
Zhu, Q., Kehata, M., Li, M., et al. (2013). Defining cell-type specificity at the
transcriptional level in human disease. Genome Res. 23, 1862–1873.
Kampf, C., Olsson, I., Ryberg, U., Sjöstedt, E., and Ponten, F. (2012).
Production of tissue microarrays, immunohistochemistry staining and digitalization
within the human protein atlas. J. Vis. Exp. 63, http://dx.doi.org/10.3791/3620.
Kanaji, S., Fahs, S.A., Shi, Q., Haberichter, S.L., and Montgomery, R.R. (2012).
Contribution of platelet vs. endothelial VWF to platelet adhesion and hemosta-
sis. J. Thromb. Haemost. 10, 1646–1652.
Kaufmann, A., Salentin, R., Gemsa, D., and Sprenger, H. (2001). Increase of
CCR1 and CCR5 expression and enhanced functional response to MIP-1 alpha
during differentiation of human monocytes to macrophages. J. Leukoc. Biol. 69,
248–252.
Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S.L.
(2013). TopHat2: Accurate alignment of transcriptomes in the presence of in-
sertions, deletions and gene fusions. Genome Biol. 14, R36.
Kim, C., Yang, H., Fukushima, Y., Saw, P.E., Lee, J., Park, J.S., Park, I., Jung,
J., Kataoka, H., Lee, D., et al. (2014). Vascular RhoJ is an effective and selec-
tive target for tumor angiogenesis and vascular disruption. Cancer Cell 25,
102–117.
Korhonen, J., Lahtinen, I., Halmekyto, M., Alhonen, L., Janne, J., Dumont, D.,
and Alitalo, K. (1995). Endothelial-specific gene expression directed by the tie
gene promoter in vivo. Blood 86, 1828–1835.
Kunjathoor, V.V., Febbraio, M., Podrez, E.A., Moore, K.J., Andersson, L.,
Koehn, S., Rhee, J.S., Silverstein, R., Hoff, H.F., and Freeman, M.W. (2002).
Scavenger receptors class A-I/II and CD36 are the principal receptors respon-
sible for the uptake of modified low density lipoprotein leading to lipid loading
in macrophages. J. Biol. Chem. 277, 49982–49988.
Lacorre, D.A., Baekkevold, E.S., Garrido, I., Brandtzaeg, P., Haraldsen, G.,
Amalric, F., and Girard, J.P. (2004). Plasticity of endothelial cells: Rapid dedif-
ferentiation of freshly isolated high endothelial venule endothelial cells outside
the lymphoid tissue microenvironment. Blood 103, 4164–4172.
Lenting, P.J., Christophe, O.D., and Denis, C.V. (2015). von Willebrand factor
biosynthesis, secretion, and clearance: Connecting the far ends. Blood 125,
2019–2028.
Leong, K.G., Hu, X., Li, L., Noseda, M., Larrivee, B., Hull, C., Hood, L., Wong,
F., and Karsan, A. (2002). Activated Notch4 inhibits angiogenesis: role of beta
1-integrin activation. Mol Cell Biol. 22, 2830–2841.
Ley, K. (2003). The role of selectins in inflammation and disease. Trends Mol.
Med. 9, 263–268.
Li, S., Zhou, X.L., Dang, Y.Y., Kwan, Y.W., Chan, S.W., Leung, G.P.H., Lee,
S.M.Y., and Hoi, M.P.M. (2015). Basal Flt1 tyrosine kinase activity is a positive
regulator of endothelial survival and vascularization during zebrafish embryo-
genesis. Biochim. Biophys. Acta 1850, 373–384.
Liang, Y., and Tedder, T.F. (2001). Identification of a CD20-, FcepsilonRIbeta-,
and HTm4-related gene family: Sixteen new MS4A family members expressed
in human and mouse. Genomics 72, 119–127.
Liu, Z., Fan, F., Wang, A., Zheng, S., and Lu, Y. (2014). Dll4-Notch signaling in
regulation of tumor angiogenesis. J. Cancer Res. Clin. Oncol. 140, 525–536.
Long, X., Tharp, D.L., Georger, M.A., Slivano, O.J., Lee, M.Y., Wamhoff, B.R.,
Bowles, D.K., and Miano, J.M. (2009). The smooth muscle cell-restricted
KCNMB1 ion channel subunit is a direct transcriptional target of serum
response factor and myocardin. J. Biol. Chem. 284, 33671–33682.
Lorenzon, E., Colladel, R., Andreuzzi, E., Marastoni, S., Todaro, F.,
Schiappacassi, M., Ligresti, G., Colombatti, A., and Mongiat, M. (2012).
MULTIMERIN2 impairs tumor angiogenesis and growth by interfering with
VEGF-A/VEGFR2 pathway. Oncogene 31, 3136–3147.
Mackman, N. (2012). New insights into the mechanisms of venous thrombosis.
J. Clin. Invest. 122, 2331–2336.
Malatesta, M. (2016). Histological and histochemical methods - theory and
practice. Eur. J. Histochem 60, 2639.
Masiero, M., Simoes, F.C., Han, H.D., Snell, C., Peterkin, T., Bridges, E.,
Mangala, L.S., Wu, S.Y., Pradeep, S., Li, D., et al. (2013). A core human primary
tumor angiogenesis signature identifies the endothelial orphan receptor
ELTD1 as a key regulator of angiogenesis. Cancer Cell 24, 229–241.
Mi, H., Muruganujan, A., Casagrande, J.T., and Thomas, P.D. (2013). Large-
scale gene function analysis with the PANTHER classification system. Nat.
Protoc. 8, 1551–1566.
Mi, H., Poudel, S., Muruganujan, A., Casagrande, J.T., and Thomas, P.D.
(2016). PANTHER version 10: Expanded protein families and functions, and
analysis tools. Nucleic Acids Res. 44, D336–D342.
Miwa, T., Manabe, Y., Kurokawa, K., Kamada, S., Kanda, N., Bruns, G.,
Ueyama, H., and Kakunaga, T. (1991). Structure, chromosome location, and
expression of the human smooth muscle (enteric type) gamma-actin gene:
Evolution of six human actin genes. Mol. Cell. Biol. 11, 3296–3306.
Moren, B., Shah, C., Howes, M.T., Schieber, N.L., McMahon, H.T., Parton,
R.G., Daumke, O., and Lundmark, R. (2012). EHD2 regulates caveolar dy-
namics via ATP-driven targeting and oligomerization. Mol. Biol. Cell 23,
1316–1329.
Muller, A.M., Hermanns, M.I., Skrzynski, C., Nesslinger, M., Muller, K.M., and
Kirkpatrick, C.J. (2002). Expression of the endothelial markers PECAM-1, vWf,
and CD34 in vivo and in vitro. Exp. Mol. Pathol. 72, 221–229.
Mura, M., Swain, R.K., Zhuang, X., Vorschmitt, H., Reynolds, G., Durant, S.,
Beesley, J.F., Herbert, J.M., Sheldon, H., Andre, M., et al. (2012).
Identification and angiogenic role of the novel tumor endothelial marker
CLEC14A. Oncogene 31, 293–305.
Murray, P.J., and Wynn, T.A. (2011). Protective and pathogenic functions of
macrophage subsets. Nat. Rev. Immunol. 11, 723–737.
Nanda, N., Bao, M., Lin, H., Clauser, K., Komuves, L., Quertermous, T., Conley,
P.B., Phillips, D.R., and Hart, M.J. (2005). Platelet endothelial aggregation
receptor 1 (PEAR1), a novel epidermal growth factor repeat-containing
transmembrane receptor, participates in platelet contact-induced activation.
J. Biol. Chem. 280, 24680–24689.
Nolan, D.J., Ginsberg, M., Israely, E., Palikuqi, B., Poulos, M.G., James, D.,
Ding, B.S., Schachterle, W., Liu, Y., Rosenwaks, Z., et al. (2013). Molecular
signatures of tissue-specific microvascular endothelial cell heterogeneity in organ
maintenance and regeneration. Dev. Cell 26, 204–219.
Noy, P.J., Lodhia, P., Khan, K., Zhuang, X., Ward, D.G., Verissimo, A.R.,
Bacon, A., and Bicknell, R. (2015). Blocking CLEC14A-MMRN2 binding in-
hibits sprouting angiogenesis and tumour growth. Oncogene 34, 5821–5831.
Oliveira-Paula, G.H., Lacchini, R., and Tanus-Santos, J.E. (2016). Endothelial
nitric oxide synthase: From biochemistry and gene structure to clinical implica-
tions of NOS3 polymorphisms. Gene 575, 584–599.
Pei-Ling Chiu, A., Wang, F., Lal, N., Wang, Y., Zhang, D., Hussein, B., Wan, A.,
Vlodavsky, I., and Rodrigues, B. (2014). Endothelial cells respond to hypergly-
cemia by increasing the LPL transporter GPIHBP1. Am. J. Physiol. Endocrinol.
Metab. 306, E1274–E1283.
Pober, J.S., and Sessa, W.C. (2007). Evolving functions of endothelial cells in
inflammation. Nat. Rev. Immunol. 7, 803–815.
Ponten, F., Jirstrom, K., and Uhlen, M. (2008). The Human Protein Atlas—a tool
for pathology. J. Pathol. 216, 387–393.
Pusztaszeri, M.P., Seelentag, W., and Bosman, F.T. (2006). Immunohistochemical
expression of endothelial markers CD31, CD34, von Willebrand factor, and Fli-1 in
normal human tissues. J. Histochem. Cytochem. 54, 385–395.
Rensen, S.S., Doevendans, P.A., and van Eys, G.J. (2007). Regulation and
characteristics of vascular smooth muscle cell phenotypic diversity. Neth.
Heart J. 15, 100–108.
Rho, S.S., Choi, H.J., Min, J.K., Lee, H.W., Park, H., Park, H., Kim, Y.M., and
Kwon, Y.G. (2011). Clec14a is specifically expressed in endothelial cells and
mediates cell to cell adhesion. Biochem. Biophys. Res. Commun. 404,
103–108.
Satterthwaite, A.B., Burn, T.C., Le Beau, M.M., and Tenen, D.G. (1992).
Structure of the gene encoding CD34, a human hematopoietic stem cell anti-
gen. Genomics 12, 788–794.
Sawada, J., Urakami, T., Li, F., Urakami, A., Zhu, W., Fukuda, M., Li, D.Y.,
Ruoslahti, E., and Komatsu, M. (2012). Small GTPase R-Ras regulates integrity
and functionality of tumor blood vessels. Cancer Cell 22, 235–249.
Schick, P.K., Walker, J., Profeta, B., Denisova, L., and Bennett, V. (1997).
Synthesis and secretion of von Willebrand factor and fibronectin in megakar-
yocytes at different phases of maturation. Arterioscler. Thromb. Vasc. Biol. 17,
797–801.
Schwanhausser, B., Busse, D., Li, N., Dittmar, G., Schuchhardt, J., Wolf, J.,
Chen, W., and Selbach, M. (2011). Global quantification of mammalian gene
expression control. Nature 473, 337–342.
Seaman, S., Stevens, J., Yang, M.Y., Logsdon, D., Graff-Cherry, C., and St
Croix, B. (2007). Genes that distinguish physiological and pathological angio-
genesis. Cancer Cell 11, 539–554.
Seano, G., Chiaverina, G., Gagliardi, P., di Blasio, L., Puliafito, A., Bouvard, C.,
Sessa, R., Tarone, G., Sorokin, L., Helley, D., et al. (2014). Endothelial
podosome rosettes regulate vascular branching in tumor angiogenesis. Nat. Cell
Biol. 16, 931–941.
Semenza, G.L. (2001). Regulation of hypoxia-induced angiogenesis: A chap-
erone escorts VEGF to the dance. J. Clin. Invest. 108, 39–40.
Shen-Orr, S.S., Tibshirani, R., Khatri, P., Bodian, D.L., Staedtler, F., Perry,
N.M., Hastie, T., Sarwal, M.M., Davis, M.M., and Butte, A.J. (2010). Cell
type-specific gene expression differences in complex tissues. Nat. Methods
7, 287–289.
Shih, S.C., Robinson, G.S., Perruzzi, C.A., Calvo, A., Desai, K., Green, J.E., Ali,
I.U., Smith, L.E., and Senger, D.R. (2002). Molecular profiling of angiogenesis
markers. Am. J. Pathol. 161, 35–41.
Spies, D., and Ciaudo, C. (2015). Dynamics in transcriptomics: Advancements in
RNA-seq time course and downstream analysis. Comput. Struct. Biotechnol. J.
13, 469–477.
Steagall, R.J., Rusinol, A.E., Truong, Q.A., and Han, Z. (2006). HSPA12B is pre-
dominantly expressed in endothelial cells and required for angiogenesis.
Arterioscler. Thromb. Vasc. Biol. 26, 2012–2018.
Steyers, C.M., 3rd, and Miller, F.J., Jr. (2014). Endothelial dysfunction in
chronic inflammatory diseases. Int. J. Mol. Sci. 15, 11324–11349.
Stoeber, M., Stoeck, I.K., Hanni, C., Bleck, C.K.E., Balistreri, G., and Helenius,
A. (2012). Oligomers of the ATPase EHD2 confine caveolae to the plasma
membrane through association with actin. EMBO J. 31, 2350–2364.
Tabas, I., García-Cardeña, G., and Owens, G.K. (2015). Recent insights into
the cellular biology of atherosclerosis. J. Cell Biol. 209, 13–22.
Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, A., Kwan, G., van Baren,
M.J., Salzberg, S.L., Wold, B.J., and Pachter, L. (2010). Transcript assembly
and quantification by RNA-Seq reveals unannotated transcripts and isoform
switching during cell differentiation. Nat. Biotechnol. 28, 511–515.
Uhlen, M., Fagerberg, L., Hallstrom, B.M., Lindskog, C., Oksvold, P.,
Mardinoglu, A., Sivertsson, A., Kampf, C., Sjostedt, E., Asplund, A., et al.
(2015). Proteomics. Tissue-based map of the human proteome. Science
347, 1260419.
Urich, E., Lazic, S.E., Molnos, J., Wells, I., and Freskgard, P.O. (2012).
Transcriptional profiling of human brain endothelial cells reveals key properties
crucial for predictive in vitro blood-brain barrier models. PLoS ONE 7, e38149.
van Beijnum, J.R., Dings, R.P., van der Linden, E., Zwaans, B.M., Ramaekers,
F.C., Mayo, K.H., and Griffioen, A.W. (2006). Gene expression of tumor angio-
genesis dissected: Specific targeting of colon cancer angiogenic vasculature.
Blood 108, 2339–2348.
Varchetta, S., Brunetta, E., Roberto, A., Mikulak, J., Hudspeth, K.L., Mondelli,
M.U., and Mavilio, D. (2012). Engagement of Siglec-7 receptor induces a pro-
inflammatory response selectively in monocytes. PLoS ONE 7, e45821.
Vita, J.A. (2011). Endothelial function. Circulation 124, e906–e912.
Vogel, C., and Marcotte, E.M. (2012). Insights into the regulation of protein
abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet.
13, 227–232.
Vogel, C., Abreu, Rde.S., Ko, D., Le, S.Y., Shapiro, B.A., Burns, S.C., Sandhu,
D., Boutz, D.R., Marcotte, E.M., and Penalva, L.O. (2010). Sequence signa-
tures and mRNA concentration can explain two-thirds of protein abundance
variation in a human cell line. Mol. Syst. Biol. 6, 400.
Wang, Y., and Navin, N.E. (2015). Advances and applications of single-cell
sequencing technologies. Mol. Cell 58, 598–609.
Wang, Z., Wang, D.Z., Pipes, G.C., and Olson, E.N. (2003). Myocardin is a
master regulator of smooth muscle gene expression. Proc. Natl. Acad. Sci.
USA 100, 7129–7134.
Wu, W., Xiao, H., Laguna-Fernandez, A., Villarreal, G., Jr., Wang, K.C., Geary,
G.G., Zhang, Y., Wang, W.C., Huang, H.D., Zhou, J., et al. (2011). Flow-depen-
dent regulation of Kruppel-like factor 2 is mediated by MicroRNA-92a.
Circulation 124, 633–641.
Yamawaki, K., Ito, M., Machida, H., Moriki, N., Okamoto, R., Isaka, N., Shimpo,
H., Kohda, A., Okumura, K., Hartshorne, D.J., and Nakano, T. (2001).
Identification of human CPI-17, an inhibitory phosphoprotein for myosin phos-
phatase. Biochem. Biophys. Res. Commun. 285, 1040–1045.
Yates, A., Akanni, W., Amode, M.R., Barrell, D., Billis, K., Carvalho-Silva, D.,
Cummins, C., Clapham, P., Fitzgerald, S., Gil, L., et al. (2016). Ensembl
2016. Nucleic Acids Res. 44, D710–D716.
Zanetta, L., Marcus, S.G., Vasile, J., Dobryansky, M., Cohen, H., Eng, K.,
Shamamian, P., and Mignatti, P. (2000). Expression of Von Willebrand factor,
an endothelial cell marker, is up-regulated by angiogenesis factors: A potential
method for objective assessment of tumor angiogenesis. Int. J. Cancer 85,
281–288.
Zhang, Y., Chen, K., Guo, L., and Wu, C. (2002). Characterization of PINCH-2,
a new focal adhesion protein that regulates the PINCH-1-ILK interaction, cell
spreading, and migration. J. Biol. Chem. 277, 38328–38338.
Zhou, L., and Zhu, D.Y. (2009). Neuronal nitric oxide synthase: Structure, sub-
cellular localization, regulation, and clinical implications. Nitric Oxide: Biol.
Chem. 20, 223–230.
STAR+METHODS
KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
CLEC14A | Atlas Antibodies | HPA039468
vWF | Atlas Antibodies | Cat# HPA001815, RRID:AB_611880
CD34 | Atlas Antibodies | HPA036722
HSPA12B | Atlas Antibodies | Cat# HPA013659, RRID:AB_1234541
ESM1 | Sigma-Aldrich | Cat# HPA036660, RRID:AB_10670842
PTPRC | Atlas Antibodies | Cat# HPA000440, RRID:AB_611377
ITGA2B | Atlas Antibodies | Cat# HPA031168, RRID:AB_10664706
MYH11 | Atlas Antibodies | Cat# HPA014539, RRID:AB_1234906
EHD2 | Atlas Antibodies | HPA049890
LIMS2 | Atlas Antibodies | HPA058340
FAM110D | Atlas Antibodies | Cat# HPA013664, RRID:AB_1234332
KANK3 | Atlas Antibodies | HPA051153
GIPC3 | Atlas Antibodies | HPA061258
ENG | Leica Microsystems | Cat# NCL-CD105, RRID:AB_563482
Deposited Data
HUVEC sequencing data | This paper | ArrayExpress E-MTAB-4897
Human tissue sequencing data | Uhlen et al., 2015 | ArrayExpress E-MTAB-2836
Other
Human Protein Atlas resource | Ponten et al., 2008; Uhlen et al., 2015 | http://www.proteinatlas.org/
CONTACT FOR REAGENT AND RESOURCE SHARING
Further information and requests for reagents may be directed to, and will be fulfilled by the corresponding author, Dr. Lynn Marie
Butler ([email protected]).
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Human tissue transcript profiling was performed in house as part of the Human Protein Atlas (HPA) project (Ponten et al., 2008; Uhlen
et al., 2015) (http://www.proteinatlas.org/). A total of 124 individual human tissue samples from 32 different organs (details in
Table S1) were obtained from the Department of Pathology, Uppsala University Hospital, Uppsala, Sweden, as part of the Uppsala
Biobank. Samples were handled in accordance with Swedish laws and regulations, with approval and advisory reports from the Uppsala
Ethical Review Board (Uhlen et al., 2015).
METHOD DETAILS
Human Tissue Preparation and Transcript Profiling
Tissue samples were embedded in optimal cutting temperature compound and stored at −80°C. Hematoxylin and eosin (H&E)-stained
frozen sections (4 μm) were prepared from each sample and examined by a pathologist to confirm sampling of representative normal
tissue. Three sections per sample were homogenized using a 3 mm metal grinding ball (VWR), and total RNA was extracted using
the RNeasy Mini Kit (QIAGEN) according to the manufacturer's instructions. Extracted RNA was analyzed using either an Experion
automated electrophoresis system (Bio-Rad Laboratories) with the standard-sensitivity RNA chip or an Agilent 2100 Bioanalyzer
system (Agilent Biotechnologies) with the RNA 6000 Nano LabChip Kit. Only high-quality RNA (RNA integrity number ≥ 7.5) was
used for library preparation (polyA) and sequencing. Next-generation RNA sequencing was performed on Illumina HiSeq 2000
and HiSeq 2500 instruments using the standard Illumina RNA-seq protocol, with paired-end reads of 2 × 100 bp or 2 × 125 bp and on average
50 M reads per library (range 13–84 M reads). Processed reads were mapped to the human genome (GRCh37 and GRCh38) using
TopHat v2.0.8b (Kim et al., 2013), allowing for two mismatches. Transcript abundance FPKM (fragments per kilobase of exon model
per million mapped reads) values were calculated using Cufflinks v2.1.2 (Trapnell et al., 2010) and Ensembl build 75 (Flicek et al.,
2014) or Ensembl build 83 (Yates et al., 2016) using summarized gene FPKM, not accounting for different isoforms in the analysis.
The number of protein coding genes mapped was 20,344.
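As a rough illustration of the FPKM unit defined above, the sketch below computes a naive FPKM from a fragment count, a summed exon length, and a library size. It is not a reimplementation of Cufflinks, which estimates abundances with its own statistical model; the function name and the example numbers are hypothetical and serve only to show the arithmetic.

```python
def naive_fpkm(fragments: float, exon_length_bp: float, total_mapped: float) -> float:
    """Fragments per kilobase of exon model per million mapped reads.

    Hypothetical helper: a direct reading of the unit, not Cufflinks' estimator.
    """
    if exon_length_bp <= 0 or total_mapped <= 0:
        raise ValueError("exon length and library size must be positive")
    kilobases = exon_length_bp / 1_000      # exon model length in kb
    millions = total_mapped / 1_000_000     # library size in millions
    return fragments / (kilobases * millions)


# Hypothetical gene: 500 fragments, 2,000 bp of exon, 50 M mapped reads
example_fpkm = naive_fpkm(500, 2_000, 50_000_000)
```

Dividing by both exon length and library size is what makes values comparable between genes of different length and between libraries of different depth, which the cross-tissue correlation analysis relies on.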
Estimation of Endothelial Cell Percentage in Selected Tissues
The percentage of EC was estimated in fresh frozen tissue samples from bone marrow, pancreas, ovary, tonsil, salivary gland,
appendix, spleen, thyroid gland, gall bladder, urinary bladder, heart muscle and lung (2–5 individual samples/organ; see Table S1) that
were used for RNA extraction and mRNA sequencing. A cryosection was stained using hematoxylin-eosin (H&E), and a pathologist or
a trained technician identified the constituent EC by high-power microscopy using established morphological criteria
(Malatesta, 2016), specifically elongated cell nuclei surrounding clear vascular spaces or slits. The mean percentage was estimated
from at least 4 representative fields from each individual tissue sample.
Tissue Profiling: Human Tissue Sections
Tissue microarrays (TMA) were generated and stained as part of the HPA project, as previously described (Kampf et al., 2012; Ponten
et al., 2008). Briefly, formalin-fixed and paraffin-embedded tissue samples were sectioned, de-paraffinised in xylene, hydrated in graded
alcohols and blocked for endogenous peroxidase in 0.3% hydrogen peroxide diluted in 95% ethanol. For antigen retrieval, a Decloaking
chamber (Biocare Medical, CA) was used. Slides were boiled in citrate buffer, pH 6 (Lab Vision, CA). Primary antibodies against
CLEC14A (Atlas Antibodies HPA039468), VWF (Atlas Antibodies HPA001815), CD34 (Atlas Antibodies HPA036722), HSPA12B (Atlas
Antibodies HPA013659), ENG (Novocastra NCL-CD105), ESM1 (Atlas Antibodies HPA036660), PTPRC (Atlas Antibodies HPA000440),
ITGA2B (Atlas Antibodies HPA031168), MYH11 (Atlas Antibodies HPA014539), EHD2 (Atlas Antibodies HPA049890), LIMS2 (Atlas
Antibodies HPA058340), FAM110D (Atlas Antibodies HPA013664), KANK3 (Atlas Antibodies HPA051153) or GIPC3 (Atlas Antibodies
HPA061258) and a dextran polymer visualization system (UltraVision LP HRP polymer, Lab Vision) were incubated for 30 min each
at room temperature, and slides were developed for 10 min using diaminobenzidine (Lab Vision) as the chromogen. Slides were
counterstained in Mayer's hematoxylin (Histolab) and scanned using a ScanScope XT (Aperio).
Transcript Profiling: Isolated Human Endothelial Cells
Human umbilical vein endothelial cells (HUVEC) were isolated from umbilical cords from four different donors, as previously described
(Cooke et al., 1993). Cells were maintained in Medium 199 (M199, Invitrogen) containing 20% fetal calf serum, 28 μg/ml gentamycin,
2.5 μg/ml amphotericin B, 1 ng/ml epidermal growth factor and 1 μg/ml hydrocortisone (all from Sigma) for 48 hr prior to processing.
HUVEC cultures isolated using this method were 96%–98% pure, as determined by positive staining for CD105, CD31 and vWF by flow
cytometry and by elevated expression of intercellular adhesion molecule-1 (ICAM-1) and E-selectin following stimulation
with the inflammatory cytokine interleukin-1β. Total HUVEC RNA was isolated using the RNeasy mini kit with QIAshredder (QIAGEN)
according to the manufacturer's instructions. RNA integrity number was > 8.0 for all samples. RNA sequencing was performed using
the standard Illumina RNA-seq protocol. FPKM (fragments per kilobase of exon model per million mapped reads) values were
calculated using Cufflinks v2.1.2 (Trapnell et al., 2010) and Ensembl build 75 (Flicek et al., 2014). The number of protein coding genes
mapped was 20,073. Normalized microarray gene expression datasets for human bladder microvascular EC (HBMEC; GSM72644), human
iliac artery EC (HIAEC; GSM72657, GSM72658, GSM72659, GSM72660), human saphenous vein EC (HSVEC; GSM72683,
GSM72683), human umbilical artery EC (HUAEC; GSM72686, GSM72687, GSM72688, GSM72689, GSM72690, GSM72691) and
human uterine microvascular EC (HUtMEC; GSM72692, GSM72692) were derived from a public dataset of 61 different normal human cell
cultures (GSE3239, GE Codelink Human Uniset) downloaded from NCBI-GEO (http://www.ncbi.nlm.nih.gov/geo/).
QUANTIFICATION AND STATISTICAL ANALYSIS
Analysis of RNA-Seq Data to Determine Pan EC-Enriched Transcripts
As EC are present in all human tissues, at differing levels, we used a correlation analysis method to identify EC-enriched gene
transcripts from the whole tissue RNA-seq data described above. We calculated the pairwise Spearman correlation coefficients between
the EC reference transcripts C-type lectin domain family 14, member A (CLEC14A), von Willebrand factor (VWF) and CD34 (CD34) and the
20,073 mapped protein coding genes. A mean Spearman correlation coefficient between a 'test' transcript and the EC reference
transcripts of 0.5 or above was considered a positive result. All statistical analyses were performed in R (version 3.1.1).
Correlation values were calculated using the cor() function with method = "spearman" and use = "complete". Linear regression was
performed using the lm() function with default parameters. Multiple comparison correction of p values was done with p.adjust() using
both method = "fdr" and method = "bonferroni". We measured mean correlation coefficients between three selected SMC
reference transcripts, Myosin, Heavy Chain 11, Smooth Muscle (MYH11), Myosin Light Chain Kinase (MYLK) and Actin, Alpha 2,
Smooth Muscle, Aorta (ACTA2), and the identified EC-enriched transcripts. Transcripts with a higher mean correlation with the SMC
reference transcripts than with the EC reference transcripts were excluded.
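The selection rule described above can be sketched in a few lines. The authors worked in R with cor(); the Python below is only an illustrative stand-in that uses the standard definition of Spearman's coefficient (Pearson correlation of average ranks). The function names, the toy expression values, and the choice to skip the reference transcripts themselves are assumptions made for this example, not details taken from the study.

```python
from statistics import mean

EC_REFS = ("CLEC14A", "VWF", "CD34")
SMC_REFS = ("MYH11", "MYLK", "ACTA2")


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy) if sx and sy else float("nan")


def mean_corr(expr, gene, refs):
    """Mean Spearman correlation of one gene with a set of reference genes."""
    return mean(spearman(expr[gene], expr[r]) for r in refs)


def ec_enriched(expr, threshold=0.5):
    """expr maps gene -> expression values, one per tissue sample."""
    hits = []
    for gene in expr:
        if gene in EC_REFS or gene in SMC_REFS:
            continue  # assumption: reference transcripts are not tested
        ec = mean_corr(expr, gene, EC_REFS)
        smc = mean_corr(expr, gene, SMC_REFS)
        # keep genes at or above the threshold unless SMC correlation is higher
        if ec >= threshold and not smc > ec:
            hits.append(gene)
    return hits


# Toy data (hypothetical values, six samples)
toy = {
    "CLEC14A": [1, 2, 3, 4, 5, 6], "VWF": [2, 3, 5, 7, 11, 13],
    "CD34": [1, 3, 2, 5, 4, 6], "MYH11": [6, 5, 4, 3, 2, 1],
    "MYLK": [5, 6, 4, 2, 3, 1], "ACTA2": [6, 4, 5, 3, 1, 2],
    "GENE_EC": [10, 20, 30, 40, 50, 60],
    "GENE_SMC": [60, 50, 40, 30, 20, 10],
    "GENE_FLAT": [5, 5, 5, 5, 5, 5],
}
selected = ec_enriched(toy)
```

Because the method works on ranks, it depends only on whether a transcript rises and falls across tissues together with the reference transcripts, not on absolute expression levels.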
Analysis of GTEx RNA-Seq Data as Replication
The gene RPKM and sample attributes tables of GTEx version 6 (dbGaP Accession phs000424.v6.p1) were downloaded from the
GTEx portal (http://www.gtexportal.org/home/). Data of samples originating from the tissues listed in Table S2, tab 6 were extracted
from the expression matrix, and pairwise Spearman correlation coefficients with the EC reference transcripts CLEC14A, VWF and CD34
were calculated on these data using the previously described methods. A mean correlation coefficient ≥ 0.5 between the
EC-enriched transcripts identified using the HPA material and the EC reference transcripts in the GTEx material was considered a positive
replication result. A summary of the complete selection protocol is shown in Figure S3D.
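The replication criterion amounts to re-checking each HPA-derived candidate against the same threshold in the second dataset. A minimal sketch follows, assuming the mean correlation of every gene with the EC reference transcripts has already been computed on the GTEx data; the function name, gene labels, and values are hypothetical.

```python
def replicated(hpa_hits, gtex_mean_corr, threshold=0.5):
    """Split HPA-derived candidates by their outcome in the replication data.

    gtex_mean_corr maps gene -> mean Spearman correlation with the EC
    reference transcripts in the replication dataset (precomputed).
    """
    kept, failed, missing = [], [], []
    for gene in hpa_hits:
        rho = gtex_mean_corr.get(gene)
        if rho is None:
            missing.append(gene)   # gene absent from the replication matrix
        elif rho >= threshold:
            kept.append(gene)      # positive replication result
        else:
            failed.append(gene)    # below threshold in replication data
    return kept, failed, missing


# Hypothetical candidates and replication correlations
kept, failed, missing = replicated(
    ["GENE_A", "GENE_B", "GENE_C"],
    {"GENE_A": 0.72, "GENE_B": 0.31},
)
```

Reporting genes that are missing from the replication matrix separately from those that fall below the threshold keeps annotation mismatches between the two datasets from being counted as failed replications.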
Gene Ontology (GO) Enrichment Analysis
The Gene Ontology Consortium (Ashburner et al., 2000) and PANTHER classification resource (Mi et al., 2013; Mi et al., 2016) were
used to identify overrepresented GO terms (biological processes; GO database release date March 2016) in the final panel of
identified EC-enriched transcripts.
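To make the idea of an overrepresented term concrete, the sketch below computes a one-sided hypergeometric p value for a single GO term. This is a common choice for overrepresentation testing and is offered only as an assumption; the text does not state which test statistic the PANTHER tool applied, and the function name and counts are hypothetical.

```python
from math import comb


def overrepresentation_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background set
    K: background genes annotated with the term
    n: genes in the list of interest
    k: genes in the list annotated with the term
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("counts are inconsistent")
    upper = min(n, K)
    # exact integer arithmetic; math.comb returns 0 for impossible draws
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical term: 3 of 10 background genes annotated, both genes of a
# 2-gene list carry the annotation
p_example = overrepresentation_p(k=2, n=2, K=3, N=10)
```

When many terms are tested, the resulting p values would still need a multiple-comparison correction such as the FDR or Bonferroni adjustments mentioned in the statistical analysis section.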
DATA AND SOFTWARE AVAILABILITY
HUVEC sequencing data have been deposited in ArrayExpress under accession number E-MTAB-4897.
ADDITIONAL RESOURCES
The Human Protein Atlas (HPA) website contains details of all sequencing data and antibody-based protein profiling used in this
study: http://www.proteinatlas.org/