

Introduction

The extracellular matrix (ECM) is composed of a variety of proteins secreted by the cell and self-organized into a complex mesh of fibres and soluble components. These materials are capable of forming a diverse set of structures (e.g. bone, blood vessels). A number of biological processes are influenced by the surrounding matrices including cell adhesion, migration, proliferation and differentiation. Changes in the structure and function of ECM components are known to be associated with a number of complex and diverse diseases such as arthritis, atherosclerosis and cancer [3].

Although there has been considerable growth in databases concerned with metabolic proteins, kinases or other signaling elements, very little attention has been devoted to structural proteins and the role of network connectivity in their self-organization.

Acknowledgments

Thanks to my supervisory committee: Johanna Rommens, Andrew Emili, Gary Bader and my supervisor, John Parkinson, for advice and encouragement. Thanks also to James Wasmuth, David He and members of the Parkinson Lab for copious amounts of tea, advice and discussions. Funding for this project was generously provided by the Heart and Stroke Foundation.

Conclusions

We have created the first interaction map of the extracellular matrix and estimate that the catalogue of human ECM proteins and their interactors may exceed 2500 proteins. If current estimates of the number of genes in the human genome (approximately 30,000) are correct, this implies that close to 10% of human genes are dedicated to extracellular organization. This level of complexity goes well beyond typical conceptual representations of the extracellular matrix (e.g. Fig. 4) and justifies comprehensive analysis of this system.
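The proportion quoted above can be checked with a short calculation. This is only a sketch: it uses the two figures given in the text and assumes, for illustration, one protein per gene, which the poster does not state.

```python
# Figures quoted in the text; the one-protein-per-gene mapping is an assumption.
estimated_ecm_proteins = 2500    # ECM proteins plus interactors (estimate)
estimated_human_genes = 30_000   # approximate human gene count used in the text


def genome_fraction(n_proteins: int, n_genes: int) -> float:
    """Return the share of genes accounted for by n_proteins."""
    return n_proteins / n_genes


share = genome_fraction(estimated_ecm_proteins, estimated_human_genes)
print(f"{share:.1%}")  # 8.3%
```

At 2500 proteins the share is a little over 8%, so "close to 10%" holds if the final catalogue grows beyond the current estimate.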

Our efforts demonstrate that the general lack of attention paid to structural proteins in databases also extends to GO annotations. Consequently, additional terms will need to be included to capture all of the known ECM components, including all related biological process and molecular function terms. Mapping of annotated proteins from mouse and rat will aid considerably in addressing the surprisingly incomplete human annotations. We found that 2402 rat proteins are annotated as ECM components, compared with 1682 for human.

Since 40% of the ECM proteins we have identified so far have no known interactions in BioGRID, there is considerable opportunity to extend the network by examining additional datasets (e.g. MINT, IntAct, BIND, DIP, HPRD), between which there appears to be only minimal overlap [1].
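One way to quantify the overlap between interaction datasets is the Jaccard index over undirected protein pairs. The sketch below is illustrative only: the dataset names and protein identifiers are hypothetical placeholders, not records from any of the databases named above.

```python
from itertools import combinations


def normalise(pairs):
    """Treat each interaction as undirected, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs}


def jaccard(edges_a, edges_b):
    """Shared interactions divided by all interactions seen in either set."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Hypothetical toy edge lists; real exports would need identifier mapping first.
datasets = {
    "db_one": normalise([("P1", "P2"), ("P1", "P3"), ("P4", "P5")]),
    "db_two": normalise([("P2", "P1"), ("P4", "P6")]),
}

for (name_a, edges_a), (name_b, edges_b) in combinations(datasets.items(), 2):
    print(name_a, name_b, jaccard(edges_a, edges_b))
```

A low score between two sources suggests that merging them would add many new edges rather than repeat existing ones.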

Future Work

Enlarge the human ECM map based on orthology and an expanded list of GO terms. Include interaction data from additional databases. Provide a detailed functional annotation of the resulting network. Construct ECM networks of other metazoans as a basis for determining adaptation and evolutionary conservation.

Assembling the interactome of human extracellular matrix to understand its role in health and disease

Graham L. Cromar and John Parkinson
Program in Molecular Structure and Function, Hospital for Sick Children, Toronto, Ontario

For further information

Please contact [email protected]. A copy of this poster as well as more information on this and related projects can be obtained at www.compsysbio.org/lab/.

Literature cited

1. Cesareni et al. 2005. FEBS Lett. 579(8):1828-1833.
2. Nielsen 2001. Animal Evolution. Second ed. Oxford University Press.
3. Online Mendelian Inheritance in Man, OMIM (TM). Johns Hopkins University, Baltimore, MD. MIM Number: #123700 (4/19/2006). World Wide Web URL: http://www.ncbi.nlm.nih.gov/omim/
4. Shannon et al. 2003. Genome Res. 13(11):2498-2504. World Wide Web URL: http://www.cytoscape.org/
5. Stark et al. 2006. Nucleic Acids Res. 34:D535-D539. World Wide Web URL: http://www.thebiogrid.org/
6. The Gene Ontology Consortium. 2000. Gene Ontology: tool for the unification of biology. Nature Genet. 25:25-29 (accessed March 2007). World Wide Web URL: http://wiki.geneontology.org/

Figure 4. Typical representations of the extracellular matrix, such as this one, include perhaps a dozen components and grossly under-represent the true complexity of this system. Based on our findings we estimate that close to 10% of the 30,000 genes in the human genome may be involved in extracellular organization. Image from: www.e22.physik.tu-muenchen.de/bausch/Oli_ECM.html

Results

Our initial network representing human ECM proteins and their interactors consists of 361 nodes and 547 edges (Fig. 3, inset top right). Of the proteins identified as matrix components based on Gene Ontology (GO), 61 (40%) have no known interactions in BioGRID.

Figure 3 (Inset top right): A physical protein-protein interaction map of the human extracellular matrix based on interactions from curated literature sources deposited in BioGRID [5]. A list of ECM proteins was derived from Gene Ontology [6] (all nodes shown in blue). Interactors resulting from the BioGRID search are shown in yellow. (Main figure): A sub-network showing elastin (ELN) and its nearest neighbours. Many of the interactors are known ECM proteins that should have been picked up in the initial search of the GO data. (Inset left): The ECM network appears to be rooted in core structural components such as various collagens featured in the sub-network shown here.

Comparing the ECM of several metazoans (Fig. 1) allows us to explore the evolution of self-organization and its normal role in the development and maintenance of multicellularity. Evolutionary conservation, for instance, can identify functionally important network components. A proper understanding of such functions will shed light on the ECM's role in health and disease.

[Figure 1 tree labels: Sponge, Hydra, Worm, Fly, Human, Mouse, Fish]

Figure 2. ECM interactions were derived by filtering the Gene Ontology [6] and cross-referencing the resulting proteins to BioGRID [5]. Cytoscape [4] was used to render the network.

Careful examination of sub-graphs such as that of elastin (ELN) and its nearest neighbours (Fig. 3 main) demonstrates that many of the interactors identified from the BioGRID dataset are known ECM components missed in the initial GO search due to incomplete annotation of these proteins in the Gene Ontology.
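The kind of sub-graph inspection described here can be sketched as follows: extract a protein's nearest neighbours from an edge list, then flag the neighbours that are absent from the GO-derived seed list. Apart from ELN, the identifiers are hypothetical placeholders, and the edge list is toy data rather than BioGRID content.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def nearest_neighbour_subgraph(edges, centre):
    """Return the edges among the centre protein and its direct neighbours."""
    adjacency = build_adjacency(edges)
    members = {centre} | adjacency.get(centre, set())
    return [(a, b) for a, b in edges if a in members and b in members]


def neighbours_missing_from_seeds(edges, centre, seeds):
    """List direct neighbours of the centre that the seed list did not include."""
    adjacency = build_adjacency(edges)
    return sorted(adjacency.get(centre, set()) - seeds)


# Hypothetical toy data: X1..X4 are placeholders, not real interactors of ELN.
toy_edges = [("ELN", "X1"), ("ELN", "X2"), ("X1", "X2"), ("X2", "X3"), ("X3", "X4")]
toy_seeds = {"ELN", "X1", "X3"}

print(nearest_neighbour_subgraph(toy_edges, "ELN"))
print(neighbours_missing_from_seeds(toy_edges, "ELN", toy_seeds))
```

Each flagged neighbour is only a candidate: whether it is a genuine ECM component missed by annotation still has to be checked against the literature.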

The corresponding orthologues in rat and mouse appear to be much more completely annotated (data not shown). A subsequent attempt to retrieve proteins matching all possible cellular component, biological process and molecular function terms associated with the extracellular matrix shows that the ECM graph can be expanded to at least 1682 nodes.

The network appears to be rooted in core structural components. Key amongst these are various collagens, which are either adjacent to one another or connected by short paths (Fig. 3 inset left).
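The notion of "short path lengths" can be made concrete with a breadth-first search, which returns the minimum number of edges separating two proteins. This is a generic sketch, not the analysis used for the poster, and the node names are hypothetical placeholders.

```python
from collections import deque


def shortest_path_length(adjacency, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == target:
                return distance + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# Hypothetical toy network: COL_A..COL_D stand in for collagens, X1 for a linker.
toy_adjacency = {
    "COL_A": {"COL_B", "X1"},
    "COL_B": {"COL_A"},
    "X1": {"COL_A", "COL_C"},
    "COL_C": {"X1"},
    "COL_D": set(),
}

print(shortest_path_length(toy_adjacency, "COL_A", "COL_B"))  # 1 (adjacent)
print(shortest_path_length(toy_adjacency, "COL_A", "COL_C"))  # 2 (via X1)
print(shortest_path_length(toy_adjacency, "COL_A", "COL_D"))  # None
```

Comparing the path lengths among collagens with those among randomly chosen node pairs would be one way to test whether the collagens really form a core.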

Materials and methods

The Gene Ontology (GO) project [6] addresses the need for a consistent vocabulary in describing biological processes, cellular components and molecular functions associated with gene products. We derived a list of ECM proteins matching the cellular component terms extracellular matrix part, middle lamella-containing extracellular matrix and proteinaceous extracellular matrix. These proteins were cross-referenced in BioGRID [5], a database containing over 116,000 literature-curated interactions. The network was rendered in Cytoscape [4].
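The filtering and cross-referencing steps can be sketched as below. The three term names come from the text; everything else is an assumption made for illustration, including the (protein, term) and (protein, protein) record layouts, which are simplified stand-ins for the actual GO annotation and BioGRID file formats, and the placeholder protein identifiers.

```python
# Cellular component terms named in the text.
ECM_TERMS = {
    "extracellular matrix part",
    "middle lamella-containing extracellular matrix",
    "proteinaceous extracellular matrix",
}


def select_seed_proteins(annotations, terms):
    """Return proteins annotated with at least one of the given terms."""
    return {protein for protein, term in annotations if term in terms}


def cross_reference(seeds, interactions):
    """Keep undirected interactions that involve at least one seed protein."""
    edges = {frozenset((a, b)) for a, b in interactions if a in seeds or b in seeds}
    nodes = set().union(*edges) if edges else set()
    return nodes, edges


# Hypothetical toy records; P1..P6 are placeholders, not real proteins.
toy_annotations = [
    ("P1", "proteinaceous extracellular matrix"),
    ("P2", "extracellular matrix part"),
    ("P3", "nucleus"),
    ("P4", "proteinaceous extracellular matrix"),
]
toy_interactions = [("P1", "P2"), ("P1", "P5"), ("P3", "P6"), ("P2", "P1")]

seeds = select_seed_proteins(toy_annotations, ECM_TERMS)
nodes, edges = cross_reference(seeds, toy_interactions)
interactors = nodes - seeds      # non-seed partners (yellow nodes in Fig. 3)
unconnected = seeds - nodes      # seed proteins with no recorded interaction

print(len(nodes), len(edges), sorted(interactors), sorted(unconnected))
```

Storing each interaction as a `frozenset` collapses reciprocal records such as (P1, P2) and (P2, P1) into a single edge, so the edge count is not inflated by duplicates.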

Figure 1. A phylogeny derived primarily from morphological features (after [2]) emphasizing the common names of some organisms we hope to include in our study.