Evolution of Interactions Involving Intrinsically...

46
ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2018 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1424 Evolution of Interactions Involving Intrinsically Disordered Proteins EMMA ÅBERG ISSN 1651-6206 ISBN 978-91-513-0226-3 urn:nbn:se:uu:diva-336083

Transcript of Evolution of Interactions Involving Intrinsically...

Page 1: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2018

Digital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Medicine 1424

Evolution of Interactions InvolvingIntrinsically Disordered Proteins

EMMA ÅBERG

ISSN 1651-6206ISBN 978-91-513-0226-3urn:nbn:se:uu:diva-336083

Page 2: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

Dissertation presented at Uppsala University to be publicly examined in B41, Biomedicinsktcentrum, Husargatan 3, Uppsala, Friday, 16 March 2018 at 09:15 for the degree of Doctor ofPhilosophy (Faculty of Medicine). The examination will be conducted in English. Facultyexaminer: Doctor Madan Babu (MRC Laboratory of Molecular Biology).

AbstractÅberg, E. 2018. Evolution of Interactions Involving Intrinsically Disordered Proteins. DigitalComprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1424.44 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-0226-3.

This thesis describes the evolution of intrinsically disordered proteins and their interactionpartners. The work presented is a combination of phylogenetic analysis, ancestral reconstructionand biophysical characterization in order to examine the evolutionary trajectory of protein-protein interactions involving disorder. The intrinsically disordered domains, NCBD and CIDare both part of transcriptional co-regulating proteins. In evolution, NCBD existed before theemergence of CID and the most ancient domains display a low affinity complex with manyweak contacts and high degree of conformational heterogeneity. Later in evolution, when NCBDand CID co-exists, a few mutations have altered the interaction in a way that the affinity isincreased 25-fold and the conformational heterogeneity is decreased. In the same manner, theinteraction is further optimized in extant species, resulting in a high affinity complex with lesscontacts of higher strength and less conformational heterogeneity. The intrinsically disorderedtransactivation domain of the tumour suppressing protein p53 and its negative regulator MDM2date back to the beginning of animal life. The interacting domains are either lost or conservedin distinct phyla indicating a tight co-evolution. Phylogenetic trees produced by only includingphyla with a conserved interaction domain follow the species evolution. Resurrection of p53 andMDM2 in the vertebrate lineage display an evolution of a high affinity complex in the ancestorof fish and tetrapods to a slightly improved affinity in modern tetrapods but a substantiallylower affinity in zebrafish. The p53 protein family, which also includes p63 and p73, divergedfrom a common ancestor. The individual proteins display altered affinities to MDM2 which is aresult of the high sequence divergence between them. The ionic dependence for the interactionsis small, and not in line with other studies of disordered proteins. In conclusion, the work inthis thesis have contributed with evolutionary analysis and experimental data of interactionsinvolving intrinsically disordered proteins.

Keywords: intrinsically disordered proteins, protein evolution, phylogenetics, ancestralsequence reconstruction, biophysical characterization, NCBD, CID, p53, MDM2

Emma Åberg, Department of Medical Biochemistry and Microbiology, Box 582, UppsalaUniversity, SE-75123 Uppsala, Sweden.

© Emma Åberg 2018

ISSN 1651-6206ISBN 978-91-513-0226-3urn:nbn:se:uu:diva-336083 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-336083)

Page 3: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

To my ancestors, Mum and dad

Page 4: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan
Page 5: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Hultqvist G, Åberg E, Camilloni C, Sundell GN, Andersson E,

Dogan J, Chi CN, Vendruscolo M, Jemth P (2017). Emergence and evolution of an interaction between intrinsically disordered proteins. eLife, 6:e16059

II Åberg E, Saccoccia F, Grabherr M, Ore WYJ, Jemth P,

Hultqvist G (2017). Evolution of the p53-MDM2 pathway. BMC Evolutionary biology, 17(1):177

III Åberg E, Andersson E, Jemth P (2018). Conservation and di-

vergence in the evolution of binding affinity between p53 and MDM2. Manuscript

IV Åberg E, Karlsson AO, Andersson E, Jemth P (2018). The

binding kinetics of the intrinsically disordered p53 family trans-activation domains and MDM2. Manuscript

Reprints were made with permission from the respective publishers.

Page 6: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan
Page 7: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

Contents

Introduction ..................................................................................................... 9Evolution of Proteins .................................................................................. 9

Whole Genome Duplication ................................................................ 10Intrinsically Disordered Proteins .............................................................. 11Background to the Thesis Project ............................................................. 12

The NCBD-CID Interaction ................................................................ 12The p53-MDM2 Interaction ................................................................ 13

Aim ........................................................................................................... 15

Materials and Methods .................................................................................. 16Evolutionary Analysis of Proteins ............................................................ 16

Sequence Identification ....................................................................... 16Sequence Alignments and Phylogenetic Analysis ............................... 16Ancestral Sequence Reconstruction and Resurrection ........................ 17

Experimental Analysis of Protein-Protein Interactions ............................ 18Isothermal Titration Calorimetry ......................................................... 19Stopped Flow Fluorimetry ................................................................... 20

Experimental Analysis of Protein Secondary Structure ........................... 22Circular Dichroism .............................................................................. 23

Results ........................................................................................................... 25The Evolution of the NCBD-CID Interaction .......................................... 25The Evolution of the p53-MDM2 Interaction .......................................... 28

Concluding Remarks and Future Perspectives ............................................. 33

Populärvetenskaplig sammanfattning ........................................................... 35

Acknowledgements ....................................................................................... 38

References ..................................................................................................... 39

Page 8: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

Abbreviations

BLAST Basic Local Alignment Search Tool CD Circular Dichroism CREB Cyclic Adenosine Monophosphate Response

Element Binding Protein CREBBP CREB Binding Protein CID CREBBP Interacting Domain IDP Intrinsically Disordered Protein ITC Isothermal Titration Calorimetry MDM2 Mouse Double Minute 2 NCBD Nuclear Co-Activator Binding Domain NcoA Nuclear Receptor Co-Activator PTM Posttranslational Modification TFE Trifluoroethanol WGD Whole Genome Duplication

Page 9: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

9

Introduction

When people ask me what I do, I usually answer “It’s just like in the movie Jurassic Park, I resurrect the past to examine it in the present”. In this thesis, I will give you the longer version of the answer to that question. I think of myself as a biochemical archaeologist, a person who digs through the past trying to solve evolutionary puzzles. Here, I will present a combination of computational and experimental studies of intrinsically disordered proteins (IDPs) and their evolution. I have utilized a method called ancestral se-quence reconstruction with the aim to resurrect proteins that existed hun-dreds to thousands million years ago. The objective has been to follow an interaction involving intrinsically disordered proteins from an ancestor to present day species and examine biophysical properties such as affinity and (lack of) secondary structure. Intrinsic disorder is a common feature in pro-teins responsible for regulation and cell signaling, these functions are crucial for maintenance of vital cellular processes. It is therefore of high interest to acquire more knowledge about how intrinsically disordered proteins interact with other proteins and understand the underlying mechanisms. In addition, understanding the evolution of proteins containing disorder could clarify why disorder appears to have become of such importance. The novelty in the content of this thesis is the combination of ancestral reconstruction and in-trinsically disordered proteins, which to my best knowledge never been con-ducted before. The work in this thesis have contributed with evolutionary analysis and experimental data of interactions involving intrinsically disor-dered proteins.

Evolution of Proteins Proteins evolve through gene duplication and divergence events (1). Gene duplications are a form of genetic backup, which creates raw genetic materi-al and the possibility for novel protein function. Duplicated genes from a common ancestor are called sister genes and constitute a gene family. Genes present in the same genome are called paralogs while related genes in differ-ent organisms separated by speciation events are called orthologs. The fate of a duplicated gene is dependent on random mutations and selection pres-sure and can therefore take different routes. The most common fate of a du-plicated gene is the disappearance from the genome (2). However, retention

Page 10: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

10

of genes leads to functional divergence. There are different kinds of func-tional divergence, subfunctionalization and neofunctionalization. Subfunc-tionalization is an event where a division of the ancestral functions can take place. Another route is neofunctionalization, where one of the sister genes acquire a new function not present in the ancestor (3). The preservation of genes can also happen through an initial subfunctionalization, extending the time and possibility for neofunctionalization (4).

Whole Genome Duplication Whole genome duplications (WGDs) are a result of either autopolyploidy or allopolyploidy. In autopolyploidy two fertilized diploid oocytes fuse and form a single cell which has two sets of chromosomes. Whereas, in allopol-yploidy the chromosomes are duplicated but the cell does not divide. WGD is common in plants (5) and has also occurred within the animal lineage (6–8). In an early vertebrate, two WGDs, entitled 1R and 2R, resulted in four copies of the entire genome. The exact time of the WGD events is estimated to be around 500-430 million years ago, however, it is still debated whether the cyclostomes (lampreys and hagfish) have gone through the 2R duplica-tion (9–11). In the teleost fish lineage, a third round of whole genome dupli-cation took place (12, 13) (Fig 1). The retention of genes after a WGD event is higher compared to small scale duplications, this has given an extraordi-nary opportunity for expansion and divergence of species and new functions (14–17).

Figure 1. A tree depicting the three rounds of whole genome duplications in the vertebrate lineage, entitled 1R, 2R and 3R. The 1R and 2R gave rise to four copies in early vertebrates while the 3R occurred in the fish lineage. Thus, a total set of eight copies of a protein can be found in fish species.

Page 11: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

11

Intrinsically Disordered Proteins The basis of structural biology is that a function of a protein is determined by is structure, which in turn is determined by its sequence. However, a large fraction of the proteome does not fold into a well-defined structure but in-stead is disordered, i.e. populates an ensemble of different conformations. Intrinsic disorder was long thought of as non-functional. However, in the past 20 years disorder has been well studied and it has become increasingly clear that functional disorder is a common and essential feature in protein-protein interactions. Proteins exist and can natively act in a structural contin-uum, ranging from disordered regions of high flexibility, temporary structur-al elements, dynamic molten globules, disordered linkers to completely fold-ed proteins (18). The discrepancies in structural content can be described by the difference in amino acid composition. Intrinsically disordered regions are enriched in amino acids with charged or polar characteristics and comprise less hydrophobic residues (19, 20). The depletion of hydrophobic residues makes it difficult for IDPs to form a hydrophobic core, which constitutes the basis of structured proteins. Instead of folding into a specific tertiary struc-ture, IDPs sample numerous conformations. This can be displayed in a free energy landscape of an IDP, where the landscape is flatter and more rugged compared to structured proteins (21). Interactions involving IDPs are often mediated through motifs (22). When IDPs bind to an interaction partner, they may undergo a transition from disorder to order (23). This capability allows IDPs to interact with multiple partners, and can even adopt different bound structures to different interaction partners (24). Intrinsic disorder is common in proteins associated with regulation of cellular control mecha-nisms and signaling proteins such as transcription factors (25, 26). In such regulatory pathways, posttranslational modifications (PTMs) are a way to switch on and off functions. Therefore, it is not surprising that PTMs are enriched in disordered regions (27).

The abundance of IDPs is persistent in evolution and IDPs can be found in all domains of life, however disorder is more common in eukaryotic or-ganisms (28–30). The complexity of the eukaryotic proteome is associated with the presence of disordered proteins. For example, in eukaryotes alterna-tive splicing and, insertions and deletions are frequent in proteins containing disorder (31, 32). New domains which are a result of non-coding regions and have acquired functionality is often disordered (33). In yeast, the gain of disorder facilitates divergence and innovation after whole genome duplica-tion (34). Furthermore, protein enriched in PTMs have a higher retention after WGDs (35) and are often sensitive to gene expression (15) also corre-lated with disorder (36). IDP are proposed to rewire interactions networks more rapidly than ordered proteins (37).

The constrains on the evolutionary process is less for IDPs as compared to structural proteins (38–40). Though, disordered regions can be divided

Page 12: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

12

into different subcategories: Non-conserved, flexible and constrained (41). For example, disordered regions that can form secondary structure are con-served to a higher extent (42). In less conserved disordered regions, conser-vation of the chemical composition is however preserved (37), as well as length (43), motif and PTM sites (44).

Background to the Thesis Project In order to examine the evolution of intrinsically disordered proteins two model systems were chosen to study in detail. The first model system in-cludes the interaction between the intrinsically disordered domains NCBD and CID. Whereas, the second involves the intrinsically disordered transacti-vation domain in p53 and its structured binding partner MDM2. These sys-tems were selected since they act in transcriptional regulation and are re-sponsible for decision making in cellular processes, where their disordered nature appears to be of importance.

The NCBD-CID Interaction The nuclear co-activator binding domain (NCBD) is present in the cyclic adenosine monophosphate response element binding protein binding protein (CREBBP) and in the CREBBP paralog, p300. The CREBBP/p300 protein family contain a large proportion of disorder, however they also comprise structurally ordered domains such as Taz1, KIX, Bromo, PHD, HAT, ZZ and Taz2 (Fig 2A). The CREBBP/p300 proteins are hubs and can interact with around 400 proteins (45) and are involved in normal development (46, 47).

The NCBD domain interacts with the CREBBP interacting domain (CID) in the nuclear receptor coactivator (NcoA) protein family. This protein fami-ly include NcoA1 (also called SRC-1), NcoA2 (also called TIF2) and NcoA3 (also called ACTR). The NcoA protein family also comprise the following domains: PAS, ST, RID, and HAT (Fig 2B). The CREBBP/p300 and NcoA protein families act together in transcriptional regulation as co-activators (48). These protein families possess histone acetyltransferase activity and therefore has the ability to acetylate histones making DNA more accessible to transcription factors resulting in enhanced gene expression (49, 50).

Figure 2. Domain structure of A) the CREBBP/p300 protein family with NCBD highlighted in blue and B) NcoA protein family with CID highlighted in yellow.

Page 13: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

13

The disordered domain NCBD in CREBBP/p300 and CID in NcoA are in-teraction partners. Their interaction has been well studied in humans to ex-plore the fundamentals of interactions between intrinsically disordered do-mains. Previous studies have shown that NCBD adopts a molten globule structure with a small hydrophobic core (51) while CID has a more extended disorder with transient helical structure. The interaction between CREBBP NCBD and NcoA3 CID display a complex with fast association followed by conformational changes (52). The structure between CREBBP NCBD in complex with NcoA1 CID (53) and NcoA3 CID (54) reveal that NCBD forms a three-helical bundle, whereas the two CID domains assume three helixes with slightly different conformations. Though, the first helix in NCBD and the first helix in NcoA1 and NcoA3 CID adopt similar position (Fig 3).

Figure 3. NMR structure of CREBBP NCBD (blue) bound to A) Ncoa1 CID (or-ange) (PDB-code: 2C52) and B) NcoA3 CID (yellow) (PDB-code: 1KBH).

The interaction of the first helices is mediated through a LxxLL motif, pre-sent in both NCBD and the CID domains. This motif is common in protein-protein interactions of transcriptional regulating proteins (55). Mutational studies to increase the helical propensity in the first helix of CID alter the kinetics in binding to NCBD (56).

The p53-MDM2 Interaction The p53 protein, often referred to as the guardian of the genome, is a tran-scription factor responsible for regulating the fate of a cell (57, 58). The expression of p53 is regulated through ubiquitination by MDM2, initiating degradation by the proteasome (59–61). The activation of p53 is a result of various stress signals leading to phosphorylation, which in turn will trigger apoptosis or cell cycle arrest (62–64). The p53 protein is mutated in approx-imately half of all tumors, and therefore fails to execute its tumor suppress-ing abilities (65, 66).

Page 14: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

14

The p53 proteins contains a disordered transactivation domain (TAD), a DNA binding domain, an oligomerization domain and a C-terminal regulato-ry domain. The two paralogues of p53, p63 and p73, lack the CRD and in-stead possess a sterile alpha motif domain (Fig 4A). The MDM2 protein family also includes MDM4, and comprises a p53-binding domain (p53-BD), an acidic domain, a zinc finger domain and a RING finger domain (Fig 4B).

Figure 4. A) Domain structure of the p53 protein family, where the CRD is present in p53 and the SAM in p63 and p73. TAD is highlighted in red. B) Domain structure of the MDM2/MDM4 protein family with the p53-BD highlighted in blue.

The binding of p53 and MDM2 initiates the degradation of p53 in healthy cells. This interaction involves the intrinsically disordered TAD in p53 and the structured p53-BD in MDM2. Upon binding the p53 TAD forms an α-helix that fit into a hydrophobic cleft in MDM2. Three hydrophobic residues in the p53 TAD helix, FxxxWxxL, constitute the binding motif (67, 68) (Fig 5).

Figure 5. The crystal structure of a peptide derived from human p53 TAD (red) and the p53-binding domain in MDM2 (blue). The binding motif (FxxxWxxL) in p53 TAD is highlighted (PBD code: 1YCR).

The N-terminal of the p53-BD in MDM2 is very flexible and in the unbound state acts as a lid in the p53 binding pocket, upon binding the flexible linker is replaced by p53 TAD (69, 70). The binding motif in p53 TAD is crucial to maintain the affinity to MDM2, when mutating only one of these residues the p53 is unable to bind MDM2 (71). The helicity is influenced by prolines adjacent to the motif in p53 TAD, and substitution of the prolines to alanine increase both the helicity and affinity (72, 73). Furthermore, the phosphory-lation of Thr18 in p53 TAD abolishes binding to MDM2 initiating the acti-

Page 15: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

15

vation of p53 (74). This imply a well optimized fine-tuning of the binding properties, conformation and regulation in the p53-MDM2 interaction.

Aim The aim of this thesis is to examine and understand interactions involving intrinsically disordered proteins from an evolutionary perspective. The ques-tions that have been addressed are how disordered regions in proteins have evolved with emphasis on the evolutionary trajectory on a molecular level, and how disorder affects affinity to its binding partner. To answer these questions, I have used evolutionary biochemistry, which is a combination of ancestral sequence reconstruction and biophysical characterization. I have worked with two model systems:

The first model system involves the dynamic alpha helical molten globule NCBD present in the CREBBP/p300 protein family and its binding partner, the intrinsically disordered CID in the NcoA protein family. The emergence and evolution of the interaction was exam-ined with further biophysical characterization of secondary structure, affinity and dynamics (Paper I). The second model system includes the intrinsically disordered trans-activation domain of p53 and its structured negative regulator, the p53-binding domain of MDM2. Three separate studies were con-ducted. First, the evolution of p53 and MDM2 was examined by studying the conservation in species and their evolutionary relation-ship (Paper II). Second, ancestral sequence reconstruction and resur-rection was performed to elucidate the binding affinity at different evolutionary time points (Paper III). Finally, the binding kinetics at different ionic strength of the human paralogues p53, p63 and p73 binding to MDM2 was examined (Paper IV).

Page 16: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

16

Materials and Methods

Evolutionary Analysis of Proteins The advances in genome sequencing has led to faster and cheaper analysis. Together with improvements of phylogenetic methods, this have opened up possibilities in the field of evolutionary biology. The increasing number of sequenced genomes can be utilized to retrieve information about the conser-vation of a gene or gene family in different species and can help improve our understanding of evolutionary trajectories and relationships.

Sequence Identification The identification of sequences from the same gene family can be achieved by searching in various genome databases, for example the Ensembl genome browser. The Ensembl genome browser (ensembl.org) comprise several ver-tebrate genomes (75). The EnsemblCompara GeneTree function identifies paralogue and orthologue sequences of the same gene family (76). Ensem-blMetazoa (metazoa.ensembl.org) is a sister site to Ensembl and include invertebrate animals (77). To identify sequences within a gene family, the Basic Local Alignment Search Tool (BLAST) can be used (78). BLAST can identify regions with similar amino acid sequence in a genome database. The BLAST algorithm compares the input sequence with sequences in a database and gives a score of the similarity and a statistical value which corresponds to the hits expected to occur by chance in the searched database. In this the-sis BLAST was used in the following databases: The National Centre for Biotechnology Information (NCBI) (ncbi.nml.nih.gov), EchinoBase (ech-nionbase.org) (79) SkateBase (skatebase.org) (80) and Japanese Lamprey Genome Project (jlampreygenome.imcb.a-star.edu.sg) (81).

Sequence Alignments and Phylogenetic Analysis Multiple sequence alignment algorithms are used to align protein sequences to evaluate the conservation of different regions and binding motifs. A mul-tiple sequence alignment is required to infer phylogenetic relationships. Phy-logenetics is used to study the evolutionary history of a group of species. A phylogenetic tree is a branching diagram that symbolizes the evolutionary

Page 17: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

17

relationships of different sequences. The tips of the tree represent present-day species, while the nodes of the tree represent ancestors.

In this thesis, protein sequences were aligned using Guidance (guid-ance.tau.ac.il) (82, 83) with the MAFFT algorithm (84). Guidance gives an alignment confidence score for each column, residue and sequence. The score represents the robustness to perturbations in the output alignment. The Guidance algorithm also offers the possibility to remove columns and se-quences and mask residues below a certain cut-off value. Gaps in the align-ment were removed. The best amino acid substitution model was selected using MEGA6 (85, 86). The alignments were further used to calculate phy-logenetic trees where the Phylogenetic Maximum Likelihood was used (87). The tree was supported by a non-parametric SH-like approximate likelihood ration test (88).

Ancestral Sequence Reconstruction and Resurrection Ancestral sequence reconstruction is a method where the most recent com-mon ancestor of a protein can be predicted. The idea of reconstructing ances-tral sequences was suggested already in 1963 (89), however it took about 30 years of technical advancement until reconstruction became reality (90–92). Ancestral sequence reconstruction is based on an alignment of proteins from present-day species and their phylogenetic relationship. The alignment and the phylogenetic tree is based on the entire protein even though a single do-main is to be reconstructed. When using this method, it is important that the phylogenetic tree matches the evolution of species. It is also important that the domain subjected to reconstruction is well aligned. Transitive consisten-cy score can be used for evaluation of different alignments (93). The predic-tion of the ancestral sequences was conducted by using maximum likelihood, which takes branch lengths and relative rates of substitutions into account (94). The output gives the probability for each amino in each position. In the case of low probabilities, several probable amino acids can be resurrected to test different ancestral states. The resurrection part involves conventional methods like cloning, protein expression and purification to obtain a pure protein. The proteins can thereafter be characterized for different biophysical properties (Fig 6).

Page 18: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

18

Figure 6. Illustration of ancestral sequence reconstruction and resurrection. An alignment and a phylogenetic tree of present-day species is used to predict an ances-tor in a node. The predicted ancestral sequence is resurrected through synthesis of the coding DNA, followed by protein expression and purification. The pure protein can then be subjected to biophysical characterization (Figure from Paper I).

Experimental Analysis of Protein-Protein Interactions When two proteins bind to each other a second order reaction occurs, associ-ation and dissociation of a complex:

Binding events have thermodynamic components, enthalpy (H) that reflects the heat correlated to the strength of the bonds, entropy (S) which mirror the dynamics and solvent effects. The sum of enthalpy and entropy gives the Gibbs free energy (ΔG) of the interaction i.e. the difference in bound and unbound state:

where ΔH refers to the change in enthalpy, T the temperature and ΔS the change in entropy upon binding.

Page 19: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

19

Gibbs free energy can also be written in terms of the equilibrium dissocia-tion constant:

where R is the gas constant, T is the temperature and Kd is the equilibrium dissociation constant (Fig 7).

Figure 7. Illustration of how the Gibbs free energy is related to different kinetic rate and equilibrium constants.

Isothermal Titration Calorimetry Isothermal titration calorimetry (ITC) is a technique to study protein-protein interactions. When two proteins interact, thermodynamic changes occur and ITC can be used to determine the thermodynamic parameters (ΔH, ΔS, ΔG) of the reaction. However, the primary aim of ITC is to determine the equilib-rium dissociation constant (Kd) and the number of binding sites (n). The ITC will estimate the association equilibrium constant (Ka), which is the inverse dissociation equilibrium constant. Ka equals the concentration of the com-plex divided by the concentration of the unbound proteins:

The experimental setup involves stepwise injections of a protein placed in a syringe to its binding partner in a calorimetric sample cell. A reference calo-rimetric cell is kept at constant temperature and after each injection the tem-perature in the sample cell changes, that afterwards is adjusted back to main-tain minimal difference to the reference cell (Fig 8A). For each injection i, the heat (qi) of the reaction is measured and then is related to the concentra-tion of the formed complex:

Page 20: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

20

where V is the calorimetric cell volume, ΔHa the enthalpy of binding and [AB] the concentration of the complex at injection i. As the injections pro-ceed, the complex concentration increases and the heat of each injection decreases until complete saturation (Fig 8B).

Figure 8. A) The setup of the experiment is a titration of a protein placed in a sy-ringe to its binding partner in a cell at constant temperature. The change in tempera-ture is controlled via a reference cell. B) The change in heat when two proteins in-teract is measured in every injection. With every injection, the concentration of the protein complex increases until the protein in the cell is saturated and no change in heat can be observed. C) The change in heat over time is integrated to kcal/mol, plotted against the molar ratio and fitted to a sigmoidal function where the affinity (slope of the curve), enthalpy (difference in energy content of the bound and un-bound state) and stoichiometry (midpoint of the transition) of the protein interaction can be assessed.

The amount of heat released/absorbed at each injection is integrated and plotted against the molar ratio of the two interacting proteins. The resulting plot is fitted using non-linear regression, where the following parameters can be estimated: The equilibrium association constant (Ka) which is equal to the slope, the number of binding sites (n) which corresponds to the midpoint of the slope and enthalpy of the binding reaction (ΔHa) which is the total heat (Fig 8C). From these values, the free energy and the entropy of the reaction can be calculated. ITC can measure binding affinities in the nanomolar to micromolar range. Advantages of the method are that it is label-free and in solution.

Stopped Flow Fluorimetry Stopped flow fluorimetry is a method for studying the kinetics of protein-protein interactions. From this type of experiment the association rate con-stant (kon) and dissociation rate constant (koff) of the binding reaction can be determined.

Page 21: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

21

The experimental system involves rapid mixing of two interaction partners in a cell where monochromatic light is passed through (Fig 9A). To monitor the binding event in the cell, one of the interaction partners requires a fluo-rescent probe that is affected by the interaction. The monochromatic light excites the fluorescent probe and when binding occurs will give rise to an increase/decrease of the emitted light. This results in a plot of the emitted light over a set time period (Fig 9B). In the case of a two-state mechanism, the trace in the plot can be fitted to a single exponential equation, from which the observed rate constant can be extracted:

Fluorescence emission where the fluorescence emission corresponds to the concentration of protein complex over time, a is the fraction of A bound to B, the observed rate con-stant, kobs, which is related to the slope of the curve and C, the fluorescence emission signal at equilibrium.

Figure 9. A) The experimental setup is two injection syringes with one binding partner in each. The drive ram pushes the binding partners in to a mixer where mon-ochromatic light excites the sample, at the same time the fluorescence emission is recorded. B) The fluorescence emission is measured over time and fitted to an expo-nential function to extract the observed rate constant of the binding reaction. C) The observed rate constant is plotted against varying concentrations of one binding part-ner where the association rate constant and the dissociation rate constant can be estimated by fitting.

The association rate constant is determined by the slope of the observed rate constants at different concentration of B. If the experiment is done under true pseudo-first order conditions, the data can be fitted to a straight line:

Page 22: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

22

However, the conditions for pseudo-first order is not satisfying at low con-centrations of B, which often is the case in experimental setups. Instead, the data can be fitted to an equation for a reversible bimolecular association (95) (Fig 9C).

The association and dissociation rate constants can be extracted from the equation by curve fitting. If the dissociation rate constant is close to zero, separate displacement experiments can be conducted. Displacement experi-ments are performed by mixing a preformed complex (AB) with high con-centration of a displacer (C) that has affinity to A. This will result in a com-petitive binding to A, in which C will displace B and a AC complex will be formed.

At high concentrations of C, the association back to AB is negligible and the resulting kobs value is therefore approximately the dissociation of the AB complex. A displacement reaction will result in an observed rate constant that equals:

where the Kd represents the equilibrium dissociation constant for the AC complex.

Experimental Analysis of Protein Secondary Structure Proteins are built up by polymers of amino acids with different characteris-tics. The order of the amino acids in a protein is referred to as the primary structure. Proteins can fold into α-helices and β-sheets defined by different hydrogen bonding, called secondary structure. The organisation of the sec-ondary structure elements within a protein is driven by the hydrophobic ef-fect, termed tertiary structure. Determination of protein structure and how perturbations affect the structure can give information about the binding mechanism and stability of the protein.

Page 23: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

23

Circular Dichroism Circular dichroism spectroscopy (CD) is a technique to study protein sec-ondary structure and how it changes upon various perturbations. This meth-od utilizes the chiral properties of amino acids and their difference in absorp-tion of left and right polarized light in the far-UV region (Fig 10A). A poly-peptide induces an overall chirality that will result in a characteristic shape depending on the secondary structural features of a protein (Fig 10B). The signal from a CD experiment is presented in millidegrees (mdeg), which is the difference in absorbance of left and right-handed polarized light. This unit is often converted to molar ellipticity (deg cm2 dmol-1) that corrects for the protein concentration and the path length of the cuvette. Circular dichro-ism can estimate the overall secondary structure of a protein and how it changes upon different perturbations for example temperature, denaturant or helix-inducing agent. Perturbations of a protein will result in a transition of the native to a non-native state. Experiments with increasing perturbation can be used to determine the midpoint of unfolding/folding. (Fig 10C). The observed signal will be a mix of the native (N) and denatured (D) state:

Observed signal

where f is the fraction that is occupied by the state and s is the signal in that state. At equilibrium, the native and denatured state are occupied to the same extent and the equation can be rewritten in terms of thermodynamic parame-ters:

Observed signal =

where P is the degree of perturbation, αN/D is the signal from the state and βN/D is the rate of intrinsic change in the signal from the state with increasing P; .

For a chemical perturbation, the free energy of unfolding equals:

where m is the slope of the transition and [P]50% is the concentration of the perturbing agent at the transition midpoint.

For a thermal denaturation, the free energy of unfolding equals:

Page 24: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

24

where T is the temperature, Tm the midpoint of thermal denaturation, ΔHm the enthalpy of denaturation at the transition midpoint and ΔCp the change of heat capacity of denaturation.

Figure 10. A) Circular polarized light is passed through a protein sample where the light is absorbed to different extent at different wavelengths. B) The ellipticity is recorded over a set wavelength or wavelength interval, the curve gives a characteris-tic shape depending on the secondary structure: α-helix (blue), β-sheet (green) and random coil (red). C) The perturbation, for example urea, at a specific wavelength is plotted against fraction unfolded and fitted to extract the midpoint of unfolding.

Page 25: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

25

Results

The Evolution of the NCBD-CID Interaction Paper I describe the ancestral reconstruction, biophysical characterization and molecular simulation to shed light on the evolution of the interaction between NCBD and CID. NCBD was found to be present in deuterostome and protostome species, deuterostome species include vertebrates, starfish and acorn worms while protostome species are invertebrates like snails, worms and insects. CID appeared later in evolution and is only present in deuterostome species. Both proteins existed when the two whole genome duplications occurred and resulted in four copies of each protein. The CREBBP/p300 protein family retained two copies: CREBBP and p300 while the NcoA protein family preserved three copies, named NcoA1, NcoA2 and NcoA3. In the fish lineage, the third genome duplication produced two addi-tional copies of the protein which are conserved in both the CREBBP/p300 and NcoA1-3 protein families (Fig 11).

In this study, we resurrected sequences in different nodes of the phylogenetic tree. The resurrected sequences represent the ancestral state. For NCBD the following ancestral sequences where resurrected: The ancestor in the node where deuterostomes diverge from protostomes (D/P), the ancestor in the node for the WGD (denoted 1R/2R) and the ancestor of fish and tetrapods (Fish/Tetrapods) in the CREBBP lineage. Furthermore, NCBD domains from extant species were used in the analysis, namely those from human CREBBP and p300, a fruit fly (Drosophila melanogaster), a fish in the CREBBP lineage (Danio rerio) and a lamprey (Petromyzon marinus). For CID, we resurrected the ancestor in the node of the first and second WGD (denoted 1R and 2R) and the ancestor of fish/tetrapod in the NcoA3 lineage. The CID domains from human NcoA1, NcoA2 and NcoA3 were also in-cluded in the analysis. For the resurrected sequences, positions with a low probability were tested as alternative ancestral states.

Page 26: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

26

Figure 11. Schematic species tree representing the presence of the NCBD (blue) and CID (yellow) in different phyla. The resurrected ancestral states are named and illustrated as circles in the nodes (Adapted from figure 2 in Paper I).

The affinity of all contemporary complexes including alterative ancestral sequences of NCBD and CID was measured with isothermal titration calo-rimetry. At the time point of the split of deuterostome and protostome spe-cies as well as in fruit fly, the CID domain did not exist. In the most ancient complex, the interaction between NCBD D/P and CID 1R, the equilibrium dissociation constant was estimated to 1.5-18 µM. In fruit fly, similar affini-ties are observed to different CID variants. This reveals that NCBD and CID are able to interact with each other, even though they are not co-existing in the same proteome. At the time point of the two whole genome duplications, 1R and 2R, NCBD has recruited CID as a binding partner and the affinity is increased around 25-fold. The strength of the interaction is preserved in the ancestor of fish/tetrapod as well as in the extant species of human, fish and lamprey (Fig 12A). This suggests a fast adaptation and a retained affinity for the NCBD-CID complex in evolution.

The NCBD variants where examined for their secondary structure with cir-cular dichroism. The examination showed that all NCBD variants displayed a very similar shape of a helical spectrum and indicate that a similar helical content has been present during evolution. To further test the secondary

Page 27: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

27

structure stability of the NCBD domains, experiments with increasing tem-perature and urea concentration were conducted. For a globular protein, one expects data points following a sigmoidal curve where the midpoint of the slope corresponds to the equilibrium of the native and denatured state. The temperature stability measurements did not show a sharp transition for any variant. This makes it difficult to determine the thermal midpoint for dena-turation. The titration of urea indicates that the NCBD D/P has a slightly lower stability compared to the evolutionary younger NCBD 1R/2R, CREBBP fish/tetrapod and extant variants. However, the curves display a very broad transition (Fig 12B). The difficulty of determining the stability of NCBD is probably because of its small hydrophobic core and dynamic prop-erties.

The CID domains were also subjected to circular dichroism experiments. The spectra display a random coil with a transient helical structure for all variants. Furthermore, examination of the propensity to adopt a secondary structure was performed. To induce a coil-helix transition in the CID vari-ants, different concentrations of 2,2,2-Trifluoroethanol (TFE) was added. The TFE titration displayed no difference in helical tendency for the CID variants (Fig 12C). This imply that the disordered characteristics of CID is preserved from the ancestor in 1R to extant species.

Figure 12. A) The relative affinity between NCBD and CID variants, against the interaction of human p300 NCBD and NcoA2 CID. B) Urea denaturation of NCBD variants. C) TFE titration of CID variants. (Adapted from figure 5 in Paper I)

To further investigate the dynamic properties in the evolution between NCBD and CID bias-exchange metadynamics simulations were performed. The simulations are based on chemical shift determined by NMR. Three complexes were chosen: The most ancient complex (D/P NCBD - 1R CID), the 1R/2R complex (1R/2R NCBD - 1R CID) and one extant complex (the human CREBBP NCBD - NcoA3 CID). The free energy as a function of the fraction helix and the Radius of gyration was calculated and analyzed. The

0

0.2

0.4

0.6

0.8

1

D/P

1R/2

R

1R/2

R

Fish

/tetra

pod

Hsa

CR

EB

BP

Hsa

CR

EB

BP

Hsa

CR

EB

BP

Hsa

p300

Hsa

p300

Hsa

p300

Dre

CR

EB

BP

Pm

a

Dm

el

Rel

ativ

e af

finity

of N

CB

D/C

ID in

tera

ctio

ns

1R 2R Fish

/tetra

pod

1R Hsa

NC

OA

1

Hsa

NC

OA

2

Hsa

NC

OA

3

Hsa

NC

OA

1

Hsa

NC

OA

2

Hsa

NC

OA

3

Hsa

NC

OA

1

Hsa

NC

OA

1

Hsa

NC

OA

1

NCBD variant

CID variant

A CB

0

0.2

0.4

0.6

0.8

1

1.2

0 1 2 3 4 5 6 7 8

D/PD/P T2062I1R/2RFish/tetrapodHsa CREBBPHsa p300DrePmaDmel

Nor

mal

ized

mol

ar e

llipt

icity

at 2

22 n

m

[Urea] (M)

0

0.2

0.4

0.6

0.8

1

1.2

0 10 20 30 40 50

1R2RFish/TetrapodHsa NCOA1Hsa NCOA2Hsa NCOA3

Fra

ctio

n he

lix

[TFE] (% v/v)

Page 28: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

28

most ancient complex displays a higher degree of conformational heteroge-neity than the younger 1R/2R complex and the heterogeneity is further re-duced in the extant complex. Additional measurements of the root mean square fluctuation display decreasing fluctuations in the younger complexes compared to the more ancient one. To examine the number of interactions the complexes are able to make an interface contact analysis was performed. The number of contacts between ancient and extant complexes demonstrate a reduced number of contacts with a higher strength in the extant complex compared to the ancient complexes where many contacts are made but with less strength.

In conclusion, the molecular simulations and the affinity measurements sug-gest that a higher affinity is correlated with a reduced conformational heter-ogeneity. The most ancient complex has more contacts with less strength and over time the optimization of the binding has led to fewer but stronger inter-actions. The evolution of the interaction has led to a decreased conforma-tional heterogeneity and a higher affinity.

The Evolution of the p53-MDM2 Interaction In paper II, the evolution of the p53/p63/p73 and MDM protein families with focus on the interaction domains was examined, that is, the transactivation domain (TAD) and the p53/p63/p73-binding domain (p53/p63/p73-BD). The p53/p63/p73 family comprise three proteins, p53, p63 and p73 while the MDM family include MDM2 and MDM4. The emergence of the proteins within the families are a result of whole genome duplication events that took place in an early vertebrate. In the third WGD event in the fish lineage, the second copy was lost for all paralogues of p53 and MDM2. Before the du-plication only one copy was present, denoted p53/p63/p73 and MDM. The p53/p63/p73 TAD is consistently detected in deuterostome species within all paralogs. In protostome species, distinct phyla have lost the TAD. The TAD is preserved in the phyla comprising annelids (ringworms), mollusks (snails and mussels) and in the arthropod subphyla myriapods (centipedes) and che-licerates (spiders and horseshow crabs). In the arthropod subphyla crusta-ceans (crabs and cray fish) and hexapods (insects and flies), and nematode (roundworms) a p53/p63/p73 protein is present. However, TAD is missing but the DNA-binding domain and the oligomerization domain is conserved. Interestingly, the MDM p53/p63/p73-BD follows the same pattern as p53/p63/p73 TAD and is found in the same phyla within deuterostomes and protostomes. In a similar manner to p53/p63/p73, the MDM protein can be found in a few arthropod species, where the acidic, zinc-binding and RING domain is conserved. However, the entire protein is completely lost in nema-todes (Fig 13).

Page 29: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

29

Figure 13. Schematic species tree displaying the presence of the TAD (red) and p53/p63/p73-BD (blue). Grey lines indicate no presence of TAD and p53/p63/p73-BD (Adapted from figure 2 in Paper II).

In cnidarian species, the interaction domain of both p53/p63/p73 and MDM cannot be found, and in sponges the p53/p63/p73 and MDM proteins are completely missing. Nonetheless, the complete p53/p63/p73 and MDM are present in the multicellular organism Trichoplax adhaerens. This suggests that the interaction between TAD and p53/p63/p73-BD dates back to the beginning of animal life and have since then tightly co-evolved. The results show that species in distinct phyla have either both interaction domains con-served or both have disappeared. After this realization, phylogenetic trees were calculated, only including species with a conserved interacting domain. The resulting trees of the p53/p63/p73 and MDM protein families both fol-low the species evolution. In the case where all sequences with similarity to p53/p63/p73 and MDM were used, the phylogenetic tree did not show an accordance with the evolution of species.

Page 30: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

30

In paper III, ancestral reconstruction and biophysical measurements were used to examine the affinity between the intrinsically disordered transactiva-tion domain of p53 and the p53-binding domain of MDM2 at different time points in evolution. Due to high sequence divergence in the MDM family, reconstruction was only conducted in the vertebrate lineage of p53 and MDM2. The p53-MDM2 complex was examined at five different evolution-ary time points: In the ancestor of fish and tetrapods, in the ancestor of rep-tiles/birds and mammals and in the present-day species Danio rerio (zebrafish), Gallus gallus (chicken) and Homo sapiens (human).

The affinity of the contemporary complexes was calculated from the associa-tion and dissociation rate constants determined by stopped flow fluorimetry (Fig 14). The binding affinity in the ancestor of fish and tetrapods was measured to 70-200 nM. Following the evolution in the tetrapod lineage the affinity was preserved at 170-200 nM in the ancestor of reptiles/birds and mammals. The p53-MDM2 complex in the chicken Gallus gallus displayed an affinity of 80 nM, which is a 2-fold increase compared to the rep-tiles/birds and mammals ancestor. In humans, a 10-fold increase in affinity can be measured as compared to the ancestor. However, in the fish lineage represented by the zebrafish Danio rerio, the affinity was too high to be measured, with a dissociation rate constant of approximately 500 s-1. This stands in large contrast to the contemporary complexes in the ancestor of fish and tetrapod and in chicken and human, where the dissociation constant is below 10 s-1. Measurements of MDM2 from Homo sapiens and the ances-tor of fish/tetrapods against p53 from Danio rerio displayed a similar disso-ciation rate constant as the contemporary complex in Danio rerio. On the other hand, the MDM2 from Danio rerio against p53 from Homo sapiens and the ancestor of fish/tetrapods have affinities of 90 nM.

Page 31: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

31

Figure 14. Schematic species tree of vertebrates with binding affinity of contempo-rary complexes in blue font. The whole genome duplications are denoted 1R and 2R (Figure 2 in Paper III).

The p53 sequence in Danio rerio differs in two ways from the other p53 peptides in this study. First, there is an insertion of an asparagine residue, resulting in a loss of a leucine in the FxxxWxxL motif. Second, there is a deletion of a threonine located just before the motif. To explore the cause of the decreased affinity, three additional Danio rerio p53 peptides where test-ed. One peptide extends the sequence one amino acid and therefore includes the leucine in the motif (Danio rerio longer), this resulted in an affinity of 1.8 μM. The other two peptides were modified to mimic the consensus se-quence of the other peptides present in the study, one with a substitution of the asparagine to a leucine (350 nM) and the other with an insert of a threo-nine (2.3 μM) (Fig 15). All the modified peptides increased the affinity.

Figure 15. Alignment of the four Danio rerio peptides tested and their affinity to MDM2 in Danio rerio.

To shed further light on the evolutionary trajectory, the paralogs p53, p63 and p73 were examined for their affinity against MDM2. These experiments were performed with a 15 residues p53 peptide, instead of the 12 residues peptide used for the ancestral and extant p53 variants. The reason for this was to facilitate comparison with previously published affinities in this ex-

Page 32: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

32

perimental setup (68, 74, 96, 97). The affinity of the human MDM2 against human p53 was recorded to 100 nM. The discrepancy in affinity of the 12 and 15 residues p53 peptide has been proposed to be a result of the N-terminal flexible lid in MDM2 which occupies the binding pocket in the unbound state of MDM2 (69). The affinity of p73 was measured to 300 nM, while p63 has a much lower affinity of 6.1 μM. This is a good example of how gene duplication result in functional divergence.

In paper IV, the human paralogues, p53, p63 and p73, were further examined for their ionic dependence in the interaction with human MDM2. Kinetic experiments at varying ionic strengths were performed in order to estimate the basal association rate constant, which is in the absence of electrostatic interactions. In contrast to other IDPs, the results displayed a small ionic dependence. The basal association rate constant for p53 and p73 was esti-mated around 2-5 106 M-1s-1, and for p63 it is slightly higher, 13 106 M-1s-1. These values are, however, in accordance with other IDP interactions.

Page 33: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

33

Concluding Remarks and Future Perspectives

The content of this thesis has contributed with an evolutionary perspective of disordered regions and their interactions. The evolution of IDPs are com-monly examined by studying sequence data from present-day species. The development of prediction tools for disordered regions have given plenty of useful information about the conservation of disorder in the different do-mains of life (30) as well as within protein families (98). Our approach to study the evolution of disorder involves ancestral sequence reconstruction, and may give additional information of the time point in evolution important changes has altered a function. Ancestral sequence reconstruction has been utilized to investigate the evolution of structured proteins and has been able to identify several important shifts in function. For example, how promiscu-ous ancestral enzyme evolved to be selective to certain substrates (99) and the evolution of multimeric complexes (100) as well as the origin of feathers in dinosaurs (101). The fast evolution and high sequence divergence of some disordered regions may not be suitable for ancestral reconstruction. Howev-er, when possible it can give useful information of what the past looked like. In this thesis, ancestral disorder has been resurrected and biophysically char-acterized. The evolution of the NCBD-CID complex is an example of how an interaction between IDPs emerged and gained a novel function in tran-scriptional co-regulation. In this case, the disordered features in NCBD and CID may have facilitated the gain of function. We showed that the disor-dered properties in the individual domains are preserved during evolution, however the complex has evolved to be more rigid resulting in a higher af-finity. Molecular simulations indicated that a few residues are responsible for the increase in affinity, therefore ancestral mutations were introduced in human NCBD. However, these results did not remarkably affect the affinity, suggesting that there is more to it than site-specific contacts for the interac-tion to be altered. This may be the case of several IDP interactions. In the evolution of p53 and MDM2, the interaction domains have been pre-served in all deuterostomes, but only in specific phyla in protostomes spe-cies. Additionally, the ancestral reconstruction of MDM2 resulted in many positions with low probability further back in evolution than the vertebrate lineage. This suggests a high substitution rate in species that diverged from the ancestor of deuterostome and protostome, where species that preserved

Page 34: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

34

the domain have gained a beneficial function, and, on the contrary, the spe-cies that lost the domain acquired deleterious mutations abolishing the inter-action. For p53, similar assumptions can be made, the 12-residue peptide used in the study is well conserved, however the adjacent regions differ sub-stantially in sequence in extant species. A mutation of the phenylalanine or the tryptophan in the binding motif has been shown to abolish binding, and if such a mutation occurred during evolution it may have led to the disappear-ance of the whole domain. Within the vertebrate lineages, differences in affinity and salt dependence can be explained by functional diversification.Affinities that slightly differ in tetrapod species can be explained in the same manner. However, in the p53 vertebrate lineage the zebrafish has a surpris-ingly lower affinity compared to contemporary ancestral and extant verte-brate complexes. There are several fish species closely related to the zebra fish with analogous sequence characteristics (Paper III, Suppl. Fig 2). This implies that a large fraction of all fish species has a reduced affinity between p53 and MDM2. If this change in affinity is true, there must either be an altered expression, regulation or other consequence for the cell. Although, studies of zebrafish reveal that p53 and MDM2 have a similar function and are a part of the same pathways (102, 103). The third whole genome duplica-tion in the fish lineage gave additional opportunities for novel function and expansion of species divergence. However, no copies from the p53 or MDM2 family are preserved. On the other hand, copies of downstream tar-gets may have been preserved and may have altered function or expression. In our studies, the main focus has been to examine the affinity of protein-protein interaction involving disorder. Affinity is a quantitative measurement and can give useful information in order to elucidate binding mechanisms, for example when comparing wildtype protein with mutant or ancestral pro-tein with extant variants. IDPs have important functions in regulatory path-ways, and therefore the maintenance of their structure/disorder-function relationship is crucial. It has been proposed that IDPs bind with high speci-ficity and low affinity to account for their regulatory role. Fast association as well as dissociation of a complex is a way of turning on and off a function in a quick manner. However, it is important to put affinity and the interaction in a cellular context. How high are the expression levels, and in which subcel-lular location and environment does the protein-protein interaction occur? To get the whole picture, interdisciplinary efforts combining bioinformatics, biophysical characterization and cell studies are needed. One good example is a study that examined the interplay between disorder and affinity of p53 and MDM2, and how it affected expression levels and functional differences in human cell lines (72). The results show that increasing residual helicity in p53 leads to higher affinity to MDM2 as well as altered dynamics. In a cellu-lar context, this resulted in reduced expression of target genes and loss of the ability to induce cell cycle arrest upon DNA damage.

Page 35: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

35

Populärvetenskaplig sammanfattning

På samma sätt som man kan lära sig om nutiden genom att studera historien, kan man lära sig om proteiners funktion i kroppen genom att studera evolut-ionen. I min forskning har jag undersökt hur ostrukturerade proteiner har förändrats under evolutionen och hur de interagerar med andra proteiner i kroppen. Studien har genomförts genom att återuppliva en proteindomän som existerade för hundratals miljoner år sedan för att sedan jämföra den med nutida varianter. Fokus har varit på att undersöka hur starkt två ostruk-turerade proteindomäner binder till varandra och följa hur de interagerat genom evolutionen. Genom att studera ostrukturerade proteiner och dess evolution kan man lära sig om hur viktiga processer i kroppen fungerar i detalj, vilket i längden kan leda till upptäckter av nya läkemedel för att bota sjukdomar. Proteiner är kroppens arbetare och utför en mängd olika uppgifter som är livsviktiga för att kroppen ska fungera. Proteiner består av aminosyror som är ordnade i en unik följd och sammansättning. Ett protein innehåller oftast flera olika funktionella enheter som kallas för domäner, de är vanligtvis an-svariga för en specifik funktion i proteinet. Aminosyrorna i ett protein kan interagera med varandra och veckas till en specifik struktur som är förknip-pad med den funktion proteinet utför. En tredjedel av alla mänskliga protei-ner veckas inte till någon specifik struktur, vilket innebär att de är ostruktu-rerade och mer flexibla än de strukturerade. Man trodde länge att proteiner som saknade struktur, ostrukturerade proteiner, inte hade någon funktion. Forskning har dock visat att ostrukturerade proteiner är betydelsefulla och utför en mängd viktiga funktioner i cellen som har med signalering och re-glering att göra. Ostrukturerade proteiner har alltså en viktig funktion i krop-pen för de reglerar andra proteiner och kan på detta sätt bestämma vilka funktioner som ska utföras i cellen. Syftet med denna avhandling är att undersöka hur evolutionen av ostrukture-rade proteiner och deras bindningspartners har sett ut. För att studera hur proteiner har utvecklats kan genomet hos arter som lever idag studeras. Ge-nomet består av kromosomer och kan delas upp i olika gener. En gen är en mall för ett proteins sammansättning. När proteiner med nya funktioner ut-vecklas så kopieras en gen i genomet, denna kan ses som en genetisk säker-hetskopia. Två kopior av samma gen ger utrymme för förbättringar eller helt

Page 36: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

36

nya funktioner hos den ena kopian utan att förstöra den ursprungliga funkt-ionen i den andra kopian. Genom att undersöka evolutionen av en gen eller ett protein kan man kartlägga i vilka nutida arter de existerar. Utifrån kart-läggningen skapas fylogenetiska träd som kan liknas vid släktträd över pro-teiner i olika arter. Ett fylogenetiskt träd baseras på en jämförelse av ami-nosyrasekvensen hos protein i olika arter. Känner man till ett proteins ami-nosyrasekvens kan man tillverka det i ett laboratorium och på så sätt åter-uppliva protein som inte längre existerar. Återupplivning är möjligt med hjälp av ett fylogenetiskt träd över nutida arter och deras aminosyrasekvens. I det första projektet undersöktes de ostrukturerade proteindomänerna NCBD och CID. Dessa domäner finns i proteiner som aktiverar tillverkning-en av en mängd olika andra protein som är viktiga för en normal utveckling av foster. En kartläggning av vilka arter ovannämnda proteindomäner finns i visade att NCBD förekommer både i ryggradsdjur och i ryggradslösa djur, medan CID bara finns i ryggradsdjur. Ett fylogenetiskt träd skapades och proteindomäner vid olika tidpunkter återupplivades. Experiment genomför-des sedan för att bestämma bindningsstyrkan mellan NCBD och CID vid de olika tidpunkterna samt hur strukturen och flexibiliteten förändrats i protei-net. Resultaten visade att NCBD från förfadern till ryggradsdjur och rygg-radslösa djur är kapabel att binda till CID, trots att CID inte existerade förrän långt senare. Strukturen av NCBD och CID är väldigt flexibel och består av många men svaga interaktioner. Sedan NCBD och CID började samexistera, har bindningsstyrkan blivit 25 gånger starkare och strukturen blivit mer stel. Denna trend fortsätter i de mänskliga kopiorna av domänerna och en ännu högre bindningsstyrka som beror på färre men starkare interaktioner. Sam-manfattningsvis har vi visat hur interaktionen mellan NCBD och CID upp-stod och att hur bindningsstyrkan ökat genom förändringar av vissa ami-nosyror som har stabiliserat strukturen i det bundna tillståndet. I det andra projektet studerades proteinerna p53 och MDM2. Proteinet p53 kan motverka spridning av cancer genom att aktivera tillverkningen av andra protein. Aktiviteten hos p53 i en cell regleras av ett annat protein som kallas MDM2. I friska celler finns inget behov av p53 och därför hämmas dess funktion av MDM2. I cancerceller aktiveras p53 och förhindrar på så sätt spridning av cancerceller. Dock har forskning visat att i ungefär hälften av alla cancertumörer har p53 muterats och blivit defekt, detta gör att cancercel-ler kan överleva och föröka sig. På grund utav relationen mellan p53 och andra protein är det viktigt att undersöka proteinet p53 och försöka hitta läkemedel som kan återställa proteinets funktion. Proteinet p53 innehåller en domän som är ostrukturerad och är därför intressant att undersöka ur ett evo-lutionärt och biofysikaliskt perspektiv.

Page 37: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

37

Evolutionen av proteinerna p53 och MDM2 undersöktes genom att kartlägga i vilka arter de finns i idag. Denna studie visade att domänerna som binder till varandra i p53 och MDM2 har försvunnit från vissa grupper av arter. I ryggradslösa djur har proteindomänerna försvunnit i alla rundmaskar och i de flesta leddjur men är bevarade i ringmaskar och blötdjur. I ryggradsdjur finns dock både p53 och MDM2 bevarade i alla artgrupper. Efter denna kart-läggning gjordes fylogenetiska träd över p53 och MDM2 där endast de arter som har interaktionsdomänen bevarad inkluderades. I kontrast med andra fylogenetiska träd för p53 och MDM2 följer vårt träd evolutionen av hur arter har utvecklats. Detta betyder att p53 och MDM2 har utvecklats till-sammans och tyder på att de fyller en viktig funktion i de arter som protein-domänerna finns kvar i. Proteinerna p53 och MDM2 återupplivades sedan i ryggradsdjur. Resultaten visade att bindningsstyrkan för förfadern till fiskar och landryggradsdjur är relativt stark och att i fågel och människa är bindningsstyrkan ännu starkare. Dock har interaktionen i fisk försvagats och binder 15 gånger sämre än för-fadern till fiskar och landryggradsdjur. Resultaten ovan kan tyda på att p53 och MDM2 i fiskar inte har en lika viktig uppgift som i andra ryggradsdjur eller att denna uppgift utförs av andra proteiner i fiskar. Utöver interaktionen mellan p53 och MDM2 genomfördes även en studie på hur de två besläktade mänskliga proteinerna p63 och p73 binder till MDM2. Resultaten visade att p53 och p73 binder ungefär lika starkt till MDM2 men att p63 har betydligt sämre bindningsstyrka. De olika bindningsstyrkorna är ett exempel på hur kopior av gener kan utvecklas och få nya funktioner. I jämförelse med struk-turerade proteiner, består ostrukturerade proteiner i mycket större utsträck-ning av aminosyror som har positiv och negativ laddning. På grund utav dessa laddningar tror man att ostrukturerade proteiner kan förändra sin bind-ningsstyrka beroende på mängden laddade partiklar, joner, som finns i mil-jön runtomkring. Vi genomförde därför en studie för att undersöka bind-ningsstyrkan i lösningar med olika jonstyrka. Mina resultat indikerar dock att bindningsstyrkan inte förändrades särskilt mycket för varken p53, p63, och p73 i interaktionen med MDM2. Detta innebär att det är viktigt att undersöka varje interaktion som involverar ostrukturerade proteiner och inte göra för stora generaliseringar för en grupp av proteiner.

Page 38: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

38

Acknowledgements

In this section I would like to express my gratitude with a rhyme, to those who facilitated the work in this thesis and my conversion to a doctor, just like an enzyme. First and foremost, to my main supervisor Per Jemth: Thanks for making me a part, of a group that have biochemistry in their heart. I’ve been given a good mixture of trust, support and outsmart, in order to improve my skills in the protein science art. It’s soon time for me to de-part, with a lot of new knowledge in my cart. To my second supervisor Greta Hultqvist: You taught me all about phylogenetic trees, and helped me gain bioinformatic expertise, good luck with your research on Alzheimer’s dis-ease. To my second supervisor Maria Selmer and my examiners Pernilla Bjerling and Anna-Karin Olsson: It was always nice to know, I could come talk to you if I wanted advice or if someone stepped on my toe, thanks for being around for a friendly hello. To Eva Andersson: You have helped me more than twice, thanks for being super nice, you are as cool as ice. To co-authors of papers: Working with you has been a true delight, I hope your future is bright and you continue writing papers I can cite. To all the people at the IMBIM department I met, both you with and without a pipette. My time in teaching must be addressed, I had loads of fun working with you to aid students on their quest. The administration does nothing but impress, what you do is far better than a beta test. To the iPha association and all the rest, for sharing a beer and telling a good jest. I must have been blessed, because my time at IMBIM has been the best. To former and present mem-bers of the lab, without you my time as a PhD student wouldn’t been as fab. You have always given me a happy face, which been helpful after a bad stopped flow trace. To all my old and new friends: Hanging out with you is as great as space, just like playing card with a handful of ace. I like you guys very much, hope we will keep staying in touch. To Erika Melin: For being awesome and a perk, in the making of the cover artwork. To my family: You have always given the support I need, with you by my side I cannot do any-thing but succeed. Extra thanks to my sister for the proofread, I can always count on you guaranteed. A last big hug to all who contributed to my new degree, this would not have been possible without you and me.

Page 39: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

39

References

1. Ohno S (1970) Evolution by Gene Duplication (Springer, Berlin). 2. Lynch M, Conery JS (2003) The evolutionary demography of duplicate genes.

J Struct Funct Genomics 3(1–4):35–44. 3. Force A, et al. (1999) Preservation of duplicate genes by complementary,

degenerative mutations. Genetics 151(4):1531–1545. 4. Rastogi S, Liberles DA (2005) Subfunctionalization of duplicated genes as a

transition state to neofunctionalization. BMC Evol Biol 5(28). 5. Del Pozo JC, Ramirez-Parra E (2015) Whole genome duplications in plants: An

overview from Arabidopsis. J Exp Bot 66(22):6991–7003. 6. McLysaght A, Hokamp K, Wolfe KH (2002) Extensive genomic duplication

during early chordate evolution. Nat Genet 31(2):200–204. 7. Dehal P, Boore JL (2005) Two rounds of whole genome duplication in the

ancestral vertebrate. PLoS Biol 3(10):e314. 8. Nakatani Y, Takeda H, Kohara Y, Morishita S (2007) Reconstruction of the

vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates. Genome Res 17(9):1254–1265.

9. Kuraku S, Meyer A, Kuratani S (2009) Timing of genome duplications relative to the origin of the vertebrates: Did cyclostomes diverge before or after? Mol Biol Evol 26(1):47–59.

10. Kuraku S (2013) Impact of asymmetric gene repertoire between cyclostomes and gnathostomes. Semin Cell Dev Biol 24(2):119–127.

11. Smith JJ, Keinath MC (2015) The sea lamprey meiotic map improves resolution of ancient vertebrate genome duplications. Genome Res 25(8):1081–1090.

12. Jaillon O, et al. (2004) Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature 431(7011):946–957.

13. Meyer A, Van De Peer Y (2005) From 2R to 3R: Evidence for a fish-specific genome duplication (FSGD). BioEssays 27(9):937–945.

14. Otto SP (2007) The evolutionary consequences of polyploidy. Cell 131(3):452–462.

15. Hughes T, Liberles DA (2008) Whole-genome duplications in the ancestral vertebrate are detectable in the distribution of gene family sizes of tetrapod species. J Mol Evol 67(4):343–357.

16. Van De Peer Y, Maere S, Meyer A (2009) The evolutionary significance of ancient genome duplications. Nat Rev Genet 10(10):725–732.

17. Cañestro C, Albalat R, Irimia M, Garcia-Fernàndez J (2013) Impact of gene gains, losses and duplication modes on the origin and diversification of vertebrates. Semin Cell Dev Biol 24(2):83–94.

18. Wright PE, Dyson HJ (1999) Intrinsically unstructured proteins: Re-assessing the protein structure-function paradigm. J Mol Biol 293(2):321–331.

Page 40: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

40

19. Uversky VN, Gillespie JR, Fink AL (2000) Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins Struct Funct Genet 41(3):415–427.

20. Williams RM, et al. (2001) The protein non-folding problem: Amino acid determinants of intrinsic order and disorder. Pacific Symp Biocomput 6:89–100.

21. Flock T, Weatheritt RJ, Latysheva NS, Babu MM (2014) Controlling entropy to tune the functions of intrinsically disordered regions. Curr Opin Struct Biol 26(1):62–72.

22. Tompa P, Davey NE, Gibson TJ, Babu MM (2014) A Million peptide motifs for the molecular biologist. Mol Cell 55(2):161–169.

23. Wright PE, Dyson HJ (2009) Linking folding and binding. Curr Opin Struct Biol 19(1):31–38.

24. Hsu WL, et al. (2013) Exploring the binding diversity of intrinsically disordered proteins involved in one-to-many binding. Protein Sci 22(3):258–273.

25. Iakoucheva LM, Brown CJ, Lawson JD, Obradović Z, Dunker AK (2002) Intrinsic disorder in cell-signaling and cancer-associated proteins. J Mol Biol 323(3):573–584.

26. Staby L, et al. (2017) Eukaryotic transcription factors: Paradigms of protein intrinsic disorder. Biochem J 474(15):2509–2532.

27. Iakoucheva LM, et al. (2004) The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res 32(3):1037–1049.

28. Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT (2004) Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 337(3):635–645.

29. Pancsa R, Tompa P (2012) Structural disorder in eukaryotes. PLoS One 7(4):e34687.

30. Oates ME, et al. (2013) D2P2: Database of disordered protein predictions. Nucleic Acids Res 41(D1):508–516.

31. Romero PR, et al. (2006) Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms. Proc Natl Acad Sci USA 103(22):8390–8395.

32. Light S, Sagit R, Sachenkova O, Ekman D, Elofsson A (2013) Protein expansion is primarily due to indels in intrinsically disordered regions. Mol Biol Evol 30(12):2645–2653.

33. Buljan M, Frankish A, Bateman A (2010) Quantifying the mechanisms of domain gain in animal proteins. Genome Biol 11(7):R74.

34. Montanari F, Shields DC, Khaldi N (2011) Differences in the number of intrinsically disordered regions between yeast duplicated proteins, and their relationship with functional divergence. PLoS One 6(9):e24989.

35. Amoutzias GD, et al. (2010) Posttranslational regulation impacts the fate of duplicated genes. Proc Natl Acad Sci USA 107(7):2967–2971.

36. Vavouri T, Semple JI, Garcia-Verdugo R, Lehner B (2009) Intrinsic protein disorder and interaction promiscuity are widely associated with dosage sensitivity. Cell 138(1):198–208.

37. Moesa HA, Wakabayashi S, Nakai K, Patil A (2012) Chemical composition is maintained in poorly conserved intrinsically disordered regions and suggests a means for their classification. Mol Biosyst 8(12):3262.

38. Brown CJ, et al. (2002) Evolutionary rate heterogeneity in proteins with long disordered regions. J Mol Evol 55(1):104–110.

Page 41: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

41

39. Chen JW, Romero P, Uversky VN, Keith A (2006) Conservation of intrinsic disorder in protein domains and families: II. Functions of conserved disorder. J Proteome Res 5(4):888–898.

40. Brown CJ, Johnson AK, Daughdrill GW (2010) Comparing Models of Evolution for Ordered and Disordered Proteins. Mol Biol Evol 27(3):609–621.

41. Bellay J, et al. (2011) Bringing order to protein disorder through comparative genomics and genetic interactions. Genome Biol 12(2):R14.

42. Ahrens J, Dos Santos HG, Siltberg-Liberles J (2016) The nuanced interplay of intrinsic disorder and other structural properties driving protein evolution. Mol Biol Evol 33(9):2248–2256.

43. Schlessinger A, et al. (2011) Protein disorder - a breakthrough invention of evolution? Curr Opin Struct Biol 21(3):412–8.

44. Nguyen Ba AN, et al. (2012) Proteome-wide discovery of evolutionary conserved sequences in disordered regions. Sci Signal 5(215):rs1.

45. Bedford DC, Kasper LH, Fukuyama T, Brindle PK (2010) Target gene context influences the transcriptional requirement for the KAT3 family of CBP and p300 histone acetyltransferases. Epigenetics 5(1):9.15.

46. Yao TP, et al. (1998) Gene dosage-dependent embryonic development and proliferation defects in mice lacking the transcriptional integrator p300. Cell 93(3):361–372.

47. Tanaka Y, et al. (2000) Extensive brain hemorrhage and embryonic lethality in a mouse null mutant of CREB-binding protein. Mech Dev 95(1–2):133–145.

48. Kamei Y, et al. (1996) A CBP integrator complex mediates transcriptional activation and AP-1 inhibition by nuclear receptors. Cell 85(3):403–414.

49. Ogryzko V V., Schiltz RL, Russanova V, Howard BH, Nakatani Y (1996) The transcriptional coactivators p300 and CBP are histone acetyltransferases. Cell 87(5):953–959.

50. Chen H, et al. (1997) Nuclear receptor coactivator ACTR is a novel histone acetyltransferase and forms a multimeric activation complex with P/CAF and CBP/p300. Cell 90(3):569–580.

51. Kjaergaard M, Teilum K, Poulsen FM (2010) Conformational selection in the molten globule state of the nuclear coactivator binding domain of CBP. Proc Natl Acad Sci U S A 107(28):12535–40.

52. Dogan J, Schmidt T, Mu X, Engström Å, Jemth P (2012) Fast association and slow transitions in the interaction between two intrinsically disordered protein domains. J Biol Chem 287(41):34316–24.

53. Waters L, et al. (2006) Structural diversity in p160/CREB-binding protein coactivator complexes. J Biol Chem 281(21):14787–95.

54. Demarest SJ, et al. (2002) Mutual synergistic folding in recruitment of CBP/p300 by p160 nuclear receptor coactivators. Nature 415(6871):549–553.

55. Heery DM, Kalkhoven E, Hoare S, Parker MG (1997) A signature motif in transcriptional co-activators mediates binding to nuclear receptors. Nature 387(6634):733–736.

56. Iešmantavičius V, Dogan J, Jemth P, Teilum K, Kjaergaard M (2014) Helical propensity in an intrinsically disordered protein accelerates ligand binding. Angew Chemie 53(6):1548–1551.

57. Farmer G, et al. (1992) Wild-type p53 activates transcription in vitro. Nature 358(6381):83–86.

58. Sullivan KD, Galbraith MD, Andrysik Z, Espinosa JM (2018) Mechanisms of transcriptional regulation by p53. Cell Death Differ 25(1):133–143.

Page 42: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

42

59. Momand J, Wu H-H, Dasgupta G (2000) MDM2 - Master regulator of the p53 tumor suppressor protein. Gene 242(1):15–29.

60. Brooks CL, Gu W (2006) p53 ubiquitination: Mdm2 and beyond. Mol Cell 21(3):307–315.

61. Pei D, Zhang Y, Zheng J (2012) Regulation of p53: A collaboration between Mdm2 and MdmX. Oncotarget 3(3):228–235.

62. Bunz F, et al. (1998) Requirement for p53 and p21 to sustain G2 arrest after DNA damage. Science 282(5393):1497–1501.

63. Chen J (2016) The cell-cycle arrest and apoptotic functions of p53 in tumor initiation and progression. Cold Spring Harb Perspect Med 6(3):a026104.

64. Aubrey BJ, Kelly GL, Janic A, Herold MJ, Strasser A (2017) How does p53 induce apoptosis and how does this relate to p53-mediated tumour suppression? Cell Death Differ 25(1):104–113.

65. Vogelstein B, Lane D, Levine AJ (2000) Surfing the p53 network. Nature 408(6810):307–310.

66. Kandoth C, et al. (2013) Mutational landscape and significance across 12 major cancer types. Nature 502(7471):333–339.

67. Kussie PH, et al. (1996) Structure of the MDM2 oncoprotein bound to the p53 tumor suppressor transactivation domain. Science 274(5289):948–53.

68. Schon O, Friedler A, Bycroft M, Freund SM., Fersht AR (2002) Molecular mechanism of the interaction between MDM2 and p53. J Mol Biol 323(3):491–501.

69. Showalter SA, Bruschweiler-Li L, Johnson E, Zhang F, Brüschweiler R (2008) Quantitative lid dynamics of MDM2 reveals differential ligand binding modes of the p53 binding cleft. J Am Chem Soc 130(20):6472–6478.

70. Lum JK, Neuweiler H, Fersht AR (2012) Long-range modulation of chain motions within the intrinsically disordered transactivation domain of tumor suppressor p53. J Am Chem Soc 134(3):1617–1622.

71. Li C, et al. (2010) Systematic mutational analysis of peptide inhibition of the p53-MDM2/MDMX interactions. J Mol Biol 398(2):200–13.

72. Borcherds W, et al. (2014) Disorder and residual helicity alter p53-Mdm2 binding affinity and signaling in cells. Nat Chem Biol 14(4):1000–1002.

73. Crabtree MD, et al. (2017) Conserved helix-flanking prolines modulate intrinsically disordered protein:Target affinity by altering the lifetime of the bound complex. Biochemistry 56(18):2379–2384.

74. Ferreon JC, et al. (2009) Cooperative regulation of p53 by modulation of ternary complex formation with CBP/p300 and HDM2. Proc Natl Acad Sci USA 106(16):6591–6.

75. Flicek P, et al. (2013) Ensembl 2013. Nucleic Acids Res 41(Database issue):D48-55.

76. Vilella AJ, et al. (2009) EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res 19(2):327–35.

77. Kersey PJ, et al. (2012) Ensembl Genomes: An integrative resource for genome-scale data from non-vertebrate species. Nucleic Acids Res 40(D1):91–97.

78. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410.

79. Cameron RA, Samanta M, Yuan A, He D, Davidson E (2009) SpBase: The sea urchin genome database and web site. Nucleic Acids Res 37(Database issue):D750-4.

80. Wyffels J, et al. (2014) SkateBase, an elasmobranch genome project and collection of molecular resources for chondrichthyan fishes. F1000 Res 3:191.

Page 43: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

43

81. Mehta TK, et al. (2013) Evidence for at least six Hox clusters in the Japanese lamprey (Lethenteron japonicum). Proc Natl Acad Sci USA 110(40):16044–9.

82. Penn O, et al. (2010) GUIDANCE: A web server for assessing alignment confidence scores. Nucleic Acids Res 38(Web Server issue):W23–W28.

83. Sela I, Ashkenazy H, Katoh K, Pupko T (2015) GUIDANCE2: Accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters. Nucleic Acids Res 43(Web server issue):W7–W14.

84. Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol 30(4):772–780.

85. Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464. 86. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6:

Molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30(12):2725–2729.

87. Guindon S, et al. (2010) New algorithms and mehtods to estimate maximum-likelihood phylogenies: Asessing the performance of PhyML 2.0. Syst Biol 59(3):307–321.

88. Anisimova M, Gascuel O (2006) Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol 55(4):539–552.

89. Pauling L, Zuckerkandl E, Henriksen T, Lövstad R (1963) Chemical paleogenetics. Molecular “restoration studies” of extinct forms of life. Acta Chem Scand 17:S9-16.

90. Stackhouse J, Presnell SR, McGeehan GM, Nambiar KP, Benner SA (1990) The ribonuclease from an ancient bovid ruminant. FEBS Lett 262(1):104–106.

91. Malcom BA, Wilson KP, Matthews BW, Kirsch JF, Wilson AC (1990) Ancestral lysozymes reconstructed, neutrality tested, and thermostability linked to hydrocarbon packing. Nature 345(6517):86–89.

92. Jermann TM, Opitz JG, Stackhouse J, Benner SA (1995) Reconstructing the evolutionary history of the artiodactyl ribonuclease superfamily. Nature 374(6517):57–59.

93. Chang J-M, Tommaso P Di, Notredame C (2014) TCS: A new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction. Mol Biol Evol 31(6):1625–1637.

94. Yang Z, Kumar S, Nei M (1995) A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141(4):1641–1650.

95. Malatesta F (2005) The study of bimolecular reactions under non-pseudo-first order conditions. Biophys Chem 116(3):251–256.

96. Teufel DP, Freund SM, Bycroft M, Fersht AR (2007) Four domains of p300 each bind tightly to a sequence spanning both transactivation subdomains of p53. Proc Natl Acad Sci USA 104(17):7009–14.

97. Zdzalik M, et al. (2010) Interaction of regulators Mdm2 and Mdmx with transcription factors p53, p63 and p73. Cell Cycle 9(22):4584–91.

98. Gomes Dos Santos H, Nunez-Castilla J, Siltberg-Liberles J (2016) Functional diversification after gene duplication: Paralog specific regions of structural disorder and phosphorylation in p53, p63, and p73. PLoS One 11(3):e:0151961.

99. Risso VA, Gavira JA, Mejia-Carmona DF, Gaucher EA, Sanchez-Ruiz JM (2013) Hyperstability and substrate promiscuity in laboratory resurrections of Precambrian β-lactamases. J Am Chem Soc 135(8):2899–902.

100.Finnigan GC, Hanson-Smith V, Stevens TH, Thornton JW (2012) Evolution of increased complexity in a molecular machine. Nature 481(7381):360–4.

Page 44: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

44

101.Xu X, Zhou ZH, Prum RO (2001) Branched integumental structures in Sinornithosaurus and the origin of feathers. Nature 410(6825):200–204.

102.Thisse C, Neel H, Thisse B, Daujat S, Piette J (2000) The Mdm2 gene of zebrafish (Danio rerio): Preferential expression during development of neural and muscular tissues, and absence of tumor formation after overexpression of its cDNA during early embryogenesis. Differentiation 66(2–3):61–70.

103.Storer NY, Zon LI (2010) Zebrafish models of p53 functions. Cold Spring Harb Perspect Biol 2(8):a001123.

Page 45: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan
Page 46: Evolution of Interactions Involving Intrinsically ...uu.diva-portal.org/smash/get/diva2:1178141/FULLTEXT01.pdf · I , Hultqvist G Åberg E, Camilloni C, Sundell GN, Andersson E, Dogan

Acta Universitatis UpsaliensisDigital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Medicine 1424

Editor: The Dean of the Faculty of Medicine

A doctoral dissertation from the Faculty of Medicine, UppsalaUniversity, is usually a summary of a number of papers. A fewcopies of the complete dissertation are kept at major Swedishresearch libraries, while the summary alone is distributedinternationally through the series Digital ComprehensiveSummaries of Uppsala Dissertations from the Faculty ofMedicine. (Prior to January, 2005, the series was publishedunder the title “Comprehensive Summaries of UppsalaDissertations from the Faculty of Medicine”.)

Distribution: publications.uu.seurn:nbn:se:uu:diva-336083

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2018