Reconstructing the Evolutionary History of MCPH genes and its Implications in Human Brain Size and Intelligence By Nashaiman Pervaiz National Center for Bioinformatics Faculty of Biological Sciences Quaid-i-Azam University Islamabad, Pakistan 2019

Reconstructing the Evolutionary History of MCPH genes

and its Implications in Human Brain Size and

Intelligence

By

Nashaiman Pervaiz

National Center for Bioinformatics

Faculty of Biological Sciences

Quaid-i-Azam University

Islamabad, Pakistan

2019

Reconstructing the Evolutionary History of MCPH genes

and its Implications in Human Brain Size and

Intelligence

By

Nashaiman Pervaiz

A thesis submitted in partial fulfillment of

the requirements for the degree of

DOCTOR OF PHILOSOPHY

IN

BIOINFORMATICS

National Center for Bioinformatics

Faculty of Biological Sciences

Quaid-i-Azam University

Islamabad, Pakistan

2019

Acknowledgements

My deepest gratitude is to Allah Almighty, the most beneficent and the most merciful, who bestowed on me the potential to seek knowledge and to explore some of the many aspects of His creation. Countless blessings and clemencies of Allah be upon our Holy Prophet Hazrat Muhammad (P.B.U.H.), the fountain of knowledge, who took humanity out of the abyss of ignorance and elevated it to the zenith of consciousness.

With deep regard and profound respect, I take this opportunity to express my deep sense of gratitude and indebtedness to my supervisor, Dr. Amir Ali Abbasi, for his inspiring guidance, encouragement, and valuable suggestions throughout the research work. Without his continuous support and assistance, it would not have been possible to finish this dissertation. I am extremely grateful to him for giving me his precious time. I am also thankful to all faculty members of the National Center for Bioinformatics for their sincere and kind attitude, guidance, and cooperation during my period of study at this university.

I would like to extend my sincere thanks to all my colleagues at the Comparative and Evolutionary Genomics (CEG) lab, in particular Rabail Zehra, Shahid Ali, Dr. Rashid Minhas, Irfan Hussain and Fatima Batool, for their constructive discussions, utmost cooperation, valuable memories and for providing a peaceful environment during my stay in the CEG lab. I would be failing in my duty if I did not express my thanks and gratitude to my junior lab fellows Anabia Sohail, Irum Javaid Siddiqui and Noor us Sehar for their cooperation and memorable company. It was a pleasure and an honour to work with Shahid Ali, Anabia Sohail, Irum Javaid Siddiqui and Dr. Rashid Minhas.

I would like to thank the entire staff of the National Center for Bioinformatics, in particular Mr. Talib Hussain, Mr. Yasir Abbasi, Mr. Ali, Mr. M. Naseer, Mr. Masood and Mr. Naseer Ahmed Raja, for their kind cooperation.

I am also thankful to my roommates Anam Murad and Waheeda Rana for being most supportive and caring, and for their refreshing discussions on various topics.

This journey would not have been possible without the support of my family. I am especially indebted to the two most precious and substantially unique hominins of this universe, my father Rana Pervaiz Akhter Khan and my mother Anjum Pervaiz, who taught and supported me throughout my education and gave me the liberty to choose what I desired. I have no words to express my immense affection for my parents, but I salute them for the selfless love, care, pain and sacrifice with which they shaped my life. I would like to thank my mentor, my adorable brother Adnan Pervaiz, who has inspired me since my childhood with his hard work, dedication and optimistic approach to accomplishing any project. He has always encouraged me whenever I was down and pushed me to fly high. Many thanks to my best friend, my delectable brother Rana Qaisar Pervaiz, who has always stood with me in every decision I took and supported me unconditionally. He believed in me more than I believe in myself. I am greatly indebted to my elder sister Shabnam Pervaiz, who has kept me (mostly) sane through her critiques. Her encouragement, support and unwavering faith in my ability to muddle through have been a great help. My most fervent thanks go to two homininae, my beloved younger sisters Hina Pervaiz and Nida Pervaiz, who, through long phone conversations at crucial times, helped me find the strength to continue my work and helped me come to the decision, once and for all, that obtaining this degree would not be the first challenge in my life that I would not rise to meet. No words can thank them enough for their contribution to my life.

In the end, I want to offer my unending thanks to all those who prayed for my betterment and serenity.

Nashaiman Pervaiz

Contents

List of Figures
List of Tables
List of Abbreviations
Summary
Introduction
1.1 Human Brain Evolution
1.1.1 Human brain regions evolved during Pliocene-Pleistocene epochs
1.2 Autosomal recessive primary microcephaly (MCPH)
1.3 Primary microcephaly genes and their functions
1.3.1 Microcephalin
1.3.2 WD repeat domain 62 (WDR62)
1.3.3 Cyclin-dependent kinase 5 regulatory associated protein 2 (CDK5RAP2)
1.3.4 Kinetochore scaffold 1 (KNL1)
1.3.5 Abnormal spindle-like microcephaly associated gene (ASPM)
1.3.6 Centrosomal associated protein J (CENPJ)
1.3.7 SCL/TAL1 interrupting locus (STIL)
1.3.8 Centrosomal protein 135 (CEP135)
1.3.9 Centrosomal protein 152 (CEP152)
1.3.10 Zinc finger protein 335 (ZNF335)
1.3.11 Polyhomeotic homolog 1 (PHC1)
1.3.12 Cyclin-dependent kinase 6 (CDK6)
1.3.13 SAS-6 centriolar assembly protein (SASS6)
1.3.14 Major facilitator superfamily domain containing 2A (MFSD2A)
1.3.15 Citron rho-interacting serine/threonine kinase (CIT)
1.3.16 Kinesin family member 14 (KIF14)
1.4 The cost of human brain size enlargement
1.5 Parkinson's disease
1.5.1 Alpha synuclein
1.6 Archaic human genomes
1.7 Aims & approach of study
Materials and Methods
2.1 Dataset for genes linked with autosomal recessive primary microcephaly
2.2 Sequence Alignment
2.3 Phylogenetic tree reconstruction methods
2.3.1 Phylogenetic analysis by neighbor Joining (NJ) method
2.3.2 Phylogenetic analysis by Maximum likelihood (ML) method
2.4 Ancestral state reconstruction
2.5 Analysis of molecular macroevolution
2.5.1 Estimation of selective pressure on MCPH protein coding genes
2.5.2 Codon substitutions site models
2.5.3 Codon substitutions Branch-site model
2.5.4 Clade model C (CmC) analyses
2.6 Statistical Analysis
2.7 Detecting selection at microevolutionary level
2.7.1 Sequence acquisition of human population data
2.7.2 Frequency spectrum based method for natural selection
2.8 Molecular evolution of synuclein genes
2.8.1 Sequence and structure analysis of synuclein genes
2.8.2 Estimation of functional divergence among synuclein genes
2.8.3 Identification of coevolutionary relationship among residues within gene
Results
3.1 Identification of candidate genes
3.2 WD repeat domain 62 (WDR62)
3.2.1 Evolutionary history of MCPH2 gene WDR62
3.2.2 Molecular evolution of WDR62 in mammals
3.2.3 Human polymorphisms and signatures of selection
3.2.4 SWAKK analysis of WDR62
3.2.5 Comparative analysis of WDR62 with archaic humans and modern human populations
3.3 SCL/TAL1 interrupting locus (STIL)
3.3.1 Evolutionary history of STIL
3.3.2 Molecular Evolution of STIL in Mammals by Site Models
3.3.3 Episodic selection at various stages of primate evolution in STIL locus
3.3.4 Divergent selection pressure between clades of mammals for STIL locus
3.3.5 Human polymorphisms and signatures of selection
3.3.6 SWAKK analysis of STIL
3.3.7 Comparative analysis of STIL with archaic humans and modern human populations
3.4 Centrosomal Protein 135 (CEP135)
3.4.1 Evolutionary history of CEP135
3.4.2 Estimation of pervasive signals of positive selection in CEP135 during placental mammals
3.4.3 Signature of positive selection by branch-site model
3.4.4 Divergent selective pressure across CEP135 mammalian phylogeny
3.5 Zinc finger protein 335 (ZNF335)
3.5.1 Evolutionary history of ZNF335
3.5.2 Molecular evolution of ZNF335 in mammals by site models
3.5.3 Signatures of episodic positive selection at various evolutionary stages from ancestral primate to human terminal branch
3.5.4 Divergent selection pressure between different partitions of mammalian phylogeny
3.6 Polyhomeotic homolog 1 (PHC1)
3.6.1 Phylogenetic analysis of PHC1
3.6.2 Molecular evolution of PHC1 by site models
3.6.3 Episodic Selection at PHC1 mammalian phylogeny
3.6.4 Divergent selective constraints across PHC1 mammalian phylogeny
3.7 Cyclin Dependent Kinase 6 (CDK6)
3.7.1 Phylogenetic analysis of CDK6
3.7.2 Molecular evolution of CDK6 by site model
3.7.3 Episodic positive selection on CDK6 phylogeny
3.7.4 Divergent selective constraint across CDK6 mammalian phylogeny
3.8 SAS-6 centriolar assembly protein (SASS6)
3.8.1 Evolutionary history of SASS6
3.8.2 Molecular Evolution of SASS6 in mammals by site models
3.8.3 Signature of episodic positive selection at SASS6 mammalian phylogeny
3.8.4 Divergent selective constraints between partitions of SASS6 mammalian phylogeny
3.9 Major Facilitator Superfamily Domain Containing 2A (MFSD2A)
3.9.1 Phylogenetic analysis of MFSD2A
3.9.2 Pervasive adaptive evolution of MFSD2A in placental mammals
3.9.3 Episodic adaptive evolution across the MFSD2A mammalian Phylogeny
3.9.4 Divergent selective constraint across the MFSD2A mammalian phylogeny
3.10 Citron rho-interacting serine/threonine kinase (CIT)
3.10.1 Evolutionary history of CIT gene
3.10.2 Molecular evolution of CIT across eutherian
3.10.3 Molecular evolution of CIT protein coding gene by branch-site model
3.10.4 Divergent selective pressure across CIT mammalian phylogeny
3.11 Kinesin Family Member 14 (KIF14)
3.11.1 Evolutionary history of KIF14
3.11.2 Pervasive adaptive evolution in KIF14 across eutherian mammals
3.11.3 Episodic positive selection across KIF14 mammalian phylogeny
3.11.4 Site-specific functional divergence among the partitions of KIF14 mammalian phylogeny
3.12 Synuclein gene family
3.12.1 Evolutionary history of synuclein family
3.12.2 Sequence evolution and Coevolutionary relationship
3.12.3 Structural evolution of α synuclein
3.12.4 Divergent selective constraint among synuclein genes
Discussion
Conclusion and future prospects
References

List of Figures

Figure 1.1: Comparative brain size of extant primates.
Figure 1.2: Endocranial differences in genus Homo during Pliocene-Pleistocene.
Figure 1.3: Comparative view of normal and microcephalic brain.
Figure 1.4: MCPH genes from different pathways that cause microcephaly.
Figure 1.5: The role of WDR62 during neocorticogenesis.
Figure 1.6: Circular illustration of Parkinson's disease associated genes on human chromosomes.
Figure 2.1: Phylogenetic tree of 48 placental mammal genomes.
Figure 3.1: Evolutionary history of MCPH2 gene WDR62.
Figure 3.2: Estimation of WDR62 sequence evolution in therian.
Figure 3.3: SWAKK plot of human and chimpanzee WDR62.
Figure 3.4: Comparative analysis of WDR62 among human populations.
Figure 3.5: Phylogenetic analysis of STIL gene.
Figure 3.6: Sliding window analysis of STIL.
Figure 3.7: Evolutionary history of MCPH8 gene CEP135.
Figure 3.8: Phylogenetic tree of MCPH10 gene ZNF335 using NJ approach.
Figure 3.9: Phylogenetic tree of human PHC1 and its putative paralogs.
Figure 3.10: Evolutionary history of MCPH12 gene CDK6.
Figure 3.11: Evolutionary history of human SASS6 gene.
Figure 3.12: Phylogenetic tree of MCPH15 gene MFSD2A.
Figure 3.13: Evolutionary history of human CIT gene.
Figure 3.14: Evolutionary history of human KIF14 gene.
Figure 3.15: Evolutionary history of synuclein family.
Figure 3.16: Sequence alignment of human synuclein paralogs.
Figure 3.17: Coevolutionary relationship within synuclein genes.
Figure 3.18: Structural deviation among synuclein paralogs.
Figure 3.19: Structural evolution of α synuclein protein since the split from the last common sarcopterygian ancestor.
Figure 3.20: Structural analysis of mutant models of human α synuclein.
Figure 4.1: Human neocortical cell types.
Figure 4.2: Schematic overview of neurodegenerative (a, b, e) and neuroprotective (c, d) role of alpha synuclein.

List of Tables

Table 1.1: Comparative brain size of extinct and extant primates.
Table 3.1: Amino acid substitutions in human and chimpanzee lineage since the divergence from hominini ancestor.
Table 3.2: Parameter estimation and LRT for Mammals STIL.
Table 3.3: Branch-site analysis of STIL.
Table 3.4: Divergent selection constraint parameters estimation and likelihood scores for STIL.
Table 3.5: Tests for departure from neutrality through population's variation data (1000 genome).
Table 3.6: Human and chimpanzee specific substitutions in STIL after the divergence from hominini ancestor.
Table 3.7: Selective pressure estimation and LRT for Mammals CEP135.
Table 3.8: Branch-site analysis of CEP135.
Table 3.9: Divergent selection constraint parameters estimation and likelihood scores for CEP135.
Table 3.10: Parameter estimation and LRT for Mammals ZNF335.
Table 3.11: Branch-site analysis of ZNF335.
Table 3.12: Divergent selection constraint parameters estimation and likelihood scores for ZNF335.
Table 3.13: Parameter estimation and LRT for Mammals PHC1.
Table 3.14: Branch-site analysis of PHC1.
Table 3.15: Divergent selection constraint parameters estimation and likelihood scores for PHC1.
Table 3.16: Parameter estimation and LRT for Mammals CDK6.
Table 3.17: Branch-site analysis of CDK6.
Table 3.18: Divergent selection constraint parameters estimation and likelihood scores for CDK6.
Table 3.19: Parameter estimation and LRT for Mammals SASS6.
Table 3.20: Branch-site analysis of SASS6.
Table 3.21: Divergent selection constraint parameters estimation and likelihood scores for SASS6.
Table 3.22: Parameter estimation and LRT for Mammals MFSD2A.
Table 3.23: Branch-site analysis of MFSD2A.
Table 3.24: Divergent selection constraint parameters estimation and likelihood scores for MFSD2A.
Table 3.25: Parameter estimation and LRT for Mammals CIT.
Table 3.26: Branch-site analysis of CIT.
Table 3.27: Divergent selection constraint parameters estimation and likelihood scores for CIT.
Table 3.28: Parameter estimation and LRT for Mammals KIF14.
Table 3.29: Branch-site analysis of KIF14.
Table 3.30: Divergent selection constraint parameters estimation and likelihood scores for KIF14.
Table 3.31: Sites under negative selection constraint in α synuclein among vertebrates alignment with SLAC analysis.
Table 3.32: Parameter estimation and likelihood score for synuclein family to detect functional divergence.
Table 3.33: Statistical significance of functional divergence among synuclein family.
Table 3.34: Type 1 functional divergence of synuclein family.
Table 4.1: Chimpanzee, hominin and human specific amino acid replacements in MCPH genes since the divergence from hominini ancestor.

Page 16: Reconstructing the Evolutionary History of MCPH genes and ...prr.hec.gov.pk/jspui/bitstream/123456789/10997/1/Nashaiman Perva… · particular Mr. Talib Hussain, Mr. Yasir Abbasi,

List of Abbreviations

iv

List of Abbreviations

MCPH Autosomal recessive Primary microcephaly

HC Head circumference

OFC Occipitofrontal circumference

SD Standard deviation

MRI Magnetic resonance imaging

MCPH1 Microcephalin

WDR62 WD repeat domain 62

CDK5RAP2 Cyclin-dependent kinase 5 regulatory associated protein 2

KNL1 Kinetochore scaffold 1

CASC5 Cancer susceptibility candidate 5

ASPM Abnormal spindle-like microcephaly associated gene

CENPJ Centrosomal associated protein J

STIL SCL/TAL1 interrupting locus

CEP135 Centrosomal protein 135

CEP152 Centrosomal protein 152

ZNF335 Zinc finger protein 335

PHC1 Polyhomeotic homolog 1

CDK6 Cyclin-dependent kinase 6

SASS6 SAS-6 centriolar assembly protein

MFSD2A Major facilitator superfamily domain containing 2A

CIT Citron rho-interacting serine/threonine kinase

KIF14 Kinesin family member 14

CINP CDK2 interacting protein

asl Asterless

IQ Isoleucine glutamine

DHA Docosahexanoic acid

BBB Blood brain barrier

BRCT Breast cancer1 carboxyl-terminal

RT-PCR Reverse transcriptase polymerase chain reaction

AD Alzheimer‘s disease

Page 17: Reconstructing the Evolutionary History of MCPH genes and ...prr.hec.gov.pk/jspui/bitstream/123456789/10997/1/Nashaiman Perva… · particular Mr. Talib Hussain, Mr. Yasir Abbasi,

List of Abbreviations

v

PD Parkinson‘s disease

ALS Amyotrophic lateral sclerosis

MS Multiple sclerosis

LBD Lewy bodies‘ disease

MSA Multiple System Atrophy

NCBI National Center for Biotechnology Information

NJ Neighbor Joining

ML Maximum likelihood

WAG Whelan And Goldman

JTT Jones, Taylor, and Thornton

ASR Ancestral sequence reconstructions

LRT Likelihood ratio test

BEB Bayes Empirical Bayes

NEB Naïve empirical Bayes

CmC Clade model C

GY94 Goldman and Yang 94

SLAC Single likelihood ancestor counting

DIVERGE DetectIng Variability in Evolutionary Rates among Genes

MISTIC Mutual Information Server To Infer Coevolution

MI Mutual information

cMI Cumulative mutual information

pMI Proximity mutual information

CHB Han Chinese in Beijing, China

CHS Han Chinese south China

JPT Japanese in Tokyo, Japan

MXL People with Mexican ancestry in Los Angeles

PUR Puetro Ricans in Puetro Rico

CLM Colombians in Medellin, Colombia

IBS Iberian population in Spain

TSI Toscani in Italia

CEU Uttah residents with ancestry from northern and western Europe

GBR British from England and Scotland UK

Page 18: Reconstructing the Evolutionary History of MCPH genes and ...prr.hec.gov.pk/jspui/bitstream/123456789/10997/1/Nashaiman Perva… · particular Mr. Talib Hussain, Mr. Yasir Abbasi,

List of Abbreviations

vi

FIN Finnish in Finland

ASW People with africans ancestry in southwest united states

LWK Luhya in webuyo, Kenya

YRI Yoruba in Ibadan Nigeria

SWAKK Sliding window analysis of Ka/Ks

CDS Coding sequences

SNCA α synuclein

SNCB β synuclein

SNCG γ synuclein

BP Basal progenitor

bRG Basal radial galia

VZ Ventricular zone

iSVZ Inner subventricular zone

oSVZ Outer ventricular zone

IZ Intermediate zone

CP Cortical plate

SRGAP2 SLIT-ROBO Rho GTPase activating protein 2

HARE5 Human accelerated region 5

ADCYAP1 Adenylate-cyclase-activating polypeptide 1

ROS Reactive oxygen species

MPP+ 1-methyl-4-phenylpyridinium

PKCδ Protein Kinase C delta

HAT Histone acetyltransferase


Summary

Background: The enlarged, globular brain is the most distinctive anatomical feature of human evolution and sets us apart from our extinct and extant relatives. Within a short evolutionary time, the human brain expanded to about three times the size of that of our closest living relative, the chimpanzee. Major episodes of human brain size expansion occurred from the Upper Pliocene to the Early Pleistocene and again in the Middle Pleistocene. The exact genetic basis of the evolutionary changes that separate the highly cognitive human brain from the supposedly less cognitive brains of nonhuman hominids remains enigmatic. However, it is presumed that the larger and more complex human brain emerged through essential changes in genes and non-coding regulatory elements. One approach to understanding human brain evolution is to examine the evolution of genes indispensable for normal brain development. Although brain development is a genetically complex process, genes associated with early brain development are the best candidates for understanding the mechanisms involved in the evolutionary expansion of human brain size. Primary microcephaly genes were selected because of their key role in early brain development and because mutations in these genes cause a severe reduction in the size of the cerebral cortex, the region most notably expanded during recent human history. The brain size of microcephalic patients is similar to that of Pan troglodytes and of the very early hominid, the gracile australopithecine Australopithecus afarensis (the average brain size of australopithecines is 450 cm³), suggesting that primary microcephaly genes are likely to have been evolutionary targets in the enlargement of the human brain. In this study, the implications of primary microcephaly genes for the evolutionary enlargement of human brain size were explored through a comprehensive evolutionary analysis of ten newly identified microcephaly genes (WDR62, STIL, CEP135, ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT, and KIF14) across 48 eutherian species. The study subsequently also explored the mechanisms that link the evolutionary expansion of human brain size with Parkinson's disease by studying the molecular evolution of the Parkinson's disease-linked alpha-synuclein gene.

Results: By employing codon substitution site models based on the maximum likelihood method, signatures of pervasive positive selection were identified in five MCPH genes (STIL, ZNF335, SASS6, CIT and KIF14). For primates, positive


selection was found solely in KIF14, whereas in nonprimate placental mammals four genes (STIL, ZNF335, SASS6, and CIT) exhibited signatures of adaptive evolution. Across placental mammals as a whole, however, pervasive positive selection acted on STIL, ZNF335 and KIF14. This study also identified acceleration in the coding sequences of WDR62 and STIL on the human terminal branch using both codon substitution and frequency-based methods, although the acceleration in STIL was not significant with the codon substitution-based method. Furthermore, signatures of divergent selection constraints between clades were significant for only two genes, STIL and SASS6.

In the present study, an attempt was also made to elucidate whether and why Parkinson's disease affects solely Homo sapiens. An evolutionary study of the Parkinson's disease-associated α-synuclein gene revealed that the gene originated specifically at the root of jawed vertebrates and that no substitutions accumulated in the α-synuclein amino acid sequence during the last 35 million years of evolution. Furthermore, structural dynamics showed that, over the course of vertebrate evolutionary history, a region of the amino-terminal domain (amino acids 32 to 58) of α-synuclein evolved continuously at the structural level despite high conservation at the sequence level.

Conclusion: This study concluded that the evolutionary enlargement of human brain size during the Pliocene-Pleistocene period might not be attributable exclusively to the human MCPH coding sequences. Joint human-specific changes in the coding and noncoding regions of human microcephaly loci might have been conducive to modifications in the function of MCPH genes in humans that are likely to be responsible for human brain evolution during the last two million years.

The present study on the evolution of the α-synuclein gene suggests that the region encompassing amino acid residues 32-58 of the amino-terminal domain is critical for normal cellular function and Parkinson's disease pathogenesis.


Chapter 1 Introduction


Introduction

Homo sapiens differs substantially from nonhuman primates in its unique morphological, anatomical, physiological and behavioral features, including relative brain size, bipedalism, craniofacial attributes, small canine teeth, pelvic dimensions, vocal organs, hairless skin, an elongated opposable thumb, shortened fingers, language and advanced tool-making capabilities (Carroll, 2003; Gagneux & Varki, 2001). These human-specific phenotypic traits emerged during the last 6 million years of evolution, after the divergence from the Pan lineage. The evolution of the traits unique to modern humans was not a linear, additive process, and the pattern, magnitude and rate of change can only be studied through comparative analyses, not only of extant and extinct nonhuman primates but also of the extinct hominids that existed during the last 5 million years. Together, extant and extinct primate species can help answer the following questions. First, what distinguishes hominids from Hominidae? Second, what distinguishes hominins from hominids? Third, what distinguishes anatomically modern humans from hominins? And last, when precisely did the unique human traits appear? This view was consolidated by the discovery of the oldest and most primitive extinct hominid, Sahelanthropus tchadensis, which exhibits a chimpanzee-sized brain but later-hominid-like dental, basicranial and facial features, indicating that bipedalism arose soon after the divergence of Homo sapiens from the Pan (Pan troglodytes and Pan paniscus) lineage (Brunet et al., 2002; Zollikofer et al., 2005). The sequencing of extinct hominin and extant nonhuman primate genomes has given evolutionary biologists an opportunity to understand precisely which genetic changes underlie the evolution of distinctively human traits. The ability to pinpoint the genomic changes that have shaped the unique traits of Homo sapiens depends on associating genes with human-specific phenotypes and on detecting modern human-specific changes within coding and noncoding sequences. From a genetic perspective, human-specific traits arose through concomitant changes in protein-coding and conserved non-coding sequences; however, their precise genetic underpinnings still remain enigmatic (Olson & Varki, 2003; Vallender, Mekel-Bobrov, & Lahn, 2008).


1.1 Human Brain Evolution

The paramount defining attribute of human evolution is the structurally complex brain, which differs from that of nonhuman primates in size, shape, organization and function. The Homo sapiens brain is threefold larger than that of our closest extant relative, the chimpanzee, and approximately six- to eightfold larger than those of extant Old World monkeys and platyrrhines (Figure 1.1 & Table 1.1) (Semendeferi & Damasio, 2000; Stephan, Frahm, & Baron, 1981). This expansion is heterogeneous across brain regions; the most notable expansion occurred in the neocortex and has been directly related to the emergence of higher cognitive capabilities such as language, intelligence and social learning (Geschwind & Rakic, 2013). Expansion is not restricted to grey matter; an increase in white matter volume has also contributed to the uniqueness of the modern human brain (Schoenemann, Sheehan, & Glotzer, 2005). Brain size expansion over the last 6 million years did not proceed at a constant rate; it was static or slow in some periods and rapid in others. Until the middle Pliocene, approximately 3-2.5 million years ago, all early hominids had brain sizes similar to those of nonhuman Hominidae; for example, Australopithecus afarensis, which lived between 4 and 2.8 million years ago, had a brain volume of 384 cm³ (McHenry, 1994). Homo erectus, however, appeared approximately 1.9 million years ago and had an average brain volume of 950 cm³ (Rightmire, 2004). Between 1.9 and 1 million years ago, brain size did not change significantly. Thereafter, accelerated brain expansion occurred in Middle Pleistocene species such as Homo heidelbergensis, which is considered to be another ancestor of archaic hominins and had an average brain size three times larger than that of Pan (Table 1.1) (Rightmire, 2013). Thus, significant rapid expansion in brain size occurred in the Early Pleistocene and again in the Middle Pleistocene. Homo neanderthalensis had a larger brain, approximately 1512 cm³, than Homo sapiens, whose average brain size is 1355 cm³ (Table 1.1).

The globular brain shape of Homo sapiens (modern humans) distinguishes us from our closest extinct archaic relatives, Homo neanderthalensis, indicating that globularity emerged after the divergence of anatomically modern humans from Neandertals and Denisovans approximately 500,000 years ago (Gunz et al., 2012; Neubauer, Hublin, & Gunz, 2018; Prüfer et al., 2014). Fossil evidence revealed that Homo sapiens from 130,000 years ago had a more globular brain shape than those living 200,000 years ago. However, it is evident that Homo sapiens brain shape evolved gradually and


directionally within Homo sapiens in the Upper Pleistocene, between 100,000 and 35,000 years ago (Neubauer et al., 2018).

Figure 1.1: Comparative brain size of extant primates.

Blue lines highlight the superior temporal sulcus. Species within the red circle are apes, while those within the purple and cyan circles are Old World monkeys and platyrrhines, respectively. Adapted from (Bryant & Preuss, 2018).


Table 1.1: Comparative brain size of extinct and extant primates.

Primate species             Brain size (cm³)  Estimated age (million years)  Geological epoch
Homo sapiens                1355              0-0.2                          Middle Pleistocene
Homo neanderthalensis       1512              0.03-0.550/750                 Middle Pleistocene
Homo heidelbergensis        1198              0.3-1                          Middle Pleistocene
Homo erectus                1016              0.2-1.9                        Upper Pliocene
Homo ergaster               854               1.5-1.9                        Upper Pliocene
Homo rudolfensis            752               1.8-2.4                        Pliocene
Homo habilis                552               1.6-2.3                        Pliocene
Paranthropus boisei         510               1.2-2.2                        Pliocene
Australopithecus africanus  457               2.6-3                          Pliocene
Australopithecus afarensis  384               3-3.6                          Pliocene
Sahelanthropus tchadensis   370               ~ 6-7                          Upper Miocene
Pan troglodytes             336               ~ 0-7                          Upper Miocene
Pan paniscus                311               ~ 6-7                          Upper Miocene
Gorilla gorilla             425               ~ 0-7/9                        Miocene
Pongo abelii                445               ~ 0-14                         Miocene
Old World monkeys           33-205            ~ 0-25                         Upper Oligocene
Platyrrhini                 4-123             0-35/40                        Upper Eocene

Taken from (Carroll, 2003; Semendeferi & Damasio, 2000; Vallender et al., 2008; Zollikofer et al., 2005).

1.1.1 Human brain regions evolved during Pliocene-Pleistocene epochs

The surface area of the human cerebral cortex has increased threefold during the last 5 million years since the divergence from the chimpanzee, but the majority of this enlargement was initiated in the upper Pleistocene. The prefrontal, temporal and parietal lobes, which are thought to be involved in higher cognitive capabilities, are larger in humans than in nonhuman primates, and enlargement of these areas is generally associated with cultural and behavioral complexity. Prefrontal volume expanded in humans and comprises a disproportionately large amount of both grey and white matter compared with nonhuman primates (Donahue, Glasser, Preuss, Rilling, & Van Essen, 2018). Furthermore, relative to nonhuman hominids, the orbitofrontal cortex (a region of the prefrontal cortex) is markedly wider in anatomically modern humans. Although some widening of the parietal regions is observed in Neandertals, generalized expansion of the entire parietal surface is a unique characteristic of anatomically modern humans (Bruner, 2010). Furthermore, bulging of the upper parietal surface is also specific to modern humans. The parietal lobe is involved in speech decoding, numerical processing and sensory information processing, so the parietal bulging and expansion specific to modern humans might have implications for species-specific higher cognitive specialization (Figure 1.2). Anatomically modern humans have relatively larger overall


volume, white matter volume and an apomorphic location of the temporal lobe (Bastir et al., 2011). Homo sapiens had significantly larger cerebellar hemispheres than Homo neanderthalensis, most prominently on the right side. Larger cerebellar hemispheres have been implicated in executive functions, including language processing, working memory capacity and social complexity (Kochiyama et al., 2018). During the evolution of the Homo sapiens brain, enlargement was not the only evolutionary change; alterations also occurred at the microstructural, organizational and connectional levels, including the higher-order organization of the cortex, cellular and laminar organization, and long-distance cortical connections (Todd M Preuss, 2011). Modern human brain enlargement was accompanied by extensive modification of, and an increased number of, connections between the subregions of the brain.

Figure 1.2: Endocranial differences in the genus Homo during the Pliocene-Pleistocene.

Both modern humans and Neandertals show widening of the frontal and lateral parieto-temporal lobes compared with other hominids, whereas enlargement of the entire parietal lobe and bulging of the parietal surface are found only in modern humans. [Adapted from (Bruner, 2010)]

1.2 Autosomal recessive primary microcephaly (MCPH)

The term microcephaly is derived from two Greek words: micro from "mikros" (small) and cephaly from "kephale" (head). The prominent phenotype of individuals with microcephaly is a small head size (Figure 1.3). Autosomal recessive primary microcephaly (MCPH) is a rare congenital disorder of brain development characterized by a reduced head circumference (HC) or occipitofrontal circumference (OFC) that is


more than three standard deviations (SD) below the mean at birth, with mild to moderate mental retardation in the absence of any other neuroanatomical etiology (Woods, Bond, & Enard, 2005). The small occipitofrontal circumference is a consequence of a reduction in the size of the cerebral cortex, which leads to a simplified gyral pattern without affecting cortical thickness. Magnetic resonance imaging (MRI) of primary microcephaly patients has shown a reduction in brain size that particularly affects the frontal lobes of the cerebral cortex, owing to a neuronal proliferation defect, while the architecture of the brain remains normal (Desir, Cassart, David, Van Bogaert, & Abramowicz, 2008; Saadi et al., 2009). Primary microcephaly patients usually have intellectual disability and language delay, with a varying degree of motor delay. The incidence of primary microcephaly is higher in Middle Eastern and Asian populations, where consanguineous marriages are more common, than in Caucasian populations. The reported prevalence of primary microcephaly is 1 in 10,000 in Asian and Middle Eastern populations. Primary microcephaly is a captivating disorder and is considered to be the consequence of an atavistic process because it disturbs the brain-to-body size ratio, such that the brain size of a microcephaly patient is equivalent to that of our closest living relatives, the great apes, and of extinct early hominids (Sahelanthropus and Australopithecus) (Mochida & Walsh, 2001).
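The size criterion in the definition above can be read as a threshold on a standardized score. The following is a minimal sketch, assuming that a reference mean and standard deviation for head circumference at birth are available; the function names and the numeric reference values are hypothetical placeholders chosen for illustration, not clinical norms and not taken from the cited studies.

```python
def hc_z_score(hc_cm, ref_mean_cm, ref_sd_cm):
    """Number of standard deviations by which hc_cm deviates from the reference mean."""
    if ref_sd_cm <= 0:
        raise ValueError("reference SD must be positive")
    return (hc_cm - ref_mean_cm) / ref_sd_cm


def meets_size_criterion(hc_cm, ref_mean_cm, ref_sd_cm, cutoff_sd=-3.0):
    """True when the head circumference lies more than |cutoff_sd| SD below the mean."""
    return hc_z_score(hc_cm, ref_mean_cm, ref_sd_cm) < cutoff_sd


# Hypothetical reference values (illustrative only, not clinical norms).
REF_MEAN_CM = 35.0
REF_SD_CM = 1.0

print(hc_z_score(31.0, REF_MEAN_CM, REF_SD_CM))            # z-score of a 31 cm measurement
print(meets_size_criterion(31.0, REF_MEAN_CM, REF_SD_CM))  # measurement below the cutoff
print(meets_size_criterion(33.0, REF_MEAN_CM, REF_SD_CM))  # measurement within 3 SD of the mean
```

Only the size threshold is captured here; the accompanying clinical requirements, such as the absence of any other neuroanatomical etiology, are not modeled.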

Figure 1.3: Comparative view of normal and microcephalic brain.


Left: microcephalic patient; right: age-matched control. The microcephalic patient shows a severe reduction in brain volume, a poorly developed frontal lobe and agenesis of the rostrum of the corpus callosum (white arrow) compared with the age-matched control. [Adapted from (Kaindl et al., 2010)].

1.3 Primary microcephaly genes and their functions

Autosomal recessive primary microcephaly is a genetically heterogeneous disorder. At least eighteen loci (MCPH1-18) on different human chromosomes have been identified as responsible for primary microcephaly (Table 1.1) (H. Li et al., 2016). The underlying genes include MCPH1, WDR62, CDK5RAP2, CASC5, ASPM, CENPJ, STIL, CEP135, CEP152, ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT and KIF14 (Awad et al., 2013; Basit et al., 2016; Bond et al., 2002; Bond et al., 2005; Genin et al., 2012; Guernsey et al., 2010; Gul et al., 2006; Muhammad Sajid Hussain et al., 2012; Muhammad S Hussain et al., 2013; Jackson et al., 2002; Khan et al., 2014; Kumar, Girimaji, Duvvari, & Blanton, 2009; H. Li et al., 2016; Moawia et al., 2017; Adeline K Nicholas et al., 2010; Y. J. Yang et al., 2012). Almost all MCPH genes are expressed mainly in the fetal brain and make a dominant contribution to the regulation of neurogenesis and cytokinesis, which in turn control brain size (Basit et al., 2016).

Figure 1.4: MCPH genes from different pathways that cause microcephaly.


Sixteen MCPH genes have been identified, of which nine encode centrosomal proteins and two encode cytokinesis proteins. The organelles implicated in primary microcephaly are shown in red. [Adapted from (Jayaraman, Bae, & Walsh, 2018)].

1.3.1 Microcephalin

The microcephalin gene (MCPH1) contains 14 exons spanning a genomic region of 241,905 bp on human chromosome 8p23.1 (Jackson et al., 2002). MCPH1 was the first gene identified as a cause of autosomal recessive primary microcephaly, in two Pakistani families. MCPH1 encodes a protein of 835 amino acids that contains three BRCT (breast cancer 1 carboxyl-terminal) domains, one at the amino terminus and two at the carboxyl terminus. BRCT domains are present in DNA repair and cell cycle proteins, where they are involved in protein-DNA and protein-protein interactions, particularly with proteins that are phosphorylated on serine/threonine residues (Huyton, Bates, Zhang, Sternberg, & Freemont, 2000; Yu, Chini, He, Mer, & Chen, 2003). Mutations in MCPH1 are responsible not only for autosomal recessive primary microcephaly but also for premature chromosome condensation syndrome (Trimborn et al., 2004). MCPH patients with mutations in microcephalin can have a head circumference more than 4 standard deviations below the mean at birth (Evans et al., 2005). RT-PCR expression studies of human fetal tissue show that MCPH1 is expressed at similar levels in fetal brain, kidney and liver (Jackson et al., 2002). MCPH1 expression has also been noted at low levels in other tissues such as heart, lungs, spleen, thymus and skeletal muscle, and in some adult tissues (Jackson et al., 2002). In situ hybridization studies revealed that MCPH1 is expressed at a high level during neurogenesis in the developing forebrain, specifically in the walls of the lateral ventricles (Jackson et al., 2002). Microcephalin regulates BRCA1 and BRCA2 and contributes to the DNA repair process; disruption of this repair mechanism through loss of function of the DNA repair protein can lead to excessive programmed cell death during neurogenesis, which explains the reduction in brain size caused by mutated microcephalin. Recently, microcephalin has been used as a novel biomarker for the diagnosis of breast cancer associated with BRCA1 inactivation (Richardson et al., 2011).

1.3.2 WD repeat domain 62 (WDR62)

The MCPH2 gene, WDR62, is the second most common cause of primary microcephaly after ASPM. Homozygous mutations in WDR62 cause primary microcephaly and other malformations of cortical development such as


lissencephaly, pachygyria, agenesis of the corpus callosum and schizencephaly (Bilguvar et al., 2010; Kousar et al., 2011). Heterozygous mutations in WDR62 cause polymicrogyria in humans (Murdock et al., 2011). WDR62 resides on human chromosome 19q13.12 and spans a genomic region of 50,230 bp (Memon et al., 2013). The gene comprises 32 exons and encodes a spindle pole protein 1523 amino acids long.

WDR62 is characterized by fifteen amino-terminal WD40 domains, an MKK7β1-binding domain, a JNK-binding domain and a loop-helix domain at the carboxyl terminus (Pervaiz & Abbasi, 2016). The carboxyl-terminal region of WDR62 does not share definite sequence homology with any known protein. WDR62 localizes to the nucleus, cytoplasm and spindle pole, and its localization is contingent on cell type and cell cycle stage (Bogoyevitch et al., 2012). WDR62 is highly expressed in the ventricular and subventricular zones of the neuroepithelium during neocorticogenesis (Bilguvar et al., 2010; Adeline K Nicholas et al., 2010). During neurogenesis, cortical neurons originate from progenitor cells in the ventricular zone of the developing brain. The progenitor cells undergo cycles of proliferative symmetric division before switching to neurogenic asymmetric division. The transition from proliferative to neurogenic division is controlled by spindle pole orientation; a defect in spindle pole orientation disrupts this switch and ultimately leads to primary microcephaly (Figure 1.5). WDR62 has been implicated in spindle pole formation and in the regulation of spindle orientation, and might have been conducive to the prolonged, human-specific phase of proliferative neural division that is consistent with the expansion of human brain size (Cohen-Katsenelson, Wasserman, Khateb, Whitmarsh, & Aronheim, 2011; A. K. Nicholas et al., 2010).

1.3.3 Cyclin-dependent kinase 5 regulatory associated protein 2 (CDK5RAP2)

Mutation of the MCPH3 gene, CDK5RAP2, is considered a very rare cause of primary microcephaly because only eleven families affected by MCPH3 mutations have been reported worldwide (Abdullah et al., 2017; Bond et al., 2005; Moynihan et al., 2000). CDK5RAP2 was the third gene identified as a cause of primary microcephaly; it comprises 34 exons and is located on human chromosome 9q33.2 (Bond et al., 2005; Moynihan et al., 2000). As with WDR62, homozygous missense


mutations in CDK5RAP2 have been identified as a cause of hypoplasia of the corpus callosum (ACC), which is present in 3-5% of individuals affected by neurodevelopmental disorders (Jouan et al., 2016). CDK5RAP2 is expressed in the developing brain, kidney, lungs, placenta and testis (Park et al., 2015).

Figure 1.5: The role of WDR62 during neocorticogenesis.

During neocorticogenesis, WDR62 is thought to play a key role both in the symmetric division of apical precursors and in the migration of neurons. Homozygous mutations disrupt the normal function of WDR62, which consequently alters the timing of proliferative division and also affects the migration of neurons to their final destination. M: marginal zone, CP: cortical plate, SVZ: subventricular zone, and VZ: ventricular zone. [Adapted from (Wollnik, 2010)].

CDK5RAP2 is a 215 kDa pericentriolar protein, also known as centrosome-associated protein 215 (CEP215), that comprises 1893 amino acids and contains an EB1-binding domain, a CDK5R1-interacting domain, a p53-binding domain and several SMC (structural maintenance of chromosomes) domains (Sukumaran et al., 2017). The amino-terminal region of CDK5RAP2 contains a γTuRC-binding site that is considered indispensable for the recruitment of γTuRC to the centrosome, which leads to the production of microtubules and formation of the spindle pole (Sukumaran et al., 2017). CDK5RAP2 is also involved in DNA damage signaling, centriole replication, asymmetric centriole inheritance and spindle checkpoint function (Barr, Kilmartin, & Gergely, 2010; Barrera et al., 2010; Lizarraga et al., 2010; X. Zhang et al., 2009). It has been implicated in cell fate determination during neurogenesis, and loss of function


of CDK5RAP2 could cause premature depletion of neural progenitor cells and thereby primary microcephaly (Barr et al., 2010). A previous study of mutated CDK5RAP2 in the Hertwig's anemia mouse showed malformations in the development of the cerebral cortex that ultimately caused a severe reduction in the brain size of mutant mice at birth (Lizarraga et al., 2010). During neocorticogenesis, CDK5RAP2-deficient cortical progenitors display defects in spindle pole orientation, followed by increased early cell cycle exit and excessive apoptosis of neuronal cells, which ultimately reduce the cortical precursor pool (Lizarraga et al., 2010).

1.3.4 Kinetochore scaffold 1 (KNL1)

KNL1, also known as CASC5 (cancer susceptibility candidate 5), holds 18 exons spanning a genomic region of 70,322 bp within the MCPH4 locus on human chromosome 15q15.1 (Genin et al., 2012; Jamieson, Govaerts, & Abramowicz, 1999). KNL1 is a 265 kDa kinetochore protein of 2342 amino acids. KNL1 is expressed at a higher level in the ventricular zone than in the subventricular zone of the neocortex of the developing brain at gestational weeks 13-16 (Fietz et al., 2012). Homozygous mutations in KNL1 have been reported in different geographic regions and cause primary microcephaly in Moroccan and Pakistani families (Genin et al., 2012; Szczepanski et al., 2016). Cognitive function was moderately to severely impaired in individuals affected with MCPH4. These mutations induce skipping of exons 18 and 25, which in turn creates a frameshift and introduces a premature stop codon, ultimately producing C-terminally truncated proteins. The carboxyl terminus of KNL1 contains the regions essential for interaction with ZWINT-1 and the NSL1-MIS12 complex, which is indispensable for normal chromosomal alignment and segregation (Genin et al., 2012; Szczepanski et al., 2016). KNL1 is also implicated in the regulation of DNA damage signaling, as impaired KNL1 function in mutant fibroblasts causes overactive pathways and eventually chromosomal instability (Szczepanski et al., 2016). Carboxyl-terminally truncated KNL1 altered the shape of nuclei from ovoid (as normally seen in control cells) to lobulated and fragmented (Szczepanski et al., 2016). KNL1-deficient primary fibroblasts showed chromosome misalignment and premature mitotic arrest, resulting in inappropriate symmetric and asymmetric division and ultimately in inefficient neural proliferation and abnormal brain size (Kiyomitsu, Obuse, & Yanagida, 2007). It is interesting to note that a novel phosphorylation site was


identified in the human KNL1 protein at serine residue 1076, which originated after the human-Pan split approximately six to seven million years ago (D. S. Kim & Hahn, 2011). The gain of this phosphorylation site solely in humans might play a role in human cell division and in the evolution of brain size during the Pliocene-Pleistocene era.

1.3.5 Abnormal spindle-like microcephaly associated gene (ASPM)

The MCPH5 gene, ASPM, contains 28 exons covering a genomic region of 62,566 bp on human chromosome 1q31.3 (Bond et al., 2002; Pattison et al., 2000). ASPM is considered the most common cause of primary microcephaly, as recessive mutations in ASPM have been found in 60% of affected individuals to date (Létard et al., 2018). The human ASPM protein contains 3477 amino acids and is annotated with a putative amino-terminal microtubule-binding domain, two calponin homology domains and 81 IQ (isoleucine-glutamine) motifs, the number of which is highly variable among orthologs. ASPM localizes to the centrosome during interphase and to the spindle pole from prophase through telophase (Fish, Kosodo, Enard, Pääbo, & Huttner, 2006; Zhong, Liu, Zhao, Pfeifer, & Xu, 2005). The Drosophila melanogaster ortholog of human ASPM, asp, is responsible for organizing and binding together microtubules at the spindle pole, and mutations in asp cause premature mitotic arrest, resulting in reduced central nervous system development (Gonzalez et al., 1990). In mice, however, recessive mutations in ASPM cause not only mild microcephaly but also major defects in male and female fertility and a reduction in testis size in adult mice (Pulvers et al., 2010). Within the human ASPM coding region, 147 mutations associated with primary microcephaly have been reported in HGMD Professional 2017. Primary microcephaly patients with ASPM mutations have normal gyral patterning and cortical structure, in contrast to those with WDR62 mutations (Bilguvar et al., 2010; Bond et al., 2002). RT-PCR analysis showed ASPM expression in various embryonic and adult tissues. During fetal development, human ASPM is expressed in brain, heart, lungs, liver, stomach, spleen, colon, skeletal muscle and skin (Bond et al., 2002; Kouprina et al., 2005; Rhoads & Kenguele, 2005). Northern blot and in situ hybridization analyses showed increased ASPM expression during cortical neurogenesis, particularly at embryonic days E14.5 and E16.5 in the ventricular zone (Bond et al., 2002). The role of ASPM during neurogenesis has been directly assessed. ASPM is expressed at a high level in the ventricular zone during proliferative division and is progressively downregulated with the switch from

Page 33: Reconstructing the Evolutionary History of MCPH genes and ...prr.hec.gov.pk/jspui/bitstream/123456789/10997/1/Nashaiman Perva… · particular Mr. Talib Hussain, Mr. Yasir Abbasi,

Chapter 1 Introduction

13

symmetric proliferative to asymmetric neuroepithelial division demonstrating its role

in neuron production (Fish, et al., 2006). Like WDR62, ASPM controls the transition

of proliferative to neurogenic division. ASPM and WDR62 interact and perform

indispensible role in centriole duplication. Deletion of one of the two genes (WDR62

and ASPM) greatly enhanced the mutated phenotype of other gene such as leads to

severity of primary microcephaly, while deletion of both genes is embryonically lethal

(Jayaraman et al., 2016). ASPM and WDR62 play role in centriole biogenesis

regulation, cell fate determination and apical complex that explain the functions of

these gene in brain expansion (Jayaraman, et al., 2016).

1.3.6 Centromere protein J (CENPJ)

As with MCPH3, mutations in the MCPH6 gene are considered a rare cause of primary microcephaly. The human CENPJ gene harbours 17 exons encoding 1338 amino acids and covers 39,847 bp of genomic DNA within the MCPH6 locus at human chromosome 13q12.12-12.13. A splicing mutation in CENPJ has been reported to cause Seckel syndrome (Al-Dosari, Shaheen, Colak, & Alkuraya, 2010). CENPJ is highly expressed in brain and spinal cord, but it is also widely expressed at a low level in the developing embryo (Bond, et al., 2005). In early neurogenesis, expression of CENPJ is detected primarily in the neuroepithelium of the frontal cortex (Bond, et al., 2005). The protein contains a microtubule-binding domain, a microtubule-destabilizing domain that harbours the 112 amino acid long PN2-3 motif, five coiled-coil domains and two 14-3-3 binding sites at the carboxyl terminus (Chen, Olayioye, Lindeman, & Tang, 2006; Hung, Chen, Chang, Li, & Tang, 2004). CENPJ localizes to the centrosome throughout the cell cycle in a microtubule-independent manner. CENPJ is also associated with the gamma-tubulin complex. In vivo analysis revealed that microtubule assembly is initiated at the centrosome by the gamma-tubulin complex (Schiebel, 2000). CENPJ binds to tubulin heterodimers through the PN2-3 motif and not only inhibits microtubule nucleation from the centrosome but also depolymerizes microtubules, indicating a role in microtubule assembly at the centrosome and kinetochore that is predicted to be important for proper chromosome segregation during mitosis (Hung, et al., 2004). SAS-4, the ortholog of human CENPJ in Caenorhabditis elegans, plays a key role in controlling centrosome organization and centriole duplication (Kirkham, Müller-Reichert, Oegema, Grill, & Hyman, 2003; Leidel & Gönczy, 2005). Mutations in DSas-4, the Drosophila melanogaster ortholog of human CENPJ, result in loss of centrioles during embryonic development, and about 30% of asymmetric neuroblast divisions are abnormal (Basto et al., 2006). DSas-4-deficient cells show abnormal spindle formation that is responsible for dramatic chromosome segregation defects, probably due to the loss of centrioles (Rodrigues-Martins, Riparbelli, Callaini, Glover, & Bettencourt-Dias, 2008). The role of CENPJ in regulating centriole biogenesis and maintaining centrosome integrity might explain its control of brain size during human development, and loss of function of CENPJ probably causes primary microcephaly through the loss of mature centrosomes and impaired spindle positioning (Cho, Chang, Chen, & Tang, 2006).

1.3.7 SCL/TAL1 interrupting locus (STIL)

The STIL gene, located at the MCPH7 locus, harbours 20 exons encoding a 1288 amino acid long pericentriolar protein and spans a genomic region of 63,018 bp at human chromosome 1p33 (Kumar, et al., 2009). STIL, also known as SIL, was initially linked with T-cell acute lymphoblastic leukemia (Aplan, Lombardi, & Kirsch, 1991). Homozygous truncating mutations in human STIL are responsible not only for primary microcephaly but also for lobar holoprosencephaly (Kakar et al., 2015; Kumar, et al., 2009). STIL is expressed in essentially all human fetal tissues at gestational week 16, and its expression in the developing brain indicates a role in neuronal proliferation (Kumar, et al., 2009). Expression of human STIL is reported in proliferating cells during early embryonic development and is highest in human fetal thymus, bone marrow, colon and liver (Izraeli & Colaizzo-Anas, 1997). In situ hybridization in mice showed expression of STIL in the developing cerebral cortex, specifically in the subventricular neuroepithelial cells at embryonic day E14.5 (Kumar, et al., 2009; C. M. Smith et al., 2006). STIL-deficient mice die in utero after embryonic day E10.5. Several developmental abnormalities appear in STIL-deficient mice between embryonic days E7.5 and E8.5, such as restricted development, reduced proliferation and enhanced apoptosis, neural tube closure defects, holoprosencephaly, left-right asymmetry defects and overall reduced size compared with wild-type mice (Izraeli et al., 1999). Correspondingly, zebrafish embryos mutant for csp, the zebrafish ortholog of STIL, show an elevated mitotic index with disorganized mitotic spindles, and homozygous csp-mutant embryos die at an early developmental stage (Pfaff et al., 2007). STIL is essential for proper spindle pole organization in vertebrates and regulates centrosome integrity and mitosis (Castiel et al., 2011). STIL interacts directly with another MCPH protein, CENPJ, through residues 231-619; this complex in turn binds spindle assembly abnormal protein 6 homolog (SASS6), forming a complex that is important in cell division and in the regulation of centriole biogenesis. A mutation in human CENPJ significantly decreases its binding capacity for STIL (Tang et al., 2011). Furthermore, STIL is phosphorylated at serine/threonine sites during mitosis, which promotes its binding to Pin1 and affects the duration of the spindle checkpoint (Campaner, Kaldis, Izraeli, & Kirsch, 2005).

1.3.8 Centrosomal protein 135 (CEP135)

A biallelic truncating mutation in the CEP135 gene was identified as responsible for MCPH8 (Muhammad Sajid Hussain, et al., 2012). The MCPH8 gene CEP135 contains 26 exons spanning a genomic region of 84,382 bp at human chromosome 4q12 (Muhammad Sajid Hussain, et al., 2012). CEP135 is expressed in the neuroepithelium of the mouse cerebral cortex during embryonic days E11.5-15.5. The CEP135 gene encodes a highly conserved centrosomal protein of 1140 amino acids that contains two coiled-coil regions at the amino terminus. Like CENPJ, CEP135 is a centrosomal component and localizes to the centrosome throughout the cell cycle in a microtubule-independent manner (Ohta et al., 2002). Human CEP135 interacts with the products of two other MCPH genes, CENPJ and hSas6, and this interaction is essential for centriole assembly (Lin et al., 2013). Similar to the MCPH3-associated CDK5RAP2 protein, mutant CEP135 altered the shape of nuclei from oval to lobulated and fragmented in approximately 20% of primary fibroblasts from microcephaly patients (Muhammad Sajid Hussain, et al., 2012). Furthermore, impaired function of CEP135 in primary fibroblasts produced other anomalies, such as centrosome number abnormalities (complete loss of centrosomes was observed in 22% of cells and an elevated centrosome number in 18% of cells) and microtubule organization defects (Muhammad Sajid Hussain, et al., 2012). CEP135-deficient CHO cells and mutant primary fibroblasts showed a significant reduction in growth rate (Muhammad Sajid Hussain, et al., 2012; Ohta, et al., 2002). CEP135 depletion by RNA interference triggered premature centrosome splitting and disorganization of the microtubule arrangement (K. Kim, Lee, Chang, & Rhee, 2008; Ohta, et al., 2002). CEP135 therefore has significant functions in centriole biogenesis, spindle organization and the regulation of cytokinesis.

1.3.9 Centrosomal protein 152 (CEP152)

The gene responsible for the MCPH9 locus on human chromosome 15q21.1 was identified as CEP152; it contains 26 exons and encodes a 1710 amino acid long centrosomal protein (Guernsey, et al., 2010). CEP152 was previously assigned to the MCPH4 locus, until the discovery of KNL1 gene mutations associated with primary microcephaly (Genin, et al., 2012; Guernsey, et al., 2010). CEP152 is expressed in mouse brain tissues at embryonic stages E12.5 and E14.5 (Guernsey, et al., 2010). Homozygous missense and truncating mutations in CEP152 have been reported to cause primary microcephaly (Guernsey, et al., 2010). Biallelic mutations in CEP152 also cause Seckel syndrome (Kalay et al., 2011). Patients with MCPH9 mutations have a head circumference 5-7 standard deviations below the mean, with a simplified gyral pattern but normal cortical thickness (Guernsey, et al., 2010). A truncated CEP152 mutant was not found at the centrosome in transfected cells (Guernsey, et al., 2010). In contrast, overexpression and antibody staining analyses revealed centrosomal localization of human CEP152 (Kalay, et al., 2011). CEP152-deficient human fibroblasts showed multiple nuclei of variable size, fragmented centrosomes and an elevated level of aberrant cell division, and these cells seemed to be arrested in early anaphase (Kalay, et al., 2011). CEP152 is involved in maintaining genomic integrity and in the DNA damage response through interaction with the genome maintenance protein CINP (CDK2-interacting protein) (Kalay, et al., 2011). Mutations in asl (asterless), the Drosophila melanogaster ortholog of CEP152, lead to embryonic arrest and male infertility in flies (Blachon et al., 2008). Subcellular localization analysis of the asl gene product revealed its association with the centrosome at the periphery of centrioles, where it regulates the initiation of centriole duplication (Blachon, et al., 2008).

1.3.10 Zinc finger protein 335 (ZNF335)

ZNF335, also known as NIF-1, contains 28 exons encoding 1342 amino acids and spans 23,519 bp of genomic DNA at human chromosome 20q13.12 (Y. J. Yang, et al., 2012). Mutations in the MCPH10 gene ZNF335 cause one of the most severe forms of primary microcephaly, with a head circumference 9 standard deviations below the mean (Sato et al., 2016; Stouffs et al., 2018; Y. J. Yang, et al., 2012). MRI of patients with MCPH10 mutations revealed a cortex greatly reduced in size relative to the skull, with a more severe simplified gyral pattern, enlarged extra-axial space, invisible basal ganglia and neuronal disorganization (Stouffs, et al., 2018; Y. J. Yang, et al., 2012). ZNF335 is ubiquitously expressed in a variety of human fetal (brain, lung, liver and kidney) and adult organs. ZNF335 expression is elevated during mouse cortical neurogenesis at embryonic stages E13-E15 (Y. J. Yang, et al., 2012). During neurogenesis, ZNF335 is expressed in the ventricular and subventricular zones and, at its lowest level, in the developing cortical plate. ZNF335 is essential for embryonic development, as ZNF335 deficiency in mice increases cell death and is embryonically lethal at an early developmental stage, before E7.5 (Y. J. Yang, et al., 2012). ZNF335 is involved in the regulation of histone trimethylation and controls the expression level of a variety of somatic and brain developmental genes. Microarray analysis revealed that ZNF335-deficient neurons display reduced expression of brain developmental genes, particularly the DLX homeobox genes and the REST/NRSF and Co-REST 2 genes, which are involved in early brain development and neurogenesis, respectively (Y. J. Yang, et al., 2012). ZNF335 is essential for neurogenesis and for neuronal differentiation and migration in mammals.

1.3.11 Polyhomeotic homolog 1 (PHC1)

A mutation in the PHC1 gene has been reported to affect two siblings with primary microcephaly in a consanguineous Saudi family, through a recessive mode of inheritance (Awad, et al., 2013). It is the eleventh gene (MCPH11) to be associated with primary microcephaly. PHC1 holds 15 exons spanning a genomic region of 25,194 bp at human chromosome 12p13.31 (Awad, et al., 2013). PHC1, also known as EDR1, encodes a 1004 amino acid long protein characterized by the presence of a carboxyl-terminal SAM (sterile alpha motif) domain. It is considered an essential component of polycomb repressive complex 1 (PRC1), which maintains the transcriptionally repressed state of HOX genes (Isono et al., 2005). Like the product of another MCPH gene, ZNF335, PHC1 is localized in the cell nucleus (Alkema et al., 1997; Cmarko, Verschure, Otte, van Driel, & Fakan, 2003; N. Hashimoto et al., 1998). The mutation in PHC1 increases the expression of GMNN (geminin) and decreases H2A ubiquitination and the recruitment of PHC1 to chromatin, owing to a reduction in PHC1 protein level, and ultimately impairs the DNA repair system and cell cycle activity in patients’ cells (Awad, et al., 2013). Microarray analysis of PHC1-mutated patient cells revealed significant dysregulation of genes involved in the cell cycle, cellular proliferation, apoptosis, DNA replication and DNA repair (Awad, et al., 2013). The PHC1 mutation provided the first evidence implicating chromatin remodeling in the pathogenesis of primary microcephaly.

1.3.12 Cyclin-dependent kinase 6 (CDK6)

The CDK6 gene contains eight exons encoding 326 amino acids and covers a large genomic region (221,454 bp) at human chromosome 7q21.2. CDK6 contains a protein kinase domain, is involved in progression through the G1 phase of the cell cycle and regulates the G1 to S phase transition (Russo, Tong, Lee, Jeffrey, & Pavletich, 1998). A missense mutation in CDK6 has been reported as the underlying cause of the MCPH12 locus defect in a Pakistani family (Muhammad S Hussain, et al., 2013). CDK6 is localized in the cytoplasm and nucleus of interphase cells, and it is also observed at the centrosome throughout the mitotic cycle (Ericson, Krull, Slomiany, & Grossel, 2003; Mahony, Parry, & Lees, 1998). Immunofluorescence studies revealed that, during development of the cerebral cortex, CDK6 is expressed in the neuroepithelium at embryonic day E11.5 and in basal progenitor cells at embryonic days E11.5 and E15.5 (Muhammad S Hussain, et al., 2013). CDK6 is a key regulator that maintains the balance between proliferative symmetric and neurogenic asymmetric divisions, which is very important for producing the correct number of neurons (Beukelaers et al., 2011). The CDK6 mutation perturbs the proliferation of apical neuronal precursor cells and might disturb the balance between proliferative and neurogenic divisions, ultimately reducing the number of neurons, which would explain the microcephaly phenotype caused by the MCPH12 gene (Muhammad S Hussain, et al., 2013). In CDK6-mutant fibroblasts, CDK6 was absent from the centrosome and the mitotic spindle was disorganized (Muhammad S Hussain, et al., 2013). As with two other MCPH genes, CEP135 and CEP152, CDK6-mutant fibroblasts also showed other abnormalities, including misshapen nuclei, centrosome number anomalies, microtubule organization defects and a reduced growth rate (Muhammad S Hussain, et al., 2013).

1.3.13 SAS-6 centriolar assembly protein (SASS6)

A homozygous missense mutation in the SASS6 gene has been reported to cause primary microcephaly in a Pakistani family. The mutation occurred in a highly conserved region of the PISA motif, situated within the amino-terminal domain (Khan, et al., 2014; van Breugel et al., 2011). Affected individuals had an occipitofrontal circumference between 6.63 and 19.6 standard deviations below the mean and also had severe mental retardation (Khan, et al., 2014). SASS6 lies at the MCPH14 locus and encompasses 17 exons, spanning a genomic region of 49,392 bp at human chromosome 1p21.2. The SASS6 gene encodes a 657 amino acid long cartwheel protein that is localized at centrioles and in the cytoplasm (Nakazawa, Hiraki, Kamiya, & Hirono, 2007; Strnad et al., 2007). SASS6 is characterized by an amino-terminal domain that contains two highly conserved motifs, a coiled-coil domain and a carboxyl-terminal domain (van Breugel, et al., 2011). SASS6 is essential for procentriole formation and centriole duplication in humans, as depletion of SASS6 blocks centriole duplication and its overexpression leads to centriole amplification (Arquint & Nigg, 2016; Strnad, et al., 2007). In addition to SASS6, the product of another MCPH gene, STIL, is a core component of centriole duplication (Arquint & Nigg, 2016). A knockdown study of DSAS-6, the ortholog of SASS6 in Drosophila melanogaster, revealed a significant reduction in the number of centrosomes in fly brains (Rodrigues-Martins et al., 2007). Furthermore, SAS-6 is necessary for the centrosome duplication cycle in Caenorhabditis elegans, suggesting that the function of SASS6 is evolutionarily conserved from humans to nematodes (Leidel, Delattre, Cerutti, Baumer, & Gönczy, 2005). SASS6 interacts with the products of two other MCPH genes, CEP135 and CENPJ, forming a complex that is necessary for centriole assembly (Lin, et al., 2013). The mutation in human SASS6 partially impairs its function and has a drastic effect on centriole formation and cell division, ultimately affecting neurogenesis (Khan, et al., 2014).
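The head circumference figures quoted in this chapter are standard deviation scores relative to an age- and sex-matched reference. As a minimal sketch of that arithmetic, assuming a known reference mean and standard deviation (the function name and the reference numbers below are hypothetical placeholders, not clinical values and not taken from the cited studies):

```python
def sd_score(measured_cm: float, ref_mean_cm: float, ref_sd_cm: float) -> float:
    """Return how many standard deviations a measurement lies from the reference mean.

    A negative score means the measurement is below the mean.
    """
    if ref_sd_cm <= 0:
        raise ValueError("reference standard deviation must be positive")
    return (measured_cm - ref_mean_cm) / ref_sd_cm


# Hypothetical reference values, for illustration only.
REF_MEAN_CM = 50.0
REF_SD_CM = 1.5

# A measurement of 41 cm is (41 - 50) / 1.5 = -6 SD, i.e. 6 SD below the mean.
score = sd_score(41.0, REF_MEAN_CM, REF_SD_CM)
```

Under this convention, "6 standard deviations below the mean" and a score of -6 describe the same measurement.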

1.3.14 Major facilitator superfamily domain containing 2A (MFSD2A)

The MFSD2A gene comprises 14 exons across a 14,816 bp long genomic region at human chromosome 1p34.2. MFSD2A encodes a 543 amino acid long plasma membrane protein that transports docosahexaenoic acid (DHA), in the form of lysophosphatidylcholine, across the blood-brain barrier (BBB) and is also involved in the formation of the BBB (Ben-Zvi et al., 2014; Nguyen et al., 2014). MFSD2A is expressed in various human fetal and adult (cortex, corpus callosum, cerebellum, pons, spinal cord and liver) tissues. However, it is expressed at an elevated level in the human fetal brain, particularly in the endothelial cells of the BBB (Guemez-Gamboa et al., 2015). Biallelic missense mutations in MFSD2A were reported to cause microcephaly, with or without lethality, in three families of different ethnic origin: Libyan, Egyptian and Pakistani (Alakbarzade et al., 2015; Guemez-Gamboa, et al., 2015). Brain imaging revealed that, in addition to reduced cortex size, affected individuals exhibit other anomalies such as brainstem and cerebellar hypoplasia, cortical surface effacement and a significant deficiency of posterior white matter (Alakbarzade, et al., 2015; Guemez-Gamboa, et al., 2015). The mutations do not affect the expression level or localization of MFSD2A, but they impair its transport activity and reduce brain uptake of lysophosphatidylcholine, suggesting that this transport is essential for normal brain development (Guemez-Gamboa, et al., 2015). Congruently, MFSD2A-knockout mice show DHA deficiency in the brain and exhibit severe microcephaly with cognitive impairment. Furthermore, 40% of MFSD2A-knockout mice die early in life (Berger, Charron, & Silver, 2012; Nguyen, et al., 2014). Danio rerio has two inparalogs, mfsd2aa and mfsd2ab, of human MFSD2A. In situ hybridization analysis revealed that both inparalogs are expressed throughout the nervous system in zebrafish embryos (Guemez-Gamboa, et al., 2015). Knockdown analyses showed that these inparalogs have non-redundant functions in zebrafish, as lethality was observed for each paralog (Guemez-Gamboa, et al., 2015). Together, these results indicate that MFSD2A is a unique primary microcephaly gene, in that it provides new insight into human brain evolution and development.

1.3.15 Citron rho-interacting serine/threonine kinase (CIT)

Biallelic mutations in the CIT gene, located at the MCPH17 locus, cause primary microcephaly in Egyptian and Saudi families (Basit, et al., 2016; H. Li, et al., 2016; Shaheen et al., 2016). The brains of affected individuals showed a simplified gyral pattern, agenesis of the corpus callosum and a profound lack of white matter (Basit, et al., 2016; Shaheen, et al., 2016). Lack of CIT has been reported to cause spindle orientation defects in mammals and insects (Gai et al., 2016). CIT is localized at the central spindle and co-localizes with the product of another MCPH gene, ASPM, at the midbody, and the two may function together in neural progenitor division (Paramasivam, Chang, & LoTurco, 2007). The CIT gene contains 48 exons spanning a genomic region of 191,501 bp at human chromosome 12q24.23. CIT encodes the 2069 amino acid long CRIK (citron rho-interacting kinase) protein, which is characterized by two kinase domains, a cysteine-rich domain, a pleckstrin homology domain and a carboxyl-terminal citron homology domain. CIT is expressed in the ventricular zone of the neuroepithelium of the developing neocortex (Di Cunto et al., 2000). Expression of CIT is also observed in adult tissues including brain, lung, kidney and spleen (Di Cunto et al., 1998). CIT has been implicated in the regulation of cytokinesis pattern and progression during central nervous system development. The flathead rat model is characterized by reduced brain size with abnormal cerebral cortex development (Sarkisian, Rattan, D'Mello, & LoTurco, 1999). A single nucleotide deletion in the rat CIT gene has been reported as the causative mutation in the flathead rat; it disrupts cytokinesis in neural progenitor cells and increases apoptosis (Sarkisian, Li, Di Cunto, D'Mello, & LoTurco, 2002). A congruent phenotype was found in CIT-knockout mice, which display a significant reduction in brain size, particularly in the hippocampus, olfactory bulb and cerebellum, due to depletion of neurons, and die before reaching adulthood (Di Cunto, et al., 2000). The CIT-mutant phenotype in humans might therefore result from disrupted cytokinesis and neurogenesis along with an elevated level of apoptosis in neuronal cells.

1.3.16 Kinesin family member 14 (KIF14)

Recently, homozygous and compound heterozygous mutations in the KIF14 gene were reported in four families with primary microcephaly from different geographic regions (Pakistan, Germany and Saudi Arabia) (Moawia, et al., 2017). Another study reported that biallelic KIF14 mutations cause lethal fetal anomalies of the brain and kidney (Filges et al., 2014). The KIF14 gene encompasses 30 exons covering a 69,234 bp long genomic region at human chromosome 1q32.1. KIF14 encodes 1648 amino acids and is characterized by four domains: an amino-terminal PRC1-binding domain, a kinesin motor domain, an FHA domain and a CRIK-binding domain (Moawia, et al., 2017). Like the MCPH17 protein CIT, KIF14 is localized at the central spindle and midbody, and their localizations are codependent (Gruneberg et al., 2006). KIF14 is expressed in brain and kidney, and at an elevated level during fetal brain development, particularly at embryonic days E12.5-16.5 (Fujikura et al., 2013). Depletion of KIF14 in human cells impairs CIT localization at the central spindle and midbody and ultimately induces cytokinesis failure followed by apoptosis (Gruneberg, et al., 2006). KIF14 was not detected at the midbody in primary fibroblasts of affected individuals, which explains the resulting phenotype. A similar phenotype is observed in KIF14-knockout and laggard (a novel spontaneous mouse mutant) mice, which show a reduction in brain size that is most dramatic in the cortices and olfactory bulb, together with hypomyelination and apoptosis (Fujikura, et al., 2013).

1.4 The cost of human brain size enlargement

The expansion of the human brain during evolution underlies the higher cognitive capabilities and complex social interactions that set us apart not only from nonhuman primates but also from our extinct hominid relatives. However, modern humans pay a cost for it in the form of neurodegenerative disorders. Neurodegenerative disorders are a group of chronic disorders characterized by the slow, progressive loss of specific types of neurons in discrete regions of the brain (Gao & Hong, 2008). Neurodegeneration is projected to become the world’s second leading cause of death by the year 2040, overtaking cancer (Kontopoulos, Parvin, & Feany, 2006; Siddiqui, Pervaiz, & Abbasi, 2016). The neurodegenerative disorders include Alzheimer’s disease (AD), Parkinson’s disease, frontotemporal dementia, amyotrophic lateral sclerosis (ALS), multiple sclerosis (MS), Huntington’s disease, Lewy body disease (LBD) and multiple system atrophy (MSA) (Gao & Hong, 2008). Alzheimer’s and Parkinson’s diseases are the two most prevalent neurodegenerative disorders and are considered to be restricted to modern humans. It is generally accepted that dramatic brain evolution during the last 1 million years has made humans susceptible to neurodegeneration. A Swiss study consolidated this view by establishing that the brain regions involved in neurodegenerative disorders evolved recently in modern humans, during the Pleistocene-Pliocene age (Ghika, 2008).

1.5 Parkinson’s disease

Parkinson’s disease (PD) is the second most prevalent neurodegenerative disorder after Alzheimer’s disease, affecting 1–2% of the population above age 65 and 4–5% above age 85 (Bisaglia, Mammi, & Bubacco, 2009). Neuropathologically, it is defined by the presence of Lewy bodies and Lewy neurites and by the loss of 78% of the dopaminergic neurons of the substantia nigra pars compacta, which parallels the loss of dopamine in the neostriatum, a region involved in controlling motor behavior (Siddiqui, et al., 2016).

PD is clinically defined by cardinal signs including bradykinesia, resting tremor, rigidity, masked facial expression and postural instability (Lücking & Brice, 2000). PD is a genetically heterogeneous disease: at least 11 genes that do not belong to a common gene family are reported to cause PD, and many more genes have been identified as susceptibility genes (Figure 1.6). Alpha synuclein was the first causative gene implicated in early-onset familial PD.

Figure 1.6: Circular illustration of Parkinson’s disease associated genes on human chromosomes. Genes that contain causative mutations are shown in blue. The two genes shown in red contain moderate-effect protein-coding risk alleles, while genes shown in black were identified by genome-wide association studies. [Adapted from (Singleton & Hardy, 2016)].

1.5.1 Alpha synuclein

Among the three synucleins, α synuclein has received great attention because it has emerged as a central protein in the pathophysiology of both early-onset autosomal dominant familial and sporadic Parkinson’s disease. Six missense mutations (A30P, E46K, H50Q, G51D, A53T and A53E) in the amino-terminal lipid-binding domain, as well as gene multiplications (duplication and triplication of α synuclein), cause early-onset hereditary Parkinson’s disease (Siddiqui, et al., 2016; Vekrellis, Xilouri, Emmanouilidou, Rideout, & Stefanis, 2011). α synuclein is a vertebrate-specific presynaptic protein that is required for synaptic vesicle endocytosis/exocytosis and also regulates presynaptic architecture and synaptic vesicle distribution (Vargas et al., 2014; Vargas et al., 2017). The α synuclein gene encompasses five exons and encodes a 140 amino acid long, 14.5 kDa protein. It belongs to a family of intrinsically disordered proteins with three members, α, β and γ synuclein, whose genes reside on the human FGFR paralogon and map to chromosomes 4, 5 and 10, respectively (Campion et al., 1995; Lavedan et al., 1998; Spillantini, Divane, & Goedert, 1995). α synuclein is expressed predominantly in the neocortex, hippocampus, striatum, thalamus and cerebellum and is particularly enriched at presynaptic terminals (George, 2002; Solano, Miller, Augood, Young, & Penney, 2000). Expression of α synuclein is also observed in various other cell types, including heart, hematopoietic, lung, ocular, cochlear and skeletal muscle cells, indicating a more general function of α synuclein in addition to its role in the nervous system (M. Hashimoto & Masliah, 1999; Lücking & Brice, 2000; Surguchov, McMahan, Masliah, & Surgucheva, 2001).

1.6 Archaic human genomes

The genome sequences of extinct archaic humans (Neandertals and Denisovans) provide new insight into the concrete events that transpired during human evolution. Archaic humans (Neandertals and Denisovans) split from modern humans relatively recently, approximately 550,000-750,000 years ago, and became extinct almost 30,000 years ago (Prüfer, et al., 2014). These two archaic humans are considered the closest extinct relatives of modern humans. Svante Pääbo’s group at the Max Planck Institute for Evolutionary Anthropology sequenced the complete genomes of the two archaic humans, the Neandertals and Denisovans, at 50× and 30× sequence coverage, respectively (Prüfer, et al., 2014; Reich et al., 2010). Both archaic humans contributed to the ancestry of modern human populations. As mentioned above, Neandertals and modern humans have comparably large brains. Therefore, the identification of hominin-specific (Neandertals, Denisovans and modern humans) and modern human-specific changes through a comparative genomics approach makes it possible to decode, in unprecedented detail, the genetic basis of the evolutionary enlargement of brain size and of the increased susceptibility to neurodegenerative disorders.
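To illustrate what identifying lineage-specific changes involves, the following is a minimal sketch rather than the pipeline used in this study. It assumes the sequences are already aligned to equal length, and it reports columns where one target sequence differs from a residue shared by every other sequence. The function name and the toy alignment are invented for illustration and are not real MCPH or synuclein sequences.

```python
from typing import Dict, List, Tuple


def lineage_specific_sites(
    alignment: Dict[str, str], target: str
) -> List[Tuple[int, str, str]]:
    """Find columns where `target` differs from a residue shared by all others.

    Returns (1-based position, shared residue, target residue) tuples.
    """
    if target not in alignment or len(alignment) < 2:
        raise ValueError("need the target plus at least one other sequence")
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all sequences must be aligned to the same length")

    others = [name for name in alignment if name != target]
    sites = []
    for i, target_res in enumerate(alignment[target]):
        other_res = {alignment[name][i] for name in others}
        # Keep only columns where every non-target sequence agrees.
        if len(other_res) != 1:
            continue
        shared = next(iter(other_res))
        # Skip alignment gaps and unchanged columns.
        if "-" in (shared, target_res) or shared == target_res:
            continue
        sites.append((i + 1, shared, target_res))
    return sites


# Toy, invented alignment for illustration only.
toy_alignment = {
    "modern_human": "MKTLQAR",
    "neandertal": "MKTLEAR",
    "denisovan": "MKTLEAR",
    "chimpanzee": "MRTLEAR",
}
modern_human_specific = lineage_specific_sites(toy_alignment, "modern_human")
```

In this toy case only position 5 qualifies, since position 2 varies among the non-target sequences. Changes shared by all hominins would need the same comparison applied to a group of sequences rather than a single target.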

1.7 Aims & approach of study

There is considerable interest in ascertaining the genetic basis of the differences

responsible for the evolutionary expansion of human brain size in comparison to

nonhuman Hominidae. Evolutionary forces shape species genomes and ultimately

influence the variation in phenotypic traits between species. Identifying

genes that exhibit signatures of natural selection may therefore improve

our understanding of the individuality of Homo sapiens. The human primary

microcephaly genes are compelling candidates for understanding the evolutionary

enlargement of the human brain, because mutations in these MCPH genes

cause a drastic reduction in human cerebral cortex size, to a size approximately similar to

that of nonhuman Hominidae. Along the way, this study also tries to elucidate why solely

modern humans have Parkinson’s disease. The approach comprised the following steps.

I. Paralogs and orthologs of the MCPH candidate genes were identified across a long

evolutionary distance, from subphylum Vertebrata to phylum Porifera, and

phylogenetic trees were reconstructed in order to identify how deeply the genes of interest

are rooted.

II. The strength and direction of natural selection acting on candidate MCPH genes

were tested in different datasets of the established eutherian phylogeny using codon

based substitution models. Natural selection was also tested within modern

humans through frequency based methods for those candidate genes that show an

accelerated rate of evolution in the human branch compared with our closest extant

relatives, Pan.

III. Modern human specific residues were identified by combining data from archaic

humans (Neandertals and Denisovans) with data from placental mammals through

comparative analysis.

IV. Interspecies sequence differences that arise during evolution can ultimately

yield clade specific functions along with altered selective constraints between

clades. Functional divergence was therefore estimated for the orthologous

sequences between different partitions of the eutherian phylogeny, particularly

with respect to the evolutionary expansion of brain size in primates.

V. The evolutionary advantage of enlarged brain size in modern humans came at

the price of susceptibility to Parkinson’s disease. To understand why

hereditary Parkinsonism is specific to humans, molecular evolutionary

analysis was performed on the Parkinson’s disease associated alpha synuclein

gene.

The overall objective of this study is to find the evolutionary genetic and

molecular basis underlying two complex human phenotypic traits: the evolutionary

expansion of human brain size and the human specific neurodegenerative disorder

Parkinson’s disease, using an evolutionary comparative genomics approach.

Chapter 2 Materials and Methods

Materials and Methods

2.1 Dataset for genes linked with autosomal recessive primary microcephaly

Genes from 10 microcephaly loci (MCPH) were included in this analysis. The

chromosomal locations, amino acid sequences and coding sequences of these 10 loci in human

were obtained from the Ensembl genome browser (http://www.ensembl.org) (Hubbard et

al., 2002). The closest putative paralogs of these 10 MCPH genes in human were

obtained both from the Ensembl paralogy prediction approach and by a similarity search

approach. The similarity search was carried out by performing a reciprocal

BLASTP search of the gene of interest against the human protein databases available at the

National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov)

(Altschul, Gish, Miller, Myers, & Lipman, 1990; Pruitt, Tatusova, & Maglott, 2007).

To identify the true paralogs of these 10 human MCPH protein coding genes,

phylogenetic analyses were carried out. The orthologous amino acid and coding

sequences of these MCPH genes in other metazoan species were retrieved from

Ensembl through reciprocal BLAST/BLAT searches. Because the genomic information of

many metazoan species was not available in the Ensembl genome browser, the

orthologous sequences of those species were collected from the sequence databases

available at NCBI (http://www.ncbi.nlm.nih.gov) using a bidirectional BLAST hit

strategy (Altschul, et al., 1990; Pruitt, et al., 2007).
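The reciprocal (bidirectional) hit logic can be illustrated with a minimal sketch. It assumes that BLAST results have already been parsed into (query, subject, bit score) tuples; the function names and sequence identifiers are hypothetical and are not taken from the pipeline used in this study.

```python
def best_hits(rows):
    """Map each query to its highest-scoring subject.

    rows: iterable of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in rows:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows):
    """Keep only pairs that are each other's best hit in both search directions."""
    forward = best_hits(forward_rows)   # species A proteins searched against species B
    reverse = best_hits(reverse_rows)   # species B proteins searched against species A
    return {q: s for q, s in forward.items() if reverse.get(s) == q}
```

A pair is retained as a putative ortholog only when the best hit of the query in the second species has the query as its own best hit in the reverse search.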

2.2 Sequence Alignment

Multiple sequence alignments are important in comparative and evolutionary genomics

studies. They enable phylogenetic tree estimation, duplication timing estimation,

ancestral sequence reconstruction, structure prediction, natural selection analyses, and

critical residue identification. CLUSTAL W and MUSCLE are two widely used

algorithms for aligning homologous nucleotide and amino acid sequences (Edgar,

2004; Thompson, Higgins, & Gibson, 1994). CLUSTAL W builds multiple sequence

alignments by a progressive method in which the most similar sequences are

aligned first and more distantly related sequences are then added

progressively. For phylogenetic analysis of each MCPH gene family (WDR62, STIL,

ZNF335, SASS6, PHC1, MFSD2A, CEP135, CDK6, CIT, and KIF14), amino acid

sequences were aligned using CLUSTAL W and MUSCLE with default parameters

(Edgar, 2004; Thompson, et al., 1994).

Natural selection analyses require multiple sequence alignments generated

by a phylogeny aware alignment algorithm. PRANK is a probabilistic multiple sequence

(DNA, codon, and amino acid) alignment program that provides more evolutionarily accurate

alignments than other alignment methods (Löytynoja & Goldman, 2008).

PRANK treats insertions and deletions as distinct evolutionary events,

introduces gaps instead of aligning overly divergent sequences, and thereby reduces the number of

false positives in evolutionary analyses (Fletcher & Yang, 2010; Löytynoja, 2014). The

orthologous coding sequences retrieved from placental mammals for each MCPH

protein coding gene (STIL, ZNF335, SASS6, PHC1, MFSD2A, CEP135, CDK6, CIT,

and KIF14) were aligned with PRANK using default parameters for the empirical codon

model, with the eutherian phylogeny as guide tree (Löytynoja & Goldman,

2008). The mammalian orthologous coding sequences of each MCPH gene were also

aligned with MUSCLE based on the codon model with default parameters for the

reconstruction of the mammalian phylogenetic tree for each gene (Edgar, 2004).

2.3 Phylogenetic tree reconstruction methods

2.3.1 Phylogenetic analysis by Neighbor Joining (NJ) method

Phylogenies for each MCPH gene family and the synuclein family were constructed to

understand the depth and evolutionary histories of the MCPH genes. The phylogenetic

trees for each MCPH gene family and the synuclein family were reconstructed by

including the amino acid sequences from representative members of Vertebrata,

Urochordata, Cephalochordata, Hemichordata, Echinodermata, Arthropoda, Mollusca,

Annelida, Cnidaria, Placozoa and Porifera through the NJ method (Saitou & Nei, 1987). The NJ

method reconstructs a tree from a distance matrix that contains the pairwise evolutionary

distances between the sequences in a group. The uncorrected proportion (p)

distance, Poisson correction and Jones, Taylor, and Thornton (JTT) methods were used

as amino acid substitution models to calculate evolutionary distances between

sequences (Jones, Taylor, & Thornton, 1992; Zuckerkandl & Pauling, 1965). All

positions containing gaps and missing data were removed using the

complete deletion option. The topological reliability of each NJ tree was evaluated

by the bootstrap method, which produces a bootstrap score for each evolutionary

relationship between the branches of the tree on the basis of 500-1000

pseudoreplicates (Felsenstein, 1985).
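As a minimal sketch of the two simplest distance measures named above, the following assumes pre-aligned amino acid sequences of equal length in which gaps are written as "-" and missing data as "?". The function names are illustrative only; the thesis analyses were run in dedicated phylogenetics software, not with this code.

```python
import math


def complete_deletion(sequences):
    """Drop every alignment column that contains a gap ('-') or missing data ('?')."""
    kept = [col for col in zip(*sequences) if all(ch not in "-?" for ch in col)]
    return ["".join(col[i] for col in kept) for i in range(len(sequences))]


def p_distance(seq_a, seq_b):
    """Uncorrected p distance: proportion of aligned sites that differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def poisson_distance(p):
    """Poisson-corrected distance d = -ln(1 - p), which accounts for multiple hits."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log(1.0 - p)
```

The pairwise values obtained this way would populate the distance matrix from which an NJ tree is built.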

2.3.2 Phylogenetic analysis by Maximum likelihood (ML) method

Phylogenetic trees for each MCPH gene family and the synuclein family were also

reconstructed by the character based (cladistic) Maximum Likelihood (ML) method.

The Whelan and Goldman (WAG) model was used as the amino acid substitution model

(Whelan & Goldman, 2001). The phylogenetic trees with the highest log likelihood

scores were selected as the final trees. For ML, initial trees were generated automatically

using the Neighbor Joining and BioNJ methods based on a matrix of pairwise distances

calculated with the Jones, Taylor, and Thornton (JTT) model (Jones, et al., 1992).

Alignment columns containing missing data or gaps were removed using

the complete deletion option. The topological reliability of each ML tree was

evaluated by the bootstrap method with 500-1000 pseudoreplicates

(Felsenstein, 1985).
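The resampling step behind bootstrap pseudoreplicates can be sketched as follows. This is an illustration under stated assumptions (an alignment held as a list of equal-length strings, a fixed random seed, hypothetical function names); tree building and the counting of recovered clades are left to the phylogenetics software.

```python
import random


def bootstrap_alignment(sequences, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement."""
    length = len(sequences[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in sequences]


def bootstrap_replicates(sequences, n_replicates=500, seed=0):
    """Return n_replicates resampled alignments of the same size as the input."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    return [bootstrap_alignment(sequences, rng) for _ in range(n_replicates)]
```

A tree is then inferred from every pseudoreplicate, and the bootstrap score of a branch is the percentage of those trees in which the branch is recovered.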

2.4 Ancestral state reconstruction

Ancestral sequence reconstruction (ASR) of proteins provides understanding of

how natural selection shaped sequences and their functions, which ultimately changed

phenotypic traits during evolution (Groussin et al., 2014). ASR methods

reconstruct the amino acid sequences of extinct ancestors at the internal

nodes of a phylogeny from the sequences and the phylogeny of the extant species.

The Maximum Likelihood (ML) method implemented in MEGA was used to reconstruct

ancestral sequences at the internal nodes of the phylogenetic tree of each protein of interest based

on the WAG amino acid substitution model (Tamura et al., 2011; Whelan & Goldman, 2001;

Z. Yang, Kumar, & Nei, 1995). To minimize the errors associated with ASR analyses,

ancestral sequences were also inferred with the PRANK program, which treats

insertions and deletions as distinct evolutionary events (Löytynoja & Goldman, 2008).

2.5 Analysis of molecular macroevolution

2.5.1 Estimation of selective pressure on MCPH protein coding genes

Estimating the numbers of nonsynonymous (dN or Ka) and synonymous (dS or Ks)

substitutions provides direct insight into the mechanisms of molecular sequence evolution

(Z. Yang, Nielsen, Goldman, & Pedersen, 2000; Z. Yang & Swanson, 2002). The selective

pressure (ω = dN/dS) acting on the coding sequence of a gene is the ratio of the

nonsynonymous to the synonymous substitution rate (Anisimova & Kosiol, 2008; R. Nielsen

& Yang, 1998). The ω ratio specifies the direction and strength of natural selection

operating on a protein coding gene: 0 < ω < 1 indicates negative selection (synonymous

substitutions accumulate faster in the protein coding

sequence than nonsynonymous substitutions), ω = 1 is consistent with

neutral evolution (nonsynonymous and synonymous substitutions accumulate at equal rates in the

protein coding sequence), and ω > 1 represents positive Darwinian selection

(nonsynonymous substitutions accumulate faster than synonymous substitutions in the

protein coding gene). Negative selection operates on a sequence when an existing

function or phenotype is evolutionarily favorable or essential for a particular trait. It is

generally accepted that positive selection favors an adaptive function or phenotype.

The selective pressure ω was measured for nine MCPH genes (WDR62 was excluded from the

codon based maximum likelihood analyses because these methods had already been

applied to the coding sequence of WDR62) by using the maximum likelihood based

codon substitution models implemented in the CodeML program of the PAML4.7

software (Wong, Yang, Goldman, & Nielsen, 2004; Z. Yang, 2007; Z. Yang, et al.,

2000). Several analyses were performed to check whether positive selection was

acting on the coding sequences of autosomal recessive primary microcephaly genes

with respect to brain size evolution. Coding sequences from 48 placental mammalian

species (20 primates and 28 nonprimate placental mammals) provided sufficient

genomic coverage to perform these analyses (Figure 2.1). The ratio of non-synonymous

(Ka) to synonymous (Ks) substitution rates for WDR62 in mammals was calculated

by using the Pamilo–Bianchi–Li method in MEGA5.05 (W. H. Li, 1993).

Figure 2.1: Phylogenetic tree of 48 placental mammal genomes.

The tree shows the 48 species used in this study. The coding sequences of these species for the MCPH loci

were retrieved from the NCBI and Ensembl databases.

2.5.2 Codon substitution site models

Five codon substitution site models (M1, M2, M7, M8, and M8a), fitted by maximum

likelihood in the CodeML program of the PAML4.7 software, were used to

detect positive Darwinian selection in nine MCPH genes across three datasets, i.e.,

primates, nonprimate placental mammals and placental mammals (Wong, et al., 2004;

Z. Yang, 2007; Z. Yang, et al., 2000). Codon substitution site models permit the selective

pressure ω to vary across the codon sites of a protein coding gene but do not allow ω to

vary across lineages. Patterns of selection in the nine genes (STIL, ZNF335, SASS6,

PHC1, MFSD2A, CEP135, CDK6, CIT, and KIF14) were investigated in the above

mentioned three datasets by submitting a well-accepted phylogeny and the alignment

(produced by the PRANK program) of the respective dataset to CodeML. Likelihood

ratio tests (LRTs) were calculated for three pairs of site models from the log likelihood

scores of the five codon substitution site models M1, M2, M7, M8, and M8a to test for signs

of positive selection. The first pair compares the null model M1 (nearly neutral model

that assumes two classes of sites, with ω = 1 and ω < 1) with the alternative

model M2 (positive selection model that assumes an additional third class of sites with ω

> 1) (Wong, et al., 2004; Z. Yang, Wong, & Nielsen, 2005). The second pair compares the

null model M7 (beta) with the alternative model M8 (beta, and ω2 > 1), and the third pair

compares the null model M8a (beta and ω2 = 1) with the alternative model M8

(beta, and ω2 > 1) (Swanson, Nielsen, & Yang, 2003; Z. Yang & Swanson, 2002). The

LRT values for the three pairs of site models were calculated as follows:

LRT = 2(log likelihood score of alternative model – log likelihood score of null model)

The significance of these tests was determined by calculating p values from the LRT values

using the Chi-square program of the PAMLX 1.2 package (B. Xu & Yang, 2013). For this

study, positive selection was inferred only if two out of the three pairs of site models significantly

rejected the null model in favor of the alternative model. The Naïve Empirical Bayes (NEB)

and Bayes Empirical Bayes (BEB) methods implemented in the M8 codon substitution site

model were used to identify positively selected sites by estimating the posterior

probability for site classes (Z. Yang, et al., 2005).
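The LRT computation described above can be sketched in a few lines. This is only an illustration: the log likelihood values in any call are supplied by the user, the function names are hypothetical, and the choice of degrees of freedom for each model pair (commonly 2 for M1 versus M2 and M7 versus M8, and 1 for M8a versus M8) is an assumption not stated in the text. The chi-square tail probability is written in closed form for 1 and 2 degrees of freedom so that no external library is needed.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """LRT = 2 * (log likelihood of alternative model - log likelihood of null model)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, for df = 1 or df = 2 only."""
    if x <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers 1 or 2 degrees of freedom")


def lrt_p_value(lnl_null, lnl_alt, df):
    """p value for rejecting the null model in favor of the alternative model."""
    return chi2_sf(lrt_statistic(lnl_null, lnl_alt), df)
```

For example, `lrt_p_value(-1000.0, -995.0, 2)` evaluates an LRT value of 10 against a chi-square distribution with two degrees of freedom.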

2.5.3 Codon substitution branch-site model

Positive selection acting on a protein coding gene is often transient, operating for a short period of

time and affecting only a fraction of sites. The site models above are unable to detect this

type of transient, episodic positive selection. The branch-site approach of Zhang,

Nielsen and Yang implemented in CodeML was used to test for signatures of episodic

positive selection restricted to a specific lineage or lineages (Jianzhi Zhang,

Nielsen, & Yang, 2005). This model allows ω to vary both across branches and across sites.

In the branch-site model, the phylogeny is divided into prespecified

foreground branches (ω2 >= 1, where a proportion of sites may be under positive selection) and

background branches (where sites experience either purifying selection

or neutral evolution, 0 < ω <= 1). Positive selection was inferred by

calculating the LRT between this branch-site model and the null model (the same branch-site

model but with ω2 = 1 fixed for the foreground branch), using the LRT formula given above

(Jianzhi Zhang, et al., 2005). The eutherian multiple sequence alignment and a well accepted

phylogeny were used as input for the detection of episodic selection at different

evolutionary time points, from the primate ancestral branch to the human terminal branch.

The branch-site test was performed specifically in relation to the evolution of prefrontal forebrain

size.

2.5.4 Clade model C (CmC) analyses

An adaptive function does not necessarily evolve only through positive selection operating

on a protein coding gene. Differences in selective pressure between the clades of a

phylogeny can also be responsible for the adaptive evolution of a particular phenotype or

trait. In the context of brain evolution, cerebral cortex

size began to expand in the ancestor of primates. To determine the pattern of

divergence in selective constraint in nine MCPH genes across the mammalian

phylogeny, the clade model C (CmC) approach implemented in CodeML was used

(Bielawski & Yang, 2004). Clade model C assumes that a proportion of sites has

evolved under divergent selective pressure, but not necessarily under positive selection,

in two or more partitions of the phylogeny defined a priori. The LRT was conducted by

comparing the CmC model with the null model M2a-rel using the same formula as in the

site model analyses (Weadick & Chang, 2011). Both the alternative CmC and the null M2a-rel

models have three classes of sites, the first two with ω < 1 (negative selection) and

ω = 1 (neutral evolution). The third class of sites in M2a-rel has a single ω ratio (ω2 > 0, which

allows sites to evolve adaptively) that is shared among all clades of the phylogeny,

whereas the third class of sites in CmC has as many ω ratios as there are partitions of the phylogeny (for

example, if the phylogeny is divided into two partitions, the third class contains two ω

ratios, ω2 > 0 and ω3 > 0, one for each partition), so that ω varies among the partitions of the

phylogeny (Bielawski & Yang, 2004; Weadick & Chang, 2011). The mammalian multiple

sequence alignment and a well established phylogeny were used as input for the

detection of divergent selective pressure between primates and nonprimate eutherians,

simians and nonsimian eutherians, Catarrhini and noncatarrhine eutherians, Hominidae

and nonhominid eutherians, and Hominini and nonhominin eutherians.

2.6 Statistical Analysis

P values were calculated for the site, branch-site and CmC analyses from the LRT values with the

Chi-square program of the PAMLX 1.2 package (B. Xu & Yang, 2013). The P values of

all codon based maximum likelihood methods were corrected for the false discovery rate

by using the qvalue package in R 3.5.0 (Storey & Tibshirani, 2003; Team, 2018).

The bootstrap method was used for π0 estimation, and fdr.level = 0.05 was specified in the qvalue

package in R 3.5.0 (Storey, Taylor, & Siegmund, 2004).
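The idea behind q values can be sketched without the R package. The code below is a simplified illustration and not the procedure of the qvalue package: it estimates π0 at a single, arbitrarily chosen tuning value λ = 0.5 instead of using the bootstrap method applied in this study, and the function names are hypothetical. With π0 fixed at 1 the calculation reduces to the Benjamini-Hochberg adjustment.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true null hypotheses from p values above lam."""
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def q_values(pvalues, pi0=1.0):
    """q value of each test: the smallest estimated FDR at which it is called significant."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that q values are monotone in p.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[index] / rank)
        q[index] = running_min
    return q
```

Tests whose q value falls below the chosen FDR level (0.05 in this study) would be reported as significant.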

The coding sequences of two MCPH genes (WDR62 and STIL) contain a higher number of

nonsynonymous than synonymous substitutions in modern humans as

compared to chimpanzee, and these sequences evolve at an accelerated rate on the modern human

terminal branch, although not at a significant level. To detect the accelerated segments, a

sliding window analysis of the Ka/Ks ratio was performed on the human and chimpanzee

orthologous coding sequences of WDR62 and STIL. Ka/Ks was computed with a sliding

step of 10 codons (30 nucleotides), and the results were obtained in the form

of a graph drawn by the GNUPLOT software used in SWAKK (Liang, Zhou, &

Landweber, 2006). The non-synonymous substitutions within positively selected

segments (Ka/Ks > 1) were categorized according to their physicochemical properties by

using BLOSUM62 (J. Zhang, 2000). Further tests were conducted to identify

whether a signature of positive selection is present in modern humans.
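The windowing idea can be illustrated with a deliberately crude sketch. It does not reproduce SWAKK or any Ka/Ks estimator: it only counts, per window, the codons that differ between two aligned, ungapped coding sequences and labels each difference as synonymous or nonsynonymous under the standard genetic code. The window width, the function name and the use of raw counts are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def sliding_window_counts(seq_a, seq_b, window=10, step=10):
    """Return (start_codon, nonsynonymous, synonymous) codon differences per window."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, ungapped and a multiple of 3 long")
    codons_a = [seq_a[i:i + 3] for i in range(0, len(seq_a), 3)]
    codons_b = [seq_b[i:i + 3] for i in range(0, len(seq_b), 3)]
    results = []
    for start in range(0, len(codons_a) - window + 1, step):
        nonsyn = syn = 0
        for ca, cb in zip(codons_a[start:start + window], codons_b[start:start + window]):
            if ca == cb:
                continue
            if CODON_TABLE[ca] == CODON_TABLE[cb]:
                syn += 1      # codon changed, amino acid unchanged
            else:
                nonsyn += 1   # codon changed, amino acid changed
        results.append((start, nonsyn, syn))
    return results
```

Windows in which nonsynonymous differences dominate would be the candidates to examine with a proper Ka/Ks estimator.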

2.7 Detecting selection at microevolutionary level

2.7.1 Sequence acquisition of human population data

Variation data of 1092 individuals from fourteen different modern human populations

(CHB:97; CHS:100; JPT:89; FIN:93; GBR:89; TSI:98; IBS:14; CEU:85; CLM:60;

MXL:66; PUR:55; ASW:61; LWK:97 and YRI:88) for the WDR62 and STIL protein

coding genes were obtained from the 1000 Genomes Project phase 1 data in variant call

format (VCF) (www.1000genomes.org) (Abecasis et al., 2012). Polymorphisms

among populations and allele frequencies were calculated manually from the VCF files.

The topology of the modern human population tree was depicted in accordance with

previously described data (McEvoy, Powell, Goddard, & Visscher, 2011). For

completeness, the HapMap and CEPH databases were scanned to gain insight

into the derived allele frequencies by using the SPSmart webserver (Amigo,

Salas, Phillips, & Carracedo, 2008). The vcf-consensus Perl script from VCFtools was used in Linux to obtain

the nucleotide sequences of the two MCPH genes (WDR62 and STIL) for the 1092 individuals

from their respective VCF files. The protein coding sequences

of WDR62 and STIL in the 1092 individuals were predicted by using the similarity based

programs FGENESH+ and FGENESH_C implemented in the desktop based MolQuest

Bioinformatics toolbox.
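Allele frequencies of the kind calculated here can be read directly from the genotype columns of a VCF record. The sketch below is an illustration under stated assumptions: it handles one tab-separated data line, expects a GT entry in the FORMAT column, lumps all non-reference alleles together, and uses a hypothetical function name. It is not the procedure that was actually followed in this study.

```python
def alt_allele_frequency(vcf_line):
    """Frequency of non-reference alleles among the called genotypes of one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")   # column 9 is the FORMAT column
    alt_count = total = 0
    for sample in fields[9:]:                     # genotype columns start at column 10
        genotype = sample.split(":")[gt_index]
        for allele in genotype.replace("|", "/").split("/"):
            if allele == ".":                     # missing call, skipped
                continue
            total += 1
            if allele != "0":                     # any non-reference allele
                alt_count += 1
    return alt_count / total if total else float("nan")
```

Restricting the sample columns to the individuals of one population before calling the function would give a population specific frequency.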

2.7.2 Frequency spectrum based method for natural selection

Positive selection causes an advantageous allele to be fixed rapidly within a population. To

examine whether the observed patterns of variability in the WDR62 and STIL coding

sequences within human populations are congruent with the neutral model, the classical

neutrality tests Tajima’s D and Fu and Li’s D, D*, F and F* were performed on the panel

of validated coding SNPs present in the 1092 individuals of the 1000 Genomes Project phase 1 (Fu

& Li, 1993; Tajima, 1989). These classical neutrality tests were executed using

DnaSP 5.10 (Librado & Rozas, 2009).
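Of the tests listed above, Tajima's D is compact enough to sketch. The function below follows the standard formula of Tajima (1989), which contrasts the mean number of pairwise differences (π) with the number of segregating sites scaled by a1; it assumes aligned, ungapped sequences of equal length and is meant as an illustration, not as a replacement for DnaSP. The function name is hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (at least four)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    seg_sites = sum(1 for col in zip(*sequences) if len(set(col)) > 1)
    if seg_sites == 0:
        return float("nan")  # D is undefined without segregating sites
    # Mean number of pairwise differences (pi).
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)
```

Negative values reflect an excess of rare variants and positive values an excess of intermediate-frequency variants relative to neutral expectations.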

2.8 Molecular evolution of synuclein genes

2.8.1 Sequence and structure analysis of synuclein genes

The Single Likelihood Ancestor Counting (SLAC) method implemented in HyPhy was

used to detect non-neutral evolution acting on each codon in the vertebrate alignment of α

synuclein, by reconstructing ancestral sequences and counting substitutions (Goldman & Yang, 1994). The

evolutionary changes that occurred during vertebrate evolution at sites with Ka/Ks < 1 were

categorized as neutral or radical changes according to their physicochemical

properties (Betts & Russell, 2003; Grantham, 1974).

Domains and motifs were assigned to human α synuclein as described previously

(Du et al., 2003; Uversky & Fink, 2002). ClustalW2 based multiple sequence

alignments were used to map the putative positions of these domains and motifs in the

putative paralogs of the human α synuclein protein (Thompson, Higgins, & Gibson,

1994).

The NMR structure of human α synuclein (1XQ8) was obtained from the Protein Data Bank

(PDB) (Ulmer, Bax, Cole, & Nussbaum, 2005). The structures of β and γ synuclein,

together with the ancestral protein structures of α synuclein from sarcopterygians to

placental and nonprimate placental mammals, were modelled with Modeller (Webb &

Sali, 2014). The quality of the modelled structures was assessed by Ramachandran

plot (Sheik, Sundararajan, Hussain, & Sekar, 2002). Superimposition of the modelled

structures on 1XQ8 was carried out with Chimera, and root mean square deviation

(RMSD) values were calculated (Pettersen et al., 2004). In order to inspect the

structural deviations caused by the human specific mutations involved in Parkinson’s

disease, i.e. A30P, E46K, H50Q, G51D and A53T, mutant models were also generated with

Modeller (Webb & Sali, 2014).
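The RMSD reported after superimposition is a simple quantity once the two structures share a coordinate frame. The sketch below assumes that the structures have already been superimposed and that the atoms (for example, Cα atoms) are paired one to one in the same order; it does not perform the superimposition that Chimera carries out, and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

An RMSD of zero means identical atom positions; larger values indicate greater structural deviation between the model and the reference structure.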

2.8.2 Estimation of functional divergence among synuclein genes

Gene duplications play a pivotal role in the functional diversity of proteins, which

is ultimately responsible for the adaptive evolution of a specific phenotype or trait.

Functional divergence among synuclein paralogs was detected by using clade model D

implemented in the CodeML program of the PAML4.7 software (Bielawski & Yang, 2004; Z.

Yang, 2007). Clade model D assumes that a proportion of sites has evolved under

divergent selective pressure, but not necessarily under positive selection, between the

paralogs (Bielawski & Yang, 2004). The significance of functional divergence among

paralogs was checked by conducting a likelihood ratio test (LRT) of the null codon substitution

site model M3 (which assumes variation in selective pressure among sites but not across

branches or paralogs) against the alternative clade model D (Bielawski

& Yang, 2004; Z. Yang, et al., 2000). Functional divergence among synuclein paralogs

was also examined by using the DIVERGE (DetectIng Variability in Evolutionary Rates

among Genes) software, which first detects site-specific evolutionary rate shifts among

paralogs and then predicts the amino acid residues responsible for functional

divergence based on posterior probability (Gu & Vander Velden, 2002).

2.8.3 Identification of coevolutionary relationship among residues within gene

The Mutual Information Server To Infer Coevolution (MISTIC) web server was used to

infer coevolutionary relationships among amino acid residues within the protein for all

synuclein paralogs (Simonetti, Teppa, Chernomoretz, Nielsen, & Marino Buslje,

2013). MISTIC identifies coevolutionary relationships between residues based on the mutual

information (MI), proximity mutual information (pMI) and cumulative mutual

information (cMI) scores for individual residues (Buslje, Santos, Delfino, & Nielsen,

2009; Buslje, Teppa, Di Doménico, Delfino, & Nielsen, 2010). The coevolutionary

relationships among the residues within each protein were visualized using Circos. The

vertebrate orthologous multiple sequence alignments of the synuclein proteins were

submitted as input to MISTIC for the identification of coevolutionary relationships

between residues.
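The basic quantity behind these scores, the mutual information between two alignment columns, can be sketched as follows. This is plain MI in bits, computed from observed frequencies with gapped pairs ignored; it omits the corrections and the derived pMI and cMI scores that MISTIC applies, and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(column_i, column_j):
    """Mutual information (in bits) between two alignment columns of equal length."""
    pairs = [(a, b) for a, b in zip(column_i, column_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    marginal_i = Counter(a for a, _ in pairs)
    marginal_j = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((marginal_i[a] / n) * (marginal_j[b] / n)))
    return mi
```

Two columns whose residues change together across species give a high value, whereas columns that vary independently give a value near zero.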

Chapter 3 Results

Results

3.1 Identification of candidate genes

Concentrating on specific candidate loci makes extensive

phylogenetic and evolutionary analysis feasible. Candidate genes were drawn from an extensive

literature survey, focusing only on genes that meet two conditions: first,

the gene is involved in early brain development, and second, impairment of its coding

sequence causes brain associated anomalies. Primary microcephaly

genes are excellent candidates for understanding the evolutionary expansion

of brain size, as primary microcephaly patients have a reduced brain size similar to that of

early hominids (Woods, et al., 2005). In humans, primary microcephaly is inherited in a

recessive mode. The initially identified genes, ASPM and MCPH1, have been shown to

exhibit accelerated evolution in the lineage leading to humans (Evans et al., 2004;

Wang & Su, 2004). In this study, ten newly identified genes, WDR62, STIL, CEP135,

ZNF335, PHC1, CDK6, SASS6, MFSD2A, CIT and KIF14, are considered as candidate

genes for evolutionary analysis (Awad, et al., 2013; Basit, et al., 2016; Muhammad

Sajid Hussain, et al., 2012; Muhammad S Hussain, et al., 2013; Khan, et al., 2014;

Kumar, et al., 2009; H. Li, et al., 2016; Moawia, et al., 2017; Adeline K Nicholas, et

al., 2010; Shaheen, et al., 2016).

3.2 WD repeat domain 62 (WDR62)

3.2.1 Evolutionary history of MCPH2 gene WDR62

A phylogenetic tree for WDR62 and its putative paralogs was reconstructed using the

neighbor joining (NJ) method in order to identify the origin of, and evolutionary

relationships between, WDR62 and its paralogs (Figure 3.1). The NJ tree was

reconstructed with amino acid sequences from representative members of

Porifera, Arthropoda, Hemichordata, Cephalochordata, and Vertebrata. The phylogenetic tree

revealed that two duplication events were responsible for the expansion of this family

(Figure 3.1). The first duplication event occurred during early metazoan history,

before the bilaterian-nonbilaterian divergence, and produced the most ancient member of this

family, the WDR16 gene, and the WDR62/MAPKBP1 ancestral gene (Figure 3.1). The second

duplication event separated WDR62 and MAPKBP1 and occurred prior to the

actinopterygii-sarcopterygii split and after the divergence of vertebrates from

cephalochordates, with 100% bootstrap support (Figure 3.1). From the tree topology

it appears that the subfamily encompassing the WDR62 and MAPKBP1 genes is very

distantly related to the most ancient paralog of this family, WDR16 (Figure 3.1). The

phylogeny confirms the presence of human WDR62 orthologs in all five main

classes of vertebrates, i.e., teleostei, amphibia, reptilia, aves, and mammalia (Figure

3.1).

Figure 3.1: Evolutionary history of MCPH2 gene WDR62.

The evolutionary history of human WDR62 and its putative paralogs was inferred through the NJ method

based on evolutionary distances computed by the uncorrected p distance method. Fifty seven protein

sequences were used in this analysis. All positions containing gaps and missing data were removed

prior to phylogenetic tree reconstruction. The numbers on the internal branches represent bootstrap

scores. Only bootstrap scores greater than or equal to 50% are displayed on the nodes. The scale bar

depicts the number of amino acid substitutions per site.

3.2.2 Molecular evolution of WDR62 in mammals

In order to identify lineage specific Ka/Ks ratios, a phylogenetic tree was

reconstructed with WDR62 orthologous coding sequences of representative primate

and non-primate mammalian species (Figure 3.2). The ratio of nonsynonymous

to synonymous substitutions was determined for every external and internal branch of the

phylogenetic tree. This revealed that non-synonymous substitutions outnumber

synonymous substitutions only in the human terminal branch (Ka/Ks = 1.31). In contrast,

in all other internal and terminal branches the synonymous

substitutions outnumber the non-synonymous substitutions, which is suggestive of

purifying selection (Figure 3.2). From this analysis it appears that within mammals, the

evolution of WDR62 is accelerated particularly in the human terminal branch after its

divergence from the Pan lineage.

Figure 3.2: Estimation of WDR62 sequence evolution in therians.

The Ka/Ks ratio for every internal and external branch of the therian phylogeny was estimated and
is shown above each branch. The Ka/Ks score of the human terminal branch is highlighted in bold.

3.2.3 Human polymorphisms and signatures of selection

The molecular evolutionary rate analysis within mammals revealed variation in the rate of WDR62
sequence evolution between the recently diverged Homo sapiens and Pan troglodytes. Human WDR62
is evolving at a slightly higher rate (Ka/Ks = 1.31) than chimpanzee WDR62 (Ka/Ks = 0.844),
which is inconsistent with neutral evolution. To investigate whether WDR62 sequence variation
within humans is congruent with the neutral theory, genetic diversity was estimated from human
polymorphism data for the WDR62 gene obtained from dbSNP build 137 (Sherry et al., 2001).
Several classical neutrality tests, i.e., Fu and Li's D, D* (without outgroup), F, F* (without
outgroup) and Tajima's D, were applied to coding sequence polymorphisms only, in order to detect
departures from neutrality (Fu & Li, 1993; Tajima, 1989). The nucleotide polymorphism θw
(0.00129 per site) differs from the heterozygosity π (0.00043 per site), indicating that
neutrality is rejected. Nucleotide diversity within the WDR62 coding sequence (0.00043) is lower
than the nucleotide diversity of the human genome (0.000751) and of chromosome 19 (0.000764)
(Sachidanandam et al., 2001). All of the above-mentioned neutrality tests yield significantly
negative values (Tajima's D = -2.51, P < 0.001; Fu and Li's D = -3.95, P < 0.02; Fu and Li's
D* = -4.0, P < 0.02; Fu and Li's F = -4.21, P < 0.02; Fu and Li's F* = -4.14, P < 0.02) and
reject the neutral model. The difference between heterozygosity and nucleotide polymorphism,
together with the significantly negative test values, is likely a consequence of demographic
history and natural selection.
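The relationship between θw, π and the sign of Tajima's D can be made concrete with a short Python sketch of the standard Tajima (1989) formula. The sample size and number of segregating sites for WDR62 are not reported in this section, so the input values below are hypothetical; the function names are likewise illustrative. The point is only that D becomes negative whenever the mean number of pairwise differences falls below the Watterson estimate, as is the case for π (0.00043) and θw (0.00129) above.

```python
import math


def harmonic_terms(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1 .. n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    return a1, a2


def watterson_theta(segregating_sites, n):
    """Watterson's estimator for the whole region (not per site)."""
    a1, _ = harmonic_terms(n)
    return segregating_sites / a1


def tajimas_d(n, segregating_sites, mean_pairwise_diff):
    """Tajima's D from sample size, segregating sites and mean pairwise differences."""
    if n < 4 or segregating_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1, a2 = harmonic_terms(n)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    s = segregating_sites
    variance = e1 * s + e2 * s * (s - 1)
    return (mean_pairwise_diff - s / a1) / math.sqrt(variance)


# Hypothetical inputs: 10 sequences, 16 segregating sites,
# mean pairwise difference well below the Watterson estimate.
theta_w = watterson_theta(16, 10)
d_value = tajimas_d(n=10, segregating_sites=16, mean_pairwise_diff=3.888889)
```

Significance (the P values quoted above) would additionally require a null distribution, for example from coalescent simulations, which this sketch does not attempt.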

3.2.4 SWAKK analysis of WDR62

SWAKK (sliding window analysis of Ka/Ks) was applied to the human and chimpanzee WDR62
orthologous sequences to pinpoint regions of the WDR62 protein that have accelerated in recent
human history, after the divergence from chimpanzee. Accelerated regions might have implications
for the functional modification of human WDR62.

The SWAKK graph identified six regions (R1-R6) in human WDR62 where Ka/Ks exceeds one, congruent
with positive selection (Figure 3.3). The remaining portion of human WDR62 is under strong
purifying selection (Figure 3.3). The non-silent replacements within the six regions R1-R6 are
classified according to their physicochemical properties and impact on WDR62 structure
(Table 3.1). ML ancestral sequence reconstruction indicates that, since the divergence from the
hominini ancestor, eight and nine replacements have accumulated in the human and chimpanzee
WDR62 proteins, respectively. Comparison of these replacements with the ancestral sequence
revealed that 5/8 (62%) of the substitutions in human and 5/9 (55%) in chimpanzee likely have
some implications for the structural and functional modification of the protein (Table 3.1).
Chimpanzee contains one neutral and one radical substitution in R1 and R2, respectively, within
the uncharacterized portion of the protein. Human contains one neutral and two radical
alterations in R3 and R4, within the uncharacterized portion of WDR62. One neutral and one
radical substitution were observed in R5 of chimpanzee WDR62, within an uncharacterized region
and the MKK7β1 binding domain (MB). Notably, R6 has accumulated more evolutionary changes than
the rest of the protein. Two neutral and three radical substitutions were found in R6 in both
human and chimpanzee, within the proline-rich domain and the loop-helix domain (Figure 3.3 and
Table 3.1) (Pervaiz & Abbasi, 2016).

Figure 3.3: SWAKK plot of human and chimpanzee WDR62.

The SWAKK plot displays six regions, R1 to R6 (above the dotted line), where the rate of sequence
evolution is accelerated, indicating more non-silent substitutions than expected under
neutrality, i.e., Ka-Ks > 0. The dotted line depicts Ka-Ks = 0.
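The logic of reading such a plot, namely scanning windows along the alignment and flagging those where the non-silent rate exceeds the silent rate, can be sketched in Python as follows. This is a deliberately simplified illustration and not the SWAKK implementation: it assumes that per-codon counts of non-synonymous and synonymous differences and sites are already available (all values below are hypothetical), and it uses uncorrected proportions of differences as stand-ins for Ka and Ks, without any multiple-hit correction.

```python
def sliding_ka_minus_ks(nonsyn_diffs, syn_diffs, nonsyn_sites, syn_sites,
                        window, step=1):
    """Return (window_start, pN - pS) for each window along a codon alignment.

    pN and pS are uncorrected proportions of non-synonymous and synonymous
    differences, used here as simple proxies for Ka and Ks.
    """
    n_codons = len(nonsyn_diffs)
    if not (len(syn_diffs) == len(nonsyn_sites) == len(syn_sites) == n_codons):
        raise ValueError("all per-codon lists must have the same length")
    if window < 1 or window > n_codons or step < 1:
        raise ValueError("invalid window or step size")
    results = []
    for start in range(0, n_codons - window + 1, step):
        end = start + window
        n_total = sum(nonsyn_sites[start:end])
        s_total = sum(syn_sites[start:end])
        p_n = sum(nonsyn_diffs[start:end]) / n_total if n_total else 0.0
        p_s = sum(syn_diffs[start:end]) / s_total if s_total else 0.0
        results.append((start, p_n - p_s))
    return results


def accelerated_windows(results):
    """Window start positions where the difference lies above the zero line."""
    return [start for start, diff in results if diff > 0]


# Hypothetical per-codon data for a six-codon alignment.
profile = sliding_ka_minus_ks(
    nonsyn_diffs=[0, 1, 1, 0, 0, 0],
    syn_diffs=[0, 0, 0, 0, 1, 0],
    nonsyn_sites=[2.5] * 6,
    syn_sites=[0.5] * 6,
    window=3,
)
accelerated = accelerated_windows(profile)
```

Windows returned by `accelerated_windows` correspond to the portions of the curve above the dotted line; the remaining windows lie on or below it.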

Table 3.1: Amino acid substitutions in the human and chimpanzee lineages since the divergence from the
hominini ancestor.

Region (Ka/Ks > 1)  Position  Hominini residue  Substitution in chimpanzee  Substitution in human  Neutral/Radical
Region 1            81        G                 S                           -                      Neutral (0)
Region 2            393       R                 G                           -                      Radical (-2)
Region 3            790       R                 -                           H                      Neutral (0)
                    850       S                 -                           L                      Radical (-2)
Region 4            1091      Y                 -                           H                      Radical (2)
Region 5            1169      R                 H                           -                      Neutral (0)
                    1273      T                 P                           -                      Radical (-1)
Region 6            1304      V                 -                           A                      Neutral (0)
                    1310      L                 -                           Q                      Radical (-2)
                    1336      A                 T                           -                      Neutral (0)
                    1345      R                 H                           -                      Neutral (0)
                    1369      G                 -                           R                      Radical (-2)
                    1372      V                 -                           I                      Radical (3)
                    1390      F                 -                           L                      Neutral (0)
                    1408      P                 S                           -                      Radical (-1)
                    1458      R                 Q                           -                      Radical (1)
                    1489      S                 T                           -                      Radical (1)

Putative ancestral residues were reconstructed using the maximum likelihood method. The
physicochemical impact of each amino acid replacement, with its log-odds score in brackets, is
given in the last column. Positive scores depict a preferred replacement, negative scores an
unpreferred replacement, and zero a neutral replacement.

3.2.5 Comparative analysis of WDR62 with archaic humans and modern human populations

Comparative protein sequence analysis of human WDR62 with various non-human primates revealed
eight human-specific amino acid replacements (Table 3.1). To determine how many of these
human-specific changes are shared with archaic humans (Neandertals and Denisovans) and how many
are specific to modern humans, we compared the human WDR62 protein sequence with those of the
two archaic humans. This analysis revealed that the extinct archaic humans, Neandertals and
Denisovans, share six amino acid replacements (R790H, S850L, Y1091H, V1304A, G1369R, and V1372I)
with anatomically modern humans (hominin-specific replacements). Two replacements, L1310Q and
F1390L, are specific to modern humans; at these sites, archaic humans carry the human-primate
ancestral alleles (Figure 3.4a).

Figure 3.4: Comparative analysis of WDR62 among human populations.

a) The tree shows the previously well-defined relationship between various modern human
populations and archaic humans, with chimpanzee as the outgroup (see Materials and Methods). The
tree illustrates six hominin-specific amino acid substitutions, of which five are fixed in modern
human populations while the remaining one is polymorphic. Two amino acid substitutions are unique
to modern humans and are not shared with archaic humans; both sites are polymorphic in modern
human populations. A comparative view of the modern human-specific and hominin-specific amino
acid substitutions in modern human populations, archaic humans and chimpanzee is shown to the
right of the tree. The human reference sequence (GRCh37) is color-coded in red. YRI: Yoruba in
Ibadan, Nigeria; LWK: Luhya in Webuye, Kenya; ASW: people with African ancestry in Southwest
United States; FIN: Finnish in Finland; GBR: British from England and Scotland, UK; CEU: Utah
residents with ancestry from northern and western Europe; TSI: Toscani in Italia; IBS: Iberian
population in Spain; CLM: Colombians in Medellin, Colombia; PUR: Puerto Ricans in Puerto Rico;
MXL: people with Mexican ancestry in Los Angeles; JPT: Japanese in Tokyo, Japan; CHS: Southern
Han Chinese, China; and CHB: Han Chinese in Beijing, China. The three polymorphic variants were
further investigated in 1000 Genomes Project, HapMap release 28 and CEPH Stanford HGDP data
using the SPSmart web server. b) The derived allele frequency of SNP rs2285745 (S850L) among
modern human populations in the above-mentioned human genome variation projects is relatively
high in Oceania and Asia and low in Africa. c) SNP rs2074435 (L1310Q) shows a high derived allele
frequency in European populations compared with Africans and Asians. American populations were
not genotyped for this SNP by the HapMap project, as depicted in the graph. d) SNP rs1008328
(F1390L) shows a high derived allele frequency in Africa and a low frequency in Asia.

Furthermore, to gain insight into the status of the six hominin-specific and two modern
human-specific amino acid replacements in modern human populations, we used population variation
data from the 1000 Genomes Project (Abecasis et al., 2012). These data show that, among the six
hominin-specific replacements, five (R790H, Y1091H, V1304A, G1369R and V1372I) are fixed in
modern human populations, whereas the remaining hominin-specific replacement (S850L) and the two
modern human-specific replacements (L1310Q and F1390L) are polymorphic (Figure 3.4a). These
three polymorphic sites were also examined in HapMap data (International HapMap et al., 2010)
and CEPH Stanford HGDP data (http://spsmart.cesga.es/). Combined analysis of the 1000 Genomes
Project, HapMap and CEPH Stanford HGDP data shows that one of the three polymorphic variants,
S850L (a human-specific site shared with archaic humans), is present at a relatively high
derived allele frequency in non-African populations, particularly Oceanian and Asian
populations, compared with African populations (Figure 3.4b). The other two polymorphic sites,
L1310Q and F1390L, both located in exon 30, show high derived allele frequencies in European and
African populations, respectively (Figure 3.4c and 3.4d).
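For readers unfamiliar with the quantity plotted in Figure 3.4b-d, the derived allele frequency can be sketched in Python as below. The allele counts and population labels are hypothetical and are not taken from the 1000 Genomes, HapMap or HGDP data; the ancestral allele is assumed to be the one inferred from the outgroup and ancestral sequence comparison.

```python
def derived_allele_frequency(allele_counts, ancestral_allele):
    """Fraction of sampled chromosomes that do not carry the ancestral allele."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no chromosomes sampled")
    derived = total - allele_counts.get(ancestral_allele, 0)
    return derived / total


# Hypothetical allele counts for one SNP in two populations;
# "C" is assumed to be the ancestral allele.
toy_counts = {
    "population_1": {"C": 150, "T": 50},
    "population_2": {"C": 40, "T": 160},
}
daf = {
    population: derived_allele_frequency(counts, ancestral_allele="C")
    for population, counts in toy_counts.items()
}
```

A site is fixed for the derived allele when this frequency equals 1 in every population, and polymorphic when it lies strictly between 0 and 1 in at least one of them.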

3.3 SCL/TAL1 interrupting locus (STIL)

3.3.1 Evolutionary history of STIL

No paralog of the human STIL gene was identified in any public database or by similarity search
tools. Orthologs of human STIL were identified in representative members of the phyla
vertebrata, hemichordata, annelida, mollusca, cnidaria, and porifera (Figure 3.5). Phylogenetic
analysis of the STIL gene revealed that it originated at the root of metazoa (Figure 3.5).
However, the putative orthologs of human STIL from Ciona intestinalis, Drosophila melanogaster,
and Caenorhabditis elegans lack evident sequence homology and share only two regions in the
carboxyl-terminal region. Moreover, the bidirectional BLAST hit strategy did not identify a true
ortholog of human STIL in these three organisms, because the genomes of these traditional model
invertebrates have experienced extensive changes in gene content and gene architecture and are
highly rearranged.

Figure 3.5: Phylogenetic analysis of STIL gene.

The evolutionary history of STIL was inferred by the NJ (neighbor joining) method. Evolutionary
distances were calculated by the uncorrected p-distance method. All positions containing gaps or
missing data were removed. The numbers on the nodes represent bootstrap values; only values
greater than or equal to 50 are shown.

3.3.2 Molecular Evolution of STIL in Mammals by Site Models

Maximum likelihood codon substitution site models were used to detect selective pressure on the
STIL locus in primates, nonprimate mammals and placental mammals. These site models assume
variable selective pressure among sites (amino acids) in the protein sequence and differ from
each other in the ω distribution and the number of free parameters. We estimated log likelihood
values for six codon substitution site models using the CodeML program implemented in PAML, in
order to compute likelihood ratio tests (LRTs). The LRT values were obtained by comparing null
models that do not allow ω to exceed 1 (M1, M7, and M8a) with alternative models that allow ω to
exceed 1 (M2 and M8). Positive selection was inferred only if at least two of the three model
comparisons (M1/M2, M7/M8, and M8a/M8) were significant. After gap removal, 3663 sites were
analyzed in primates. For primates, the LRT values were not significant in any of the
comparisons M1/M2, M7/M8, and M8a/M8, indicating no signature of positive selection (Table 3.2).
In nonprimate mammals, 3426 sites were analyzed after elimination of gaps. The LRT values were
significant for the M7 vs M8 and M8a vs M8 comparisons, indicating signatures of positive
selection in nonprimate mammals (Table 3.2). In contrast, positive selection was not detected by
the M2 selection model, probably because M2 is too conservative (Table 3.2). However, no
individual site was identified as positively selected with posterior probability >= 95% under
selection model M8 (Table 3.2). In placental mammals (combined data of primates and nonprimate
mammals), 3228 sites were analyzed. The two null models M7 and M8a were significantly rejected in
favor of the alternative model M8, which suggests positive selection in placental mammals
(Table 3.2). However, only one site (412P) was detected as positively selected, with posterior
probability 0.958 by both the NEB and BEB methods (Table 3.2).
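The likelihood ratio test behind these comparisons can be sketched in a few lines of Python. The log likelihood values are those reported for nonprimate mammals in Table 3.2; the degrees of freedom (2 for M1/M2 and M7/M8, 1 for M8a/M8) and the use of a plain chi-square reference distribution are assumptions of this sketch rather than settings stated in the text. Only the standard library is used, which is why the chi-square tail is written out explicitly for these two cases.

```python
import math


def lrt(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, chi-square p value) for df = 1 or df = 2."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        # Survival function of chi-square with 1 degree of freedom.
        p_value = math.erfc(math.sqrt(statistic / 2.0))
    elif df == 2:
        # Survival function of chi-square with 2 degrees of freedom.
        p_value = math.exp(-statistic / 2.0)
    else:
        raise ValueError("only df = 1 or df = 2 are handled in this sketch")
    return statistic, p_value


# Log likelihoods for nonprimate mammals, as reported in Table 3.2.
nonprimate_lnl = {
    "M1": -22630.38639,
    "M2": -22630.38639,
    "M7": -22606.97892,
    "M8": -22597.841266,
    "M8a": -22602.375609,
}

# (null model, alternative model, assumed degrees of freedom)
comparisons = {
    "M1/M2": ("M1", "M2", 2),
    "M7/M8": ("M7", "M8", 2),
    "M8a/M8": ("M8a", "M8", 1),
}

lrt_results = {
    name: lrt(nonprimate_lnl[null], nonprimate_lnl[alt], df)
    for name, (null, alt, df) in comparisons.items()
}
```

Under these assumptions the M7/M8 and M8a/M8 comparisons fall well below 0.05 while M1/M2 does not, which matches the pattern of significance described above for nonprimate mammals.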

Table 3.2: Parameter estimation and LRT for mammalian STIL.

Data group          Model                   Parameter estimates (ω)                                                lnL            P value    q value
Primates            M0: One ratio           ω = 0.42728                                                            -10963.86677
                    M1: Nearly neutral      ω0 = 0.116, f0 = 0.6164, ω1 = 1.0, f1 = 0.38358                        -10916.74197   0.92       1.0
                    M2: Positive selection  ω0 = 0.1186, f0 = 0.6189, ω1 = 1.0, f1 = 0.379, ω2 = 3.6, f2 = 0.0019  -10916.66704
                    M7                      p = 0.2246, q = 0.2764                                                 -10917.05514   0.55       0.51
                    M8                      p0 = 0.94839, p = 0.3461, q = 0.5367, (p1 = 0.0516) ω = 1.16628        -10916.45545
                    M8a                     p0 = 0.83646, p = 0.2541, q = 0.4899, (p1 = 0.1635) ω = 1.0            -10916.974231  0.31       0.39
Nonprimate mammals  M0: One ratio           ω = 0.34063                                                            -22972.35103
                    M1: Nearly neutral      ω0 = 0.1324, f0 = 0.675, ω1 = 1.0, f1 = 0.32                           -22630.38639   1.0        1.0
                    M2: Positive selection  ω0 = 0.1324, f0 = 0.68, ω1 = 1.0, f1 = 0.283, ω2 = 1.0, f2 = 0.042     -22630.38639
                    M7                      p = 0.42, q = 0.759                                                    -22606.97892   0.0001     0.0002
                    M8                      p0 = 0.975, p = 0.475, q = 0.945, (p1 = 0.0247) ω = 1.89               -22597.841266
                    M8a                     p0 = 0.8214, p = 0.602, q = 1.984, (p1 = 0.1786) ω = 1.0               -22602.375609  0.003      0.007
Mammals             M0: One ratio           ω = 0.363                                                              -27276.3467
                    M1: Nearly neutral      ω0 = 0.139, f0 = 0.646, ω1 = 1.0, f1 = 0.353                           -26831.009     0.99       1.0
                    M2: Positive selection  ω0 = 0.139, f0 = 0.646, ω1 = 1.0, f1 = 0.284, ω2 = 1.0, f2 = 0.0696    -26831.0100
                    M7                      p = 0.4307, q = 0.7177                                                 -26786.5586    1 × 10^-3  4 × 10^-3
                    M8                      p0 = 0.959, p = 0.496, q = 0.947, (p1 = 0.0402) ω = 1.613              -26775.2467
                                            (412P 0.958 NEB, 412P 0.958 BEB)
                    M8a                     p0 = 0.81227, p = 0.60291, q = 1.8573, (p1 = 0.18773) ω = 1.0          -26781.3313    0.0005     0.003

3.3.3 Episodic selection at various stages of primate evolution in STIL locus

Positive selection generally acts for short periods of evolutionary time and affects only a few
sites rather than the entire protein sequence. The branch-site codon substitution method was
used to identify episodic positive selection on individual codons at particular evolutionary
stages, from the ancestral primate branch to the human terminal branch (Table 3.3). The
significance of this test was determined by comparing the branch-site model with a null model
that is identical except that ω2 is fixed to 1 for the foreground branch. The branch-site method
predicted that 0.04% and 0.4% of the STIL coding sequence evolved at an accelerated rate in the
hominoidea ancestral branch and the human terminal branch, respectively (Table 3.3). However,
the LRT statistics were not significant for these branches (Table 3.3).

Table 3.3: Branch-site analysis of STIL.

Foreground branch  ω2    LRT       p value  q value  Positively selected sites
Human              3.13  0.1646    0.68     0.98     -
Hominini           1     0         1        0.98     -
Hominidae          1     0.0917    0.76     0.98     -
Hominoidea         4.72  0.2159    0.64     0.98     -
Catarrhini         1     0.0689    0.79     0.98     -
Simians            1     0         1        0.98     -
Haplorhini         1     0.00008   0.99     0.98     -
Primates           1     0.000002  1        0.98     -

3.3.4 Divergent selection pressure between clades of mammals for STIL locus

Divergence in protein function among clades can result in site-specific variation in selective
pressure among clades. Positive selection is not necessarily required for the functional
diversification of proteins among clades, which eventually contributes to adaptive phenotypic
diversity. Signatures of functional divergence among different partitions of the mammalian
phylogeny at the STIL locus were identified with the codon substitution clade model C (CmC). The
significance of functional diversification between partitions of the phylogeny was determined by
comparing CmC with the M2a_rel null model of Weadick and Chang. Parameter estimation revealed
that 34% of sites evolved under significantly divergent selective pressure between simians
(ω3 = 0.138) and nonsimian placental mammals (ω2 = 0.019) (Table 3.4). Parameter estimation also
suggested functional divergence between hominini and nonhominini placental mammals, but
correction of the p value by the false discovery rate (q value) exposed this as a false positive
result (Table 3.4).

Table 3.4: Divergent selection constraint parameter estimation and likelihood scores for STIL.

Model & partition  Parameter estimates (ω)                                                            lnL          P value  q value
CmC-Primate        p0 = 0.449, ω0 = 0.339, p1 = 0.223, ω1 = 1.0, p2 = 0.327, ωnp = 0.020, ωp = 0.041  -26778.0127  0.24     0.28
CmC-Simians        p0 = 0.437, ω0 = 0.347, p1 = 0.221, ω1 = 1.0, p2 = 0.342, ωns = 0.019, ωs = 0.138  -26771.3217  0.0001   0.002
CmC-catarrhini     p0 = 0.436, ω0 = 0.35, p1 = 0.22, ω1 = 1.0, p2 = 0.344, ωnc = 0.027, ωc = 0.112    -26777.1301  0.076    0.42
CmC-greatapes      p0 = 0.46, ω0 = 0.33, p1 = 0.23, ω1 = 1.0, p2 = 0.32, ωng = 0.021, ωg = 0.019      -26778.6965  0.93     0.80
CmC-hominini       p0 = 0.32, ω0 = 0.021, p1 = 0.23, ω1 = 1.0, p2 = 0.46, ωnh = 0.33, ωh = 0.77       -26776.4205  0.03     0.30
M2a_rel            p0 = 0.46, ω0 = 0.332, p1 = 0.225, ω1 = 1.0, p2 = 0.32, ω2 = 0.213                 -26778.7002  NA       NA

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c:
catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.

3.3.5 Human polymorphisms and signatures of selection

Molecular evolutionary rate analysis indicated that human STIL is evolving at a higher rate
(ω = 3.13) than its orthologous copy in the ancestral hominini branch, although the difference is
not significant (Table 3.3). To investigate whether sequence variation in STIL within human
populations is congruent with neutrality, genetic diversity was estimated from 1000 Genomes
phase 1 data for 1092 humans from diverse ethnic groups. Several classical neutrality tests,
i.e., Fu and Li's D, D* (without outgroup), F, F* (without outgroup) and Tajima's D (Fu & Li,
1993; Tajima, 1989), were applied to polymorphisms located in the coding sequence of these 1092
individuals (Table 3.5). The heterozygosity π (nucleotide diversity) of the STIL coding sequence
in humans is 0.00049 per site, which is lower than the nucleotide diversity of the whole human
genome (0.000751) and of chromosome 1 (0.000772) (Sachidanandam et al., 2001). All of the
aforementioned neutrality tests yield significantly negative values (Table 3.5). The low
heterozygosity and the negative values of the neutrality tests significantly reject the
neutrality hypothesis and might indicate natural selection or population expansion.

Table 3.5: Tests for departure from neutrality using population variation data (1000 Genomes).

Test                             Statistic   P value
With chimpanzee as an outgroup
Tajima's D                       -2.75469    <0.001
Fu and Li's D                    -5.74963    <0.02
Fu and Li's F                    -5.59888    <0.02
Without an outgroup
Fu and Li's D*                   -6.15193    <0.02
Fu and Li's F*                   -4.98206    <0.02

P indicates the 'probability value' that demonstrates the departure from the null hypothesis (neutrality).

3.3.6 SWAKK analysis of STIL

SWAKK (sliding window analysis of Ka/Ks) was applied to the human and chimpanzee coding
sequences to pinpoint regions of the human STIL protein that have accelerated in recent history,
after the divergence from chimpanzee. The SWAKK graph indicated nine regions (R1-R9) where the
Ka-Ks difference exceeds zero, congruent with the pattern of positive selection (Figure 3.6). It
further indicated that the rest of the protein is under strong selective constraint
(Figure 3.6). The non-silent substitutions in R1 to R9 are classified according to their
predicted physicochemical properties and impact on the structure of the STIL protein
(Table 3.6). ML ancestral sequence reconstruction indicates that six substitutions have been
fixed in the chimpanzee STIL protein and fourteen in the human protein since the divergence from
the hominini ancestor (Table 3.6). Comparative analysis with the inferred hominini ancestor
revealed that 10/14 substitutions in human and 6/6 substitutions in chimpanzee might be
implicated in the structural and functional modification of the STIL protein (Table 3.6).
Regions R1 to R4 experienced two neutral and five radical changes in human (Table 3.6). R5 and
R7 each contain two radical substitutions across the chimpanzee and human proteins (Table 3.6).
Notably, R6 contains more substitutions than any other region: two radical changes in
chimpanzee, and one neutral and two radical substitutions in human (Figure 3.6 and Table 3.6).
Human contains one radical and one neutral change in R8 and R9, respectively, whereas chimpanzee
contains two radical substitutions in R9 (Table 3.6). Thus, this investigation not only
determined the amino acid substitutions that have occurred independently in the chimpanzee and
human STIL proteins since the divergence from the hominini ancestor approximately six to seven
million years ago, but also discriminated between the substitutions that might have little or no
impact on the structure/function of STIL and those that have probably been involved in modifying
it during the course of chimpanzee and human evolution.

Figure 3.6: Sliding window analysis of STIL.

The SWAKK graph displays nine regions, R1-R9, where non-silent substitutions have occurred at a
higher rate than expected under neutrality, i.e., Ka-Ks > 0. The dotted line depicts neutrality
(Ka-Ks = 0). Regions below the dotted line indicate purifying selection (Ka-Ks < 0).

Table 3.6: Human and chimpanzee specific substitutions in STIL after the divergence from the
hominini ancestor.

Ka-Ks > 0    Position    Hominini residue    Substitution in Chimpanzee    Substitution in Human    Neutral/Radical

Region-1

86 V A Neutral (0)

Region-2

268 I T Radical (-1)

289 R Q Neutral (0)

Region-3

385 P S Radical (-1)

Region-4

511 Y C Radical (-2)

522 V I Radical (3)

594 P L Radical (-3)

Region-5

616 P L Radical (-3)

672 D G Radical (-1)

Region-6

750 T M Radical (-1)

751 P T Radical (-1)

769 M T Radical (-1)

787 S G Neutral (0)

813 L M Radical (2)

Region-7

917 G E Radical (-2)

980 T R Radical (-1)

Region-8

1152 H Y Radical (2)

Region-9

1250 I V Radical (3)

1251 A T Neutral (0)

1262 T M Radical (-1)

Putative ancestral residues were reconstructed using the maximum likelihood method. The
physicochemical impact of each amino acid replacement, with its log-odds score in brackets, is
given in the last column. Positive scores depict a preferred replacement, negative scores an
unpreferred replacement, and zero a neutral replacement.

3.3.7 Comparative analysis of STIL with archaic humans and modern human populations

Comparative protein sequence analysis of human STIL with various non-human placental mammals
revealed fourteen human-specific amino acid substitutions (Table 3.6). The analysis was then
extended by comparing the human STIL protein sequence with those of two archaic humans
(Neandertals and Denisovans), in order to determine how many of the human-specific amino acid
changes are shared with archaic humans and how many are specific to anatomically modern humans.
This analysis revealed that the extinct archaic humans, Neandertals and Denisovans, share
thirteen amino acid changes (I268T, R289Q, P385S, Y511C, V522I, P594L, D672G, T750M, S787G and
L813M) with anatomically modern humans (hominin-specific replacements). Only one substitution,
V86A, is specific to anatomically modern humans; at this site, archaic humans carry the hominini
ancestral allele.

Furthermore, to gain insight into the status of the thirteen hominin-specific and one modern
human-specific amino acid substitutions in modern human populations, we used population
variation data from the 1000 Genomes Project phase 1 (Abecasis et al., 2012). These data show
that all hominin-specific amino acid replacements are fixed in modern human populations, whereas
the one modern human-specific replacement (V86A) is polymorphic (rs3125630).

3.4 Centrosomal Protein 135 (CEP135)

3.4.1 Evolutionary history of CEP135

The evolutionary history of CEP135 and its putative paralog TSGA10 (testis specific 10) was
examined by the distance-based neighbor joining (NJ) method (Figure 3.7). Phylogenetic analysis
revealed that the human CEP135 paralogs originated from a single duplication event. The gene
duplication that split CEP135 and TSGA10 occurred at least prior to the separation of
chondrichthyes (cartilaginous fish) from osteichthyes (bony vertebrates) and after the
vertebrate-invertebrate split (Figure 3.7). Furthermore, the phylogeny revealed that the putative
CEP135/TSGA10 ortholog originated during the earliest metazoan history (Figure 3.7). The
bidirectional BLAST hit strategy failed to identify any putative orthologs of CEP135 and TSGA10
in the phyla cephalochordata, nematoda, arthropoda, cnidaria and placozoa.

Figure 3.7: Evolutionary history of MCPH8 gene CEP135.

The phylogenetic tree of CEP135 and its putative paralog was inferred using the NJ method,
applying the JTT amino acid substitution model to compute evolutionary distances. All positions
containing gaps or missing data were removed prior to tree reconstruction. Thirty-eight amino
acid sequences were used in this analysis. The numbers on the nodes represent bootstrap values
estimated from 500 pseudoreplicates; only bootstrap scores greater than or equal to 50 are
displayed on the nodes.

3.4.2 Estimation of pervasive signals of positive selection in CEP135 in placental mammals

Three pairs of codon substitution site models (M1 & M2, M7 & M8, and M8a & M8) were fitted to
examine whether positive selection has operated on primates (18 species), nonprimate placental
mammals (24 species) and all placental mammals (42 species) (Table 3.7). These site models
assume that the selective pressure ω varies among the amino acid sites of a protein-coding gene
across all species of the phylogeny. Signals of positive selection are considered reliable only
if at least two of the three null models (M1, M7, and M8a) are rejected in favor of the more
complex alternative models (M2 and M8). The one-ratio site model (M0) was also fitted to all
three datasets (primates, nonprimate placental mammals and placental mammals) and revealed that
purifying selection dominated the evolution of CEP135 throughout the eutherians, with ω values
ranging from 0.18 to 0.217 (Table 3.7). Parameter estimates and p values indicated signals of
positive selection in primates and placental mammals, with ω values of 1.5 and 1.14,
respectively, but only for one model pair, M7 & M8 (Table 3.7). The naïve empirical Bayes (NEB)
method implemented in the M8 site model pinpointed three and twenty-two positively selected
sites in primates and placental mammals, respectively (Table 3.7). However, the LRT of the
M7 & M8 pair is less accurate and yields more false positives than the M8a & M8 and M1 & M2
pairs, which is why the stringent criterion for positive selection described above was necessary
for this study. Overall, the results of the site models suggest that no significant signals of
positive selection are present across eutherian evolution for the CEP135 protein-coding gene
(Table 3.7).
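The "two out of three" decision rule used throughout this chapter can be written down as a small Python sketch. The p values are those reported for CEP135 in primates (Table 3.7) and, for contrast, for STIL in nonprimate mammals (Table 3.2); the significance threshold of 0.05 and the function name are assumptions made for illustration.

```python
def positive_selection_supported(p_values, alpha=0.05, min_significant=2):
    """Apply the stringent criterion: at least `min_significant` of the
    model-pair comparisons must be significant at level `alpha`.

    Returns (decision, list of significant comparisons).
    """
    significant = sorted(name for name, p in p_values.items() if p < alpha)
    return len(significant) >= min_significant, significant


# P values for the three site-model comparisons, as reported in the tables.
cep135_primates = {"M1/M2": 0.58, "M7/M8": 0.02, "M8a/M8": 0.14}
stil_nonprimates = {"M1/M2": 1.0, "M7/M8": 0.0001, "M8a/M8": 0.003}

cep135_call = positive_selection_supported(cep135_primates)
stil_call = positive_selection_supported(stil_nonprimates)
```

With these inputs only the M7/M8 comparison is significant for CEP135 in primates, so the criterion is not met, whereas two comparisons are significant for STIL in nonprimate mammals.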

3.4.3 Signature of positive selection by branch-site model

A branch-site model using the codon-based maximum likelihood method was applied to test for
episodic positive selection acting at specific stages of evolution, from the primate ancestral
branch to the modern human lineage (Table 3.8). The branch-site model allows the selective
pressure ω to vary both across branches and among the sites of the prespecified lineage. The
significance of episodic selection was determined by an LRT of the null model (identical to the
branch-site model except that ω2 = 1) against the alternative branch-site model. Parameter
estimates showed an accelerated rate of sequence evolution in the primate ancestral branch
(ω2 = 2.40), which suggests positive selection, but the LRT values indicated no significant
signal of episodic selection on any analyzed branch of the CEP135 phylogeny (Table 3.8).

Table 3.7: Selective pressure estimation and LRT for Mammals CEP135.

Data group | Model | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value
Primates | M0: One ratio | ω = 0.21740 | -10024.701547 | - | -
Primates | M1: Nearly neutral | ω0 = 0.07502, f0 = 0.81152, ω1 = 1.0, f1 = 0.18848 | -9931.080961 | 0.58 | 1.0
Primates | M2: Positive selection | ω0 = 0.08085, f0 = 0.82218, ω1 = 1.0, f1 = 0.1612, ω2 = 1.9284, f2 = 0.01665 | -9930.537778 | - | -
Primates | M7 | p = 0.17565, q = 0.54351 | -9934.038071 | 0.02 | 0.03
Primates | M8 | p0 = 0.91865, p = 0.43353, q = 2.41422, (p1 = 0.08135) ω = 1.50001 (10A, 598S, 1139M NEB) | -9929.996489 | - | -
Primates | M8a | p0 = 0.8167, p = 2.1971, q = 24.586, (p1 = 0.1833) ω = 1.0 | -9931.075444 | 0.14 | 0.20
Nonprimate mammals | M0: One ratio | ω = 0.184 | -21043.22926 | - | -
Nonprimate mammals | M1: Nearly neutral | ω0 = 0.068, f0 = 0.7883, ω1 = 1.0, f1 = 0.212 | -20524.39690 | 1.0 | 1.0
Nonprimate mammals | M2: Positive selection | ω0 = 0.0681, f0 = 0.7883, ω1 = 1.0, f1 = 0.1583, ω2 = 1.0, f2 = 0.0534 | -20524.39690 | - | -
Nonprimate mammals | M7 | p = 0.2456, q = 0.8699 | -20506.62095 | 3 × 10^-4 | 1 × 10^-3
Nonprimate mammals | M8 | p0 = 0.8674, p = 0.432, q = 3.332, (p1 = 0.1326) ω = 1.0 | -20494.089869 | - | -
Nonprimate mammals | M8a | p0 = 0.8562, p = 0.4137, q = 3.1877, (p1 = 0.14379) ω = 1.0 | -20494.927683 | 0.20 | 0.24
Mammals | M0: One ratio | ω = 0.19050 | -26901.5283 | - | -
Mammals | M1: Nearly neutral | ω0 = 0.071, f0 = 0.776, ω1 = 1.0, f1 = 0.224 | -26156.4468 | 1.0 | 1.0
Mammals | M2: Positive selection | ω0 = 0.071, f0 = 0.776, ω1 = 1.0, f1 = 0.168, ω2 = 1.0, f2 = 0.055 | -26156.4468 | - | -
Mammals | M7 | p = 0.253, q = 0.859 | -26100.3429 | 2 × 10^-9 | 4 × 10^-8
Mammals | M8 | p0 = 0.899, p = 0.385, q = 2.37035, (p1 = 0.101) ω = 1.1404 (54S, 213Q, 221Q, 245L, 406S, 409L, 482P, 483P, 508R, 546S, 557S, 597N, 599V, 756L, 769V, 776L, 784T, 800S, 997S, 1004V, 1093N, 1130V NEB) | -26080.1784 | - | -
Mammals | M8a | p0 = 0.8641, p = 0.39485, q = 2.78179, (p1 = 0.13588) ω = 1.0 | -26081.370241 | 0.12 | 0.19


Table 3.8: Branch-site analysis of CEP135.

Branch | ω2 | LRT | P value | q value | Positive selected sites
Human | 1 | 0.0003 | 0.98 | 0.98 | -
Hominini | 1 | 0 | 1 | 0.98 | -
Homininae | 1 | 0.0001 | 0.99 | 0.98 | -
Hominidae | 1 | 0.00002 | 1 | 0.98 | -
Hominoidea | 1 | 0.00006 | 0.99 | 0.98 | -
Catarrhini | 1 | 0.000004 | 1 | 0.98 | -
Simians | 1 | 0.000002 | 1 | 0.98 | -
Haplorhini | 1 | 0 | 1 | 0.98 | -
Primates | 2.40 | 0.0488 | 0.83 | 0.98 | -

3.4.4 Divergent selective pressure across CEP135 mammalian phylogeny

Clade model C (CmC), fitted by the codon based maximum likelihood method, was used to estimate site-specific divergence in selective pressure between partitions of the CEP135 mammalian phylogeny (Table 3.9). The significance of divergent selective constraint between partitions was determined by LRTs of the M2a_rel null model against the CmC primates, simians, catarrhini, great apes and hominini partitions (Table 3.9). Parameter estimates and LRTs showed no signature of divergent selection in any of the analyzed partitions of the CEP135 mammalian phylogeny (Table 3.9). These observations suggest that the function of the CEP135 protein coding gene was conserved throughout eutherian evolution.
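
The comparison of each CmC partition with M2a_rel can be sketched from the log likelihood values of Table 3.9. This is an illustrative sketch only: it assumes that CmC adds a single free parameter over M2a_rel (one degree of freedom), and the function name lrt_p_df1 is hypothetical. With that assumption the p values agree with the tabulated ones to two decimals.

```python
import math

def lrt_p_df1(lnl_null, lnl_alt):
    """LRT statistic and chi-square p value, assuming one degree of freedom."""
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Log likelihoods copied from Table 3.9 (CEP135).
LNL_M2A_REL = -26081.8982
cmc_lnl = {
    "primates": -26080.0893,
    "simians": -26081.8471,
    "catarrhini": -26081.3823,
    "greatapes": -26081.8801,
    "hominini": -26081.1383,
}

cmc_p = {}
for partition, lnl_alt in cmc_lnl.items():
    stat, p = lrt_p_df1(LNL_M2A_REL, lnl_alt)
    cmc_p[partition] = p
    print(f"CmC-{partition}: 2*dlnL = {stat:.4f}, p = {p:.3f}")
```

The primate partition comes closest to the 0.05 level (p of about 0.057) but does not cross it, in line with the conclusion of no divergent selection.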

Table 3.9: Divergent selection constraint parameters estimation and likelihood scores for CEP135.

Model | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value
CmC-Primate | p0 = 0.560, ω0 = 0.0231, p1 = 0.142, ω1 = 1.0, p2 = 0.298, ωnp = 0.251, ωp = 0.315 | -26080.0893 | 0.057 | 0.34
CmC-Simians | p0 = 0.296, ω0 = 0.27, p1 = 0.141, ω1 = 1.0, p2 = 0.56, ωns = 0.023, ωs = 0.021 | -26081.8471 | 0.75 | 0.66
CmC-catarrhini | p0 = 0.293, ω0 = 0.271, p1 = 0.141, ω1 = 1.0, p2 = 0.57, ωnc = 0.023, ωc = 0.0397 | -26081.3823 | 0.31 | 0.58
CmC-greatapes | p0 = 0.295, ω0 = 0.269, p1 = 0.141, ω1 = 1.0, p2 = 0.564, ωng = 0.0235, ωg = 0.0184 | -26081.8801 | 0.85 | 0.66
CmC-hominini | p0 = 0.56, ω0 = 0.023, p1 = 0.141, ω1 = 1.0, p2 = 0.296, ωnh = 0.27, ωh = 0.59 | -26081.1383 | 0.22 | 0.57
M2a_rel | p0 = 0.295, ω0 = 0.269, p1 = 0.0141, ω1 = 1.0, p2 = 0.564, ω2 = 0.0235 | -26081.8982 | NA | NA

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.


3.5 Zinc finger protein 335 (ZNF335)

3.5.1 Evolutionary history of ZNF335

The evolutionary history of ZNF335 was explored by including orthologous protein sequences from representative species of mammalia, aves, reptilia, actinistia, osteichthyes and chondrichthyes and applying the distance based neighbor joining (NJ) method (Figure 3.8). The unconstrained phylogenetic tree revealed that the ZNF335 gene originated during the early history of vertebrates (Figure 3.8). The phylogeny also showed that the position of Callorhinchus milii, the representative member of chondrichthyes, was not consistent with the vertebrate phylogeny (Figure 3.8). This inconsistency might arise either because the Callorhinchus milii genome is the slowest evolving genome among all extant vertebrates or because osteichthyes genomes are highly derived (Venkatesh et al., 2014). A bidirectional similarity search strategy was unable to identify any putative ortholog of human ZNF335 among cephalochordata, hemichordata, echinodermata, arthropoda, nematoda, mollusca, cnidaria, placozoa and porifera. No paralog of human ZNF335 was identified by similarity search approaches or in any public database. The absence of an ortholog in invertebrates and of any paralog of human ZNF335 strengthens the inference of a vertebrate specific origin of ZNF335.

3.5.2 Molecular evolution of ZNF335 in mammals by site models

Codon substitution site models were used to detect selective pressure across primates (18 species), nonprimate mammals (27 species) and placental mammals (45 species). These site models allow the selective pressure (ω) to vary among sites but not among lineages. The one ratio model, which assumes a single average ω ratio for all sites in the protein, revealed a dominating role of purifying selection in the evolution of ZNF335 in primates, nonprimate mammals and all placental mammals (Table 3.10). To detect variable selective pressure and positive selection on individual codons, three pairs of site models (M1/M2, M7/M8 and M8a/M8) were used. For this study, positive selection was accepted only if at least two of these three pairs significantly rejected neutrality. After deleting gaps, 3993 sites were considered for site analysis in primates. The LRT statistic was significant only for the M7/M8 comparison, while the other two comparisons (M1/M2 and M8a/M8) did not support adaptive sequence evolution of ZNF335 in primates (Table 3.10). In nonprimate mammals, 3282 sites were examined. The M7/M8 and M8a/M8 comparisons showed evidence for positive selection across nonprimate mammals (Table 3.10). In total, only 2% of sites in nonprimate mammals were found to be positively selected, with a posterior probability greater than 95% (Table 3.10). In the combined data of primates and nonprimate placental mammals, 3795 sites were analyzed after removal of gaps. For mammals, the M7/M8 and M8a/M8 comparisons detected a significant signature of pervasive positive selection on a subset of sites (Table 3.10). Parameter estimates indicate that 2% of sites are under positive selection with an ω value of 1.4 (Table 3.10).

Figure 3.8: Phylogenetic tree of MCPH10 gene ZNF335 using NJ approach.

The evolutionary distances were computed by the JTT matrix based method, and branch lengths are drawn in the same units as the evolutionary distances used to infer the phylogenetic tree. Twenty nine orthologous protein sequences of human ZNF335 were used in this analysis. All positions containing gaps and missing data were eliminated. The numbers on the nodes represent bootstrap scores. Bootstrap values less than 50 are not shown.

Table 3.10: Parameter estimation and LRT for Mammals ZNF335.

Data group | Model | Parameter estimation (ω) | Log likelihood value | P value | q value
Primates | M0: One ratio | ω = 0.10153 | -12084.684867 | - | -
Primates | M1: Nearly neutral | ω0 = 0.054, f0 = 0.93, ω1 = 1.0, f1 = 0.066 | -12016.245554 | 1.0 | 1.0
Primates | M2: Positive selection | ω0 = 0.054, f0 = 0.93, ω1 = 1.0, f1 = 0.05392, ω2 = 1.0, f2 = 0.01249 | -12016.245554 | - | -
Primates | M7 | p = 0.17006, q = 1.28065 | -12021.052413 | 0.004 | 0.007
Primates | M8 | p0 = 0.97137, p = 0.41625, q = 4.69287, (p1 = 0.02863) ω = 1.45597 (62L, 974V BEB, 62L, 974V NEB) | -12015.542921 | - | -
Primates | M8a | p0 = 0.9415, p = 0.8569, q = 12.3542, (p1 = 0.0586) ω = 1.0 | -12015.951088 | 0.37 | 0.38
Nonprimate mammals | M0: One ratio | ω = 0.07781 | -26830.40982 | - | -
Nonprimate mammals | M1: Nearly neutral | ω0 = 0.045, f0 = 0.92, ω1 = 1.0, f1 = 0.084 | -26440.22043 | 1.0 | 1.0
Nonprimate mammals | M2: Positive selection | ω0 = 0.045, f0 = 0.92, ω1 = 1.0, f1 = 0.082, ω2 = 1.0, f2 = 0.0024 | -26440.22043 | - | -
Nonprimate mammals | M7 | p = 0.21, q = 1.8333 | -26314.22379 | 5 × 10^-4 | 1 × 10^-3
Nonprimate mammals | M8 | p0 = 0.97249, p = 0.27027, q = 3.3096, (p1 = 0.02751) ω = 1.24 (198V, 932T) | -26302.159482 | - | -
Nonprimate mammals | M8a | p0 = 0.9647, p = 0.248, q = 2.8165, (p1 = 0.0353) ω = 1.0 | -26306.829686 | 0.002 | 0.008
Mammals | M0: One ratio | ω = 0.0793 | -33009.5808 | - | -
Mammals | M1: Nearly neutral | ω0 = 0.044, f0 = 0.905, ω1 = 1.0, f1 = 0.094 | -32513.4631 | 1.0 | 1.0
Mammals | M2: Positive selection | ω0 = 0.044, f0 = 0.905, ω1 = 1.0, f1 = 0.094, ω2 = 55.0, f2 = 0 | -32513.4631 | - | -
Mammals | M7 | p = 0.208, q = 1.765 | -32303.6612 | 1 × 10^-4 | 5 × 10^-4
Mammals | M8 | p0 = 0.978, p = 0.248, q = 2.7792, (p1 = 0.0215) ω = 1.2 (198V, 932T) | -32289.9547 | - | -
Mammals | M8a | p0 = 0.97035, p = 0.23367, q = 2.33818, (p1 = 0.02965) ω = 1.0 | -32299.26404 | 2 × 10^-3 | 0.0002


3.5.3 Signatures of episodic positive selection at various evolutionary stages from ancestral primate to human terminal branch

To detect patterns of episodic positive selection at different evolutionary epochs from the ancestral primate branch to the human terminal branch, the branch-site test was applied to the ZNF335 coding sequence alignment of 45 placental mammal species (Table 3.11). The LRTs of the branch-site analysis revealed no significant signature of positive selection on any of the branches analyzed (Table 3.11). A higher rate of sequence evolution (a ratio of nonsynonymous to synonymous substitution rates, ω, exceeding 1) was estimated for the hominini ancestral branch than for the human and other ancestral branches analyzed, but the LRT statistics indicate that the difference between the null model and the selection model is not significant (Table 3.11).
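
The point that an ω2 estimate above 1 is not sufficient on its own can be made concrete with the values of Table 3.11. The sketch below is illustrative: it assumes the LRT statistic is compared with a chi-square distribution with one degree of freedom and a 0.05 threshold, and the function name assess_branch is hypothetical. A branch is flagged only when ω2 exceeds 1 and the LRT is significant.

```python
import math

def assess_branch(omega2, lrt_stat, alpha=0.05):
    """Return (p value, flag); the flag needs omega2 > 1 AND a significant LRT."""
    p = math.erfc(math.sqrt(max(lrt_stat, 0.0) / 2.0))  # chi-square, df = 1 (assumed)
    return p, (omega2 > 1.0 and p < alpha)

# (branch, omega2, LRT statistic) copied from Table 3.11 (ZNF335).
branch_site = [
    ("Human", 1.0, 0.000002),
    ("Hominini", 3.95, 0.0305),
    ("Homininae", 1.0, 0.0),
    ("Hominidae", 1.0, 0.000002),
    ("Hominoidea", 1.0, 0.0),
    ("Catarrhini", 1.0, 0.0),
    ("Simians", 1.0, 0.0),
    ("Haplorhini", 1.0, 0.0),
    ("Primates", 1.0, 0.0),
]

results = {name: assess_branch(w2, stat) for name, w2, stat in branch_site}
for name, (p, flagged) in results.items():
    print(f"{name}: p = {p:.2f}, episodic positive selection supported: {flagged}")
```

For the hominini branch the sketch gives p of about 0.86 despite ω2 = 3.95, so no branch is flagged.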

Table 3.11: Branch-site analysis of ZNF335.

Branch | ω2 | LRT | P value | q value | Positive selected sites
Human | 1.0 | 0.000002 | 1 | 0.98 | -
Hominini | 3.95 | 0.0305 | 0.86 | 0.98 | -
Homininae | 1.0 | 0 | 1 | 0.98 | -
Hominidae | 1.0 | 0.000002 | 1 | 0.98 | -
Hominoidea | 1.0 | 0 | 1 | 0.98 | -
Catarrhini | 1.0 | 0 | 1 | 0.98 | -
Simians | 1.0 | 0 | 1 | 0.98 | -
Haplorhini | 1.0 | 0 | 1 | 0.98 | -
Primates | 1.0 | 0 | 1 | 0.98 | -

3.5.4 Divergent selection pressure between different partitions of mammalian phylogeny

The patterns of divergent selective pressure between partitions of the mammalian phylogenetic tree were determined by fitting the codon substitution clade model C (CmC) (Table 3.12). This model targets sites that have evolved under different selective constraints between clades; positive selection is not required at those sites. The significance of functional divergence was determined by likelihood ratio tests (LRTs) (Table 3.12). LRTs of M2a_rel (null model) against the CmC primates, simians, catarrhini, great apes, and hominini partitions indicated no significant pattern of functional divergence in any partition of the ZNF335 mammalian phylogenetic tree (Table 3.12). This suggests that the function of ZNF335 has remained unaltered throughout the evolutionary history of placental mammals.


Table 3.12: Divergent selection constraint parameters estimation and likelihood scores for ZNF335.

Model & Partition | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value
CmC-Primate | p0 = 0.75, ω0 = 0.015, p1 = 0.029, ω1 = 1.0, p2 = 0.221, ωnp = 0.255, ωp = 0.285 | -32294.0420 | 0.30 | 0.58
CmC-simians | p0 = 0.751, ω0 = 0.0155, p1 = 0.029, ω1 = 1.0, p2 = 0.22, ωns = 0.256, ωs = 0.312 | -32293.5981 | 0.16 | 0.57
CmC-catarrhini | p0 = 0.751, ω0 = 0.0154, p1 = 0.029, ω1 = 1.0, p2 = 0.22, ωnc = 0.266, ωc = 0.27 | -32293.5586 | 0.15 | 0.57
CmC-greatapes | p0 = 0.75, ω0 = 0.0154, p1 = 0.029, ω1 = 1.0, p2 = 0.22, ωng = 0.259, ωg = 0.297 | -32294.4826 | 0.66 | 0.66
CmC-hominini | p0 = 0.75, ω0 = 0.0154, p1 = 0.029, ω1 = 1.0, p2 = 0.22, ωnh = 0.26, ωh = 0.25 | -32294.5720 | 0.92 | 0.69
M2a_rel | p0 = 0.75, ω0 = 0.015, p1 = 0.029, ω1 = 1.0, p2 = 0.221, ω2 = 0.26 | -32294.5769 | NA | NA

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.

3.6 Polyhomeotic homolog 1 (PHC1)

3.6.1 Phylogenetic analysis of PHC1

The evolutionary history of PHC1 and its relationship to its putative paralogs PHC2 and PHC3 were studied by including protein sequences from the phyla mollusca, arthropoda, hemichordata, cephalochordata and vertebrata and applying the neighbor joining (NJ) method (Figure 3.9). Phylogenetic analysis indicated that two gene duplication events are responsible for the expansion of the polyhomeotic homolog family in vertebrates (Figure 3.9). The tree topology further showed that the first duplication event occurred during the earliest evolutionary history of vertebrates, at least prior to the chondrichthyes-osteichthyes split, and gave rise to the PHC3 paralog and the ancestral PHC1/PHC2 gene (Figure 3.9). The second duplication separated PHC2 and PHC1 and occurred prior to the sarcopterygii-actinopterygii split but after the divergence of osteichthyes from chondrichthyes (Figure 3.9). The tree topology indicates that PHC1 and PHC2 are closely related genes, whereas PHC3 is the most ancient gene, and that this gene family originated at the root of bilateria (Figure 3.9).


Figure 3.9: Phylogenetic tree of human PHC1 and its putative paralogs.

The evolutionary history of human PHC1 and its putative paralogs was inferred by the NJ method based on evolutionary distances computed by the JTT matrix based method. Thirty seven amino acid sequences were used for tree reconstruction. All positions containing gaps and missing data were removed. The numbers on the nodes are bootstrap values. Only bootstrap values greater than or equal to 50 are shown.


3.6.2 Molecular evolution of PHC1 by site models

Forty six protein coding sequences of PHC1 from placental mammals were retrieved by bidirectional BLAST searches in the Ensembl and NCBI databases (twenty sequences from primate species and twenty six from nonprimate placental mammals). Six codon substitution site models (M0, M1, M2, M7, M8, and M8a) were used to analyze the protein coding sequence of PHC1 in three data groups, i.e., primates, nonprimate placental mammals and placental mammals (Table 3.13). The one ratio model (M0) indicated that purifying selection dominated the evolution of PHC1 in all three data groups, with estimated ω values ranging from 0.094 to 0.123 (Table 3.13). The M0 site model estimates the average selective pressure over all sites of the protein coding gene, whereas the other five codon substitution site models postulate that selective pressure is not constant over the whole protein coding sequence but varies among codon sites. Likelihood ratio tests (LRTs) were calculated for three pairs of site models (M1 & M2, M7 & M8, and M8a & M8) in order to identify variable selective pressure on individual codon sites and patterns of positive selection in the three data groups (Table 3.13). Positive selection was accepted only if at least two of the three pairwise comparisons rejected the simpler null model (M1, M7 or M8a) in favor of the more complex alternative model (M2 or M8) in which it is nested. The p values were calculated from the LRT values using chi square tables (Table 3.13). The LRT values indicated that none of the data groups experienced a significant signature of positive selection under the three site model pairs (Table 3.13). For nonprimate placental mammals, the ω value estimated by the M8 codon substitution site model exceeded one (ω = 1.414), which suggests positive selection, and one site (478I) was identified as positively selected, with a nominal p value of 0.03 for the M8a & M8 comparison (Table 3.13). However, these signatures of positive selection are not significant once the p value is corrected for the false discovery rate (q value) (Table 3.13).
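
The two out of three decision rule, and the effect of judging it on q values rather than raw p values, can be sketched with the nonprimate PHC1 values of Table 3.13. This is a hedged illustration: the 0.05 threshold and the function names are assumptions introduced here, not taken from the thesis.

```python
def supported_pairs(values, alpha=0.05):
    """Site model pairs whose p (or q) value falls below the threshold."""
    return [pair for pair, value in values.items() if value < alpha]

def positive_selection_called(values, alpha=0.05, min_pairs=2):
    """Apply the 'at least two of three pairs' criterion."""
    return len(supported_pairs(values, alpha)) >= min_pairs

# Nonprimate placental mammals, PHC1 (values copied from Table 3.13).
p_nonprimate = {"M1 vs M2": 1.0, "M7 vs M8": 0.37, "M8a vs M8": 0.03}
q_nonprimate = {"M1 vs M2": 1.0, "M7 vs M8": 0.35, "M8a vs M8": 0.07}

print("pairs supported by p values:", supported_pairs(p_nonprimate))
print("pairs supported by q values:", supported_pairs(q_nonprimate))
print("positive selection called:", positive_selection_called(q_nonprimate))
```

On raw p values a single pair (M8a vs M8) passes, which already fails the criterion; after correction no pair passes at all.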

3.6.3 Episodic Selection at PHC1 mammalian phylogeny

In the first approach, codon substitution site models indicated that positive selection has not operated on PHC1 in the primates, nonprimate placental mammals and placental mammals data groups. These models allow selective pressure to vary only among the sites of a protein coding gene and not across the branches of the phylogeny (Table 3.14). However, positive selection may also operate at specific evolutionary time points and in specific species, and such episodes are not identified by codon substitution site models.

Table 3.13: Parameter estimation and LRT for Mammals PHC1.

Data group | Model | Parameter estimation (ω) | Log likelihood value | P value | q value
Primates | M0: One ratio | ω = 0.12368 | -7903.776034 | - | -
Primates | M1: Nearly neutral | ω0 = 0.058, f0 = 0.92124, ω1 = 1.0, f1 = 0.07876 | -7880.579564 | 1.0 | 1.0
Primates | M2: Positive selection | ω0 = 0.058, f0 = 0.9212, ω1 = 1.0, f1 = 0.025, ω2 = 1.0, f2 = 0.054 | -7880.579564 | - | -
Primates | M7 | p = 0.15432, q = 1.03425 | -7879.271927 | 1.0 | 0.83
Primates | M8 | p0 = 0.9999, p = 0.15434, q = 1.03436, (p1 = 0.00001) ω = 1.0 | -7879.271936 | - | -
Primates | M8a | p0 = 0.97758, p = 0.11086, q = 0.77998, (p1 = 0.02242) ω = 1.0 | -7879.250221 | 0.83 | 0.75
Nonprimate mammals | M0: One ratio | ω = 0.09428 | -17273.88625 | - | -
Nonprimate mammals | M1: Nearly neutral | ω0 = 0.0432, f0 = 0.899, ω1 = 1.0, f1 = 0.1001 | -17095.19254 | 1.0 | 1.0
Nonprimate mammals | M2: Positive selection | ω0 = 0.04319, f0 = 0.899, ω1 = 1.0, f1 = 0.1001, ω2 = 34.10, f2 = 0 | -17095.19255 | - | -
Nonprimate mammals | M7 | p = 0.1873, q = 1.582 | -17042.51078 | 0.37 | 0.35
Nonprimate mammals | M8 | p0 = 0.9966, p = 0.1958, q = 1.738, (p1 = 0.0034) ω = 1.414 (478I BEB) | -17041.52137 | - | -
Nonprimate mammals | M8a | p0 = 0.9817, p = 0.1783, q = 1.5162, (p1 = 0.0183) ω = 1.0 | -17043.89238 | 0.03 | 0.07
Mammals | M0: One ratio | ω = 0.098 | -21063.3796 | - | -
Mammals | M1: Nearly neutral | ω0 = 0.047, f0 = 0.898, ω1 = 1.0, f1 = 0.102 | -20818.3199 | 1.0 | 1.0
Mammals | M2: Positive selection | ω0 = 0.047, f0 = 0.898, ω1 = 1.0, f1 = 0.102, ω2 = 58.0, f2 = 0 | -20818.3199 | - | -
Mammals | M7 | p = 0.197, q = 1.59 | -20741.9953 | 0.08 | 0.08
Mammals | M8 | p0 = 0.982, p = 0.226, q = 2.20, (p1 = 0.0175) ω = 1.0 | -20739.4396 | - | -
Mammals | M8a | p0 = 0.97385, p = 0.20298, q = 1.80393, (p1 = 0.02615) ω = 1.0 | -20741.87595 | 0.03 | 0.07

The codon substitution branch-site model was applied to estimate the signature of positive selection at specific evolutionary stages from the primate ancestral branch to modern humans, and among the sites of the protein coding gene on each prespecified branch (Table 3.14). The significance of this test was determined by comparing the simpler null model (which fixes ω2 at one) with the alternative branch-site model (which allows ω2 to be greater than or equal to one). A false discovery rate (q value) correction of the p values was used to remove false positive results from the branch-site analysis. The LRTs and q values suggested that none of the analyzed branches of PHC1 was under positive selection (Table 3.14). Although ω2 for the haplorhini ancestral branch is greater than one (ω2 = 14.927), the difference from the null model is not significant (Table 3.14).
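
A false discovery rate correction of this kind can be sketched as follows. The procedure shown is the Benjamini-Hochberg adjustment, which is an assumption: this section does not state which q value method or which family of tests was used, so the output is not expected to reproduce the tabulated q value of 0.98. The function name benjamini_hochberg is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the adjustment.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# P values of the nine branches in Table 3.14 (Human ... Primates).
branch_p = [1, 1, 1, 1, 1, 0.93, 0.59, 0.49, 0.99]
branch_q = benjamini_hochberg(branch_p)
print(branch_q)
```

Applied to these nine p values alone, every adjusted value is 1.0, which leads to the same conclusion as Table 3.14: no branch is significant after correction.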

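Table 3.14: Branch-site analysis of PHC1.

Branch | ω2 | LRT | P value | q value | Positive selected sites
Human | 1 | 0.000002 | 1 | 0.98 | No positive selected site
Hominini | 1 | 0 | 1 | 0.98 | No positive selected site
Homininae | 1 | 0 | 1 | 0.98 | No positive selected site
Hominidae | 1 | 0.000002 | 1 | 0.98 | No positive selected site
Hominoidea | 1 | 0.00002 | 1 | 0.98 | No positive selected site
Catarrhini | 1 | 0.0076 | 0.93 | 0.98 | No positive selected site
Simians | 1 | 0.2901 | 0.59 | 0.98 | No positive selected site
Haplorhini | 14.927 | 0.4681 | 0.49 | 0.98 | No positive selected site
Primates | 1 | 0.00006 | 0.99 | 0.98 | No positive selected site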

3.6.4 Divergent selective constraints across PHC1 mammalian phylogeny

The signature of variable selective pressure across the PHC1 mammalian phylogeny was estimated by clade model C (CmC) (Table 3.15). Clade model C allows site-specific selective pressure to differ between predefined partitions of the phylogeny, i.e., background branches and foreground branches. For the PHC1 mammalian phylogeny, variation in site-specific selective pressure was estimated for five foreground partitions (primates, simians, catarrhini, great apes, and hominini) (Table 3.15). The significance of CmC was determined by LRTs of M2a_rel (the null model, which estimates a single ω2 for all branches of the phylogeny) against the CmC primates, simians, catarrhini, great apes and hominini partitions (Table 3.15). False positive results for CmC were controlled by applying a false discovery rate (q value) correction to the p values (Table 3.15). Parameter estimates and LRTs suggested no signature of divergent selection among the analyzed partitions of the PHC1 mammalian phylogeny (Table 3.15). These observations indicate that the function of the PHC1 coding gene was conserved throughout the evolution of eutherian mammals.


Table 3.15: Divergent selection constraint parameters estimation and likelihood scores for PHC1.

Model & Partition | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value
CmC-Primate | p0 = 0.784, ω0 = 0.020, p1 = 0.021, ω1 = 1.0, p2 = 0.195, ωnp = 0.339, ωp = 0.395 | -20737.4013 | 0.29 | 0.58
CmC-simians | p0 = 0.194, ω0 = 0.348, p1 = 0.021, ω1 = 1.0, p2 = 0.78, ωns = 0.020, ωs = 0.018 | -20737.9248 | 0.82 | 0.66
CmC-catarrhini | p0 = 0.78, ω0 = 0.020, p1 = 0.021, ω1 = 1.0, p2 = 0.19, ωnc = 0.347, ωc = 0.364 | -20737.9367 | 0.86 | 0.66
CmC-greatapes | p0 = 0.78, ω0 = 0.020, p1 = 0.021, ω1 = 1.0, p2 = 0.19, ωng = 0.349, ωg = 0.179 | -20737.3879 | 0.29 | 0.58
CmC-hominini | p0 = 0.78, ω0 = 0.020, p1 = 0.021, ω1 = 1.0, p2 = 0.19, ωnh = 0.35, ωh = 0.13 | -20737.5787 | 0.38 | 0.65
M2a_rel | p0 = 0.784, ω0 = 0.020, p1 = 0.021, ω1 = 1.0, p2 = 0.194, ω2 = 0.348 | -20737.9512 | NA | NA

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.

3.7 Cyclin Dependent Kinase 6 (CDK6)

3.7.1 Phylogenetic analysis of CDK6

Human CDK6 has one putative paralog, CDK4, and the evolutionary relationship between these two paralogs was probed by reconstructing a phylogenetic tree with the neighbor joining method (Figure 3.10). The phylogenetic tree revealed that a gene duplication event separated CDK6 and CDK4 prior to the osteichthyes-chondrichthyes split but after the divergence of gnathostomata from cyclostomata (Figure 3.10). The bidirectional similarity based approach was unable to identify any putative ortholog of human CDK4/CDK6 in the phylum porifera. The phylogeny further showed that the CDK4/CDK6 gene family originated at the root of parahoxozoa (placozoa, cnidaria and bilateria) approximately 680 million years ago (Figure 3.10).

3.7.2 Molecular evolution of CDK6 by site model

Codon substitution site models, fitted by the maximum likelihood method, were used to analyze the direction of natural selection operating on the protein coding gene of microcephaly locus 12 (Table 3.16). The value of the selective pressure ω (the ratio of nonsynonymous to synonymous substitution rates) indicates the direction of natural selection: ω = 1, ω < 1, and ω > 1 denote neutral evolution, negative selection, and positive selection respectively. Site models were fitted to three datasets of the CDK6 protein coding gene, i.e., primates (17 sequences), nonprimate placental mammals (26 sequences) and placental mammals (43 sequences). The one ratio model (M0) indicates the overall strength and direction of natural selection operating on the whole sequence of a protein coding gene. The selective pressure ω measured by the M0 site model revealed that strong negative constraint has acted on the CDK6 protein coding gene throughout the evolution of eutherians (Table 3.16). It is generally accepted that natural selection acts differently on each site of a protein coding gene and that positive selection operates on only a few sites, which are not detected by the M0 site model. Three site model pairs (M1 & M2, M7 & M8 and M8a & M8) were therefore used to test for positive selection (Table 3.16). Parameter estimates and p values revealed that one site is under positive selection in placental mammals according to a single site model pair, M7 & M8 (Table 3.16). However, the stringent criterion defined for this study requires at least two of the three site model pairs to support positive selection. Overall, these results suggest that no signals of positive selection are present on the CDK6 protein coding gene in the primates, nonprimate placental mammals and placental mammals datasets.
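
The interpretation of ω described in this section can be written as a small helper and applied to the M0 estimates of Table 3.16. This is a minimal sketch: the function name selection_regime and the tolerance used to treat a value as exactly 1 are assumptions for illustration, and a point estimate alone says nothing about statistical significance.

```python
def selection_regime(omega, tol=1e-9):
    """Map an omega (dN/dS) estimate to the direction of selection it suggests."""
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    if omega < 1.0:
        return "negative (purifying) selection"
    return "positive selection"

# M0 (one ratio) estimates for CDK6 copied from Table 3.16.
m0_omega = {
    "primates": 0.04042,
    "nonprimate placental mammals": 0.0493,
    "placental mammals": 0.04735,
}

regimes = {group: selection_regime(w) for group, w in m0_omega.items()}
for group, regime in regimes.items():
    print(f"{group}: omega = {m0_omega[group]} -> {regime}")
```

All three M0 estimates are far below 1, matching the strong negative constraint reported for CDK6.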

3.7.3 Episodic positive selection on CDK6 phylogeny

To look for evidence of adaptive evolution specific to the lineages leading from the primate ancestor to modern humans, the hypothesis tested was that positive selection acting on the CDK6 protein coding gene along these lineages might have contributed to the expansion of the prefrontal cortex that began in the common ancestor of primates. For this purpose, the branch-site test, using the codon substitution based maximum likelihood method, was performed on specific evolutionary stages of the CDK6 mammalian phylogeny (Table 3.17). The significance of positive selection was determined by LRTs of the simpler null model (identical to the branch-site model except that ω2 is fixed at 1 for the lineage of interest) against the more complex alternative branch-site model (Table 3.17). False positive results were controlled by a q value correction of the p values. The LRT values revealed no signals of positive selection on any branch from the primate ancestral branch to the modern human branch (Table 3.17).


Figure 3.10: Evolutionary history of MCPH12 gene CDK6.

The phylogenetic tree of CDK6 was reconstructed by the NJ method based on evolutionary distances computed by the JTT matrix based method. Thirty three amino acid sequences were used for this analysis. All positions containing gaps and missing data were eliminated. The numbers on the nodes indicate bootstrap values. Bootstrap values below 50 are not shown.


Table 3.16: Parameter estimation and LRT for Mammals CDK6.

Data group | Model | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value
Primates | M0: One ratio | ω = 0.04042 | -2035.596227 | - | -
Primates | M1: Nearly neutral | ω0 = 0.01613, f0 = 0.97157, ω1 = 1.0, f1 = 0.02843 | -2032.561027 | 1.0 | 1.0
Primates | M2: Positive selection | ω0 = 0.01613, f0 = 0.97157, ω1 = 1.0, f1 = 0.01248, ω2 = 1.0, f2 = 0.01595 | -2032.561027 | - | -
Primates | M7 | p = 0.01416, q = 0.27849 | -2032.354878 | 0.99 | 0.83
Primates | M8 | p0 = 0.9999, p = 0.01366, q = 0.26870, (p1 = 0.00001) ω = 1.0 | -2032.354826 | - | -
Primates | M8a | p0 = 0.9746, p = 0.0358, q = 0.535, (p1 = 0.0254) ω = 1.0 | -2032.534776 | 0.55 | 0.38
Nonprimate mammals | M0: One ratio | ω = 0.0493 | -4466.545497 | - | -
Nonprimate mammals | M1: Nearly neutral | ω0 = 0.0238, f0 = 0.9608, ω1 = 1.0, f1 = 0.039 | -4424.20087 | 1.0 | 1.0
Nonprimate mammals | M2: Positive selection | ω0 = 0.0238, f0 = 0.9608, ω1 = 1.0, f1 = 0.392, ω2 = 30.87, f2 = 0 | -4424.20087 | - | -
Nonprimate mammals | M7 | p = 0.09732, q = 1.477 | -4425.17022 | 0.04 | 0.04
Nonprimate mammals | M8 | p0 = 0.867, p = 0.1736, q = 4.386, (p1 = 0.2095) ω = 1.0 | -4421.95742 | - | -
Nonprimate mammals | M8a | p0 = 0.97269, p = 0.15433, q = 3.44603, (p1 = 0.02731) ω = 1.0 | -4422.36299 | 0.37 | 0.38
Mammals | M0: One ratio | ω = 0.04735 | -5214.63577 | - | -
Mammals | M1: Nearly neutral | ω0 = 0.0228, f0 = 0.958, ω1 = 1.0, f1 = 0.0416 | -5165.32948 | 1.0 | 1.0
Mammals | M2: Positive selection | ω0 = 0.0228, f0 = 0.958, ω1 = 1.0, f1 = 0.0416, ω2 = 28.35, f2 = 0 | -5165.32948 | - | -
Mammals | M7 | p = 0.0929, q = 1.452 | -5160.54685 | 0.02 | 0.03
Mammals | M8 | p0 = 0.99156, p = 0.11638, q = 2.42160, (p1 = 0.00844) ω = 1.3445 (302Y NEB, 302Y BEB) | -5156.49740 | - | -
Mammals | M8a | p0 = 0.97951, p = 0.04652, q = 0.61023, (p1 = 0.02049) ω = 1.0 | -5157.508379 | 0.16 | 0.21

Table 3.17: Branch-site analysis of CDK6.

Branch | ω2 | LRT | P value | q value | Positive selected sites
Human | 1 | 0.00006 | 0.99 | 0.98 | -
Hominini | 1 | 0.000002 | 1 | 0.98 | -
Hominidae | 2.94 | 0.00001 | 1 | 0.98 | -
Hominoidea | 1 | 0.000004 | 1 | 0.98 | -
Catarrhini | 1 | 0.000004 | 1 | 0.98 | -
Simians | 1 | 0.000002 | 1 | 0.98 | -
Primates | 1 | 0 | 1 | 0.98 | -


3.7.4 Divergent selective constraint across CDK6 mammalian phylogeny

As mentioned above, neither the site models nor the branch-site models detected signals of Darwinian positive selection in the CDK6 protein coding gene during the evolution of eutherian mammals (Table 3.16 and Table 3.17). A phenotypic change such as brain expansion does not necessarily require adaptive evolution of a protein coding gene; it may also arise from variable selective pressure acting on the orders and suborders of eutherians. The patterns of divergent selective constraint across the mammalian phylogeny were therefore examined with clade model C (CmC) (Table 3.18). The significance of divergent selective constraint between partitions of the phylogeny was assessed by likelihood ratio tests (LRTs) of the null model M2a_rel against CmC-primate, simians, catarrhini, greatapes, and hominini, using the log likelihood score of each model (Table 3.18). The parameter estimates and LRTs revealed no pattern of divergent selection across the CDK6 mammalian phylogeny (Table 3.18). These observations suggest that the CDK6 protein coding gene has retained a conserved function throughout the evolution of eutherian mammals.

Table 3.18: Divergent selection constraint parameter estimation and likelihood scores for CDK6.

| Model & Partition | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value |
|---|---|---|---|---|
| CmC-Primate | p0 = 0.895, ω0 = 0.011, p1 = 0.012, ω1 = 1.0, p2 = 0.0926, ωnp = 0.298, ωp = 0.257 | -5156.14749 | 0.76 | 0.66 |
| CmC-Simians | p0 = 0.897, ω0 = 0.012, p1 = 0.014, ω1 = 1.0, p2 = 0.088, ωns = 0.271, ωs = 0.428 | -5155.82423 | 0.39 | 0.65 |
| CmC-catarrhini | p0 = 0.902, ω0 = 0.012, p1 = 0.011, ω1 = 1.0, p2 = 0.087, ωnc = 0.32, ωc = 0.16 | -5156.0186 | 0.55 | 0.66 |
| CmC-greatapes | p0 = 0.089, ω0 = 0.298, p1 = 0.012, ω1 = 1.0, p2 = 0.899, ωng = 0.012, ωg = 0.00 | -5156.0817 | 0.63 | 0.66 |
| CmC-hominini | p0 = 0.089, ω0 = 0.296, p1 = 0.012, ω1 = 1.0, p2 = 0.898, ωnh = 0.012, ωh = 0.00 | -5156.1273 | 0.71 | 0.66 |
| M2a_rel | p0 = 0.897, ω0 = 0.011, p1 = 0.012, ω1 = 1.0, p2 = 0.0906, ω2 = 0.294 | -5156.19494 | NA | NA |

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.

3.8 SAS-6 centriolar assembly protein (SASS6)

3.8.1 Evolutionary history of SASS6

No paralog of human SASS6 was identified by similarity-based approaches or in the databases. The phylogenetic tree of the SASS6 gene was therefore reconstructed from orthologous protein sequences from the phyla porifera, mollusca, annelida, urochordata and vertebrata (Figure 3.11). The vertebrate clade was outgrouped by Ciona intestinalis with a 99% bootstrap value. Branch lengths of teleost species are longer than those of other vertebrate species, suggesting that SASS6 might have evolved more rapidly in teleosts than in sarcopterygians (Figure 3.11). The phylogeny further revealed that SASS6 originated during early metazoan history, approximately 760 million years ago. The bidirectional best BLAST hit strategy was unable to identify any putative paralog in cnidaria and placozoa.

Figure 3.11: Evolutionary history of human SASS6 gene.

The phylogenetic tree of SASS6 was reconstructed using the NJ method based on evolutionary distances computed by the JTT matrix-based method. Twenty-seven orthologous amino acid sequences of SASS6 were used in this analysis. All positions containing missing data or gaps were removed prior to phylogenetic tree reconstruction. The values at the nodes are bootstrap scores based on 500 replicates. The scale bar denotes amino acid substitutions per site.


3.8.2 Molecular Evolution of SASS6 in mammals by site models

After a comprehensive bidirectional BLAST search of the Ensembl and NCBI databases, we retrieved SASS6 orthologous coding sequences (CDS) for 44 placental mammal species, of which 19 are primates and 25 are nonprimate placental mammals. Whether some sites in the SASS6 sequences evolved adaptively in the three mammalian datasets (primates, nonprimate placental mammals and all placental mammals) was unknown. To detect signatures of positive selection at individual sites, irrespective of the branches of the phylogeny, we applied the codon substitution site models (M0, M1, M2, M7, M8, and M8a) separately to these three datasets (Table 3.19). The ω ratio estimated under the one ratio model (M0) for all three datasets indicates that purifying selection dominated the evolution of SASS6 (Table 3.19). However, this estimate is an average over all sites in the coding sequence and all branches in the phylogenetic tree, whereas in reality selective constraint varies among sites and only a few sites in a sequence evolve adaptively. To test for positive selection, we calculated LRTs for three pairs of models, M1 vs. M2, M7 vs. M8, and M8a vs. M8, for each of the three datasets (Table 3.19). A signature of positive selection was accepted only if at least two of the three pairs significantly rejected the neutral model in favor of the alternative model. The LRTs indicated significant signatures of positive selection only in the nonprimate placental mammal dataset, through M7 vs. M8 and M8a vs. M8 (Table 3.19). Signatures of positive selection were also identified in primates and in placental mammals, but only with one model pair, M7 vs. M8. The M7 vs. M8 LRT is considered less conservative and less accurate than M1 vs. M2 and M8a vs. M8. Sites under positive selection in the nonprimate placental mammal dataset were identified by the Bayes empirical Bayes (BEB) and Naïve empirical Bayes (NEB) methods implemented in the M8 codon substitution site model (Table 3.19). Four sites with NEB and one site with BEB were identified as being under positive selection, with an ω value of 2.1637 (Table 3.19).
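The three likelihood ratio tests described above can be sketched numerically. The snippet below is a minimal illustration and not necessarily the exact procedure used in this study: it takes the primate SASS6 log likelihoods from Table 3.19, computes the statistic 2(lnL_alternative - lnL_null), and converts it to a p value with a chi-square distribution. The degrees of freedom (2 for M1 vs. M2 and M7 vs. M8, 1 for M8a vs. M8) are assumed conventional choices, and the helper names chi2_sf and lrt are hypothetical.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable (only df = 1 or 2)."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch handles only df = 1 or df = 2")

def lrt(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p value) for a nested model pair."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# Primate SASS6 log likelihoods copied from Table 3.19
lnl = {
    "M1": -5727.58667,
    "M2": -5725.97915,
    "M7": -5730.802537,
    "M8": -5726.072798,
    "M8a": -5727.586436,
}

# (null model, alternative model, assumed degrees of freedom)
pairs = [("M1", "M2", 2), ("M7", "M8", 2), ("M8a", "M8", 1)]

results = {f"{null} vs {alt}": lrt(lnl[null], lnl[alt], df)
           for null, alt, df in pairs}

for name, (stat, p) in results.items():
    print(f"{name}: 2*dlnL = {stat:.4f}, p = {p:.4f}")
```

Under these assumptions the p values come out close to the 0.2, 0.009 and 0.08 listed for primates in Table 3.19.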


Table 3.19: Parameter estimation and LRT for Mammals SASS6.

| Data group | Model | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value |
|---|---|---|---|---|---|
| Primates | M0: One ratio | ω = 0.15291 | -5765.65349 | | |
| | M1: Nearly neutral | ω0 = 0.07856, f0 = 0.90514, ω1 = 1.0, f1 = 0.09486 | -5727.58667 | 0.2 | 1.0 |
| | M2: Positive selection | ω0 = 0.0827, f0 = 0.912, ω1 = 1.0, f1 = 0.085, ω2 = 4.02, f2 = 0.0045 | -5725.97915 | | |
| | M7 | p = 0.2189, q = 1.056 | -5730.802537 | 0.009 | 0.01 |
| | M8 | p0 = 0.98602, p = 0.38308, q = 2.27271, (p1 = 0.01398) ω = 2.7464 (189N NEB BEB) | -5726.072798 | | |
| | M8a | p0 = 0.9064, p = 8.644, q = 98.99 (p1 = 0.09362) ω = 1.0 | -5727.586436 | 0.08 | 0.13 |
| Nonprimate mammals | M0: One ratio | ω = 0.14845 | -11890.58065 | | |
| | M1: Nearly neutral | ω0 = 0.0577, f0 = 0.8386, ω1 = 1.0, f1 = 0.1614 | -11639.96481 | 1.0 | 1.0 |
| | M2: Positive selection | ω0 = 0.0577, f0 = 0.8386, ω1 = 1.0, f1 = 0.10492, ω2 = 1.0, f2 = 0.0565 | -11639.96481 | | |
| | M7 | p = 0.2024, q = 0.94835 | -11614.68495 | 2 × 10^-3 | 4 × 10^-3 |
| | M8 | p0 = 0.984, p = 0.2425, q = 1.36288, (p1 = 0.016) ω = 2.1637 (575V BEB; 99A, 520S, 546T, 575V NEB) | -11603.89414 | | |
| | M8a | p0 = 0.9197, p = 0.2736, q = 2.1856 (p1 = 0.0803) ω = 1.0 | -11607.49238 | 0.007 | 0.02 |
| Mammals | M0: One ratio | ω = 0.1441 | -15073.1713 | | |
| | M1: Nearly neutral | ω0 = 0.066, f0 = 0.852, ω1 = 1.0, f1 = 0.148 | -14764.4178 | 1.0 | 1.0 |
| | M2: Positive selection | ω0 = 0.066, f0 = 0.852, ω1 = 1.0, f1 = 0.148, ω2 = 24.78, f2 = 0 | -14764.4178 | | |
| | M7 | p = 0.237, q = 1.12056 | -14717.7196 | 3 × 10^-4 | 1 × 10^-3 |
| | M8 | p0 = 0.968, p = 0.308, q = 1.1978 (p1 = 0.032) ω = 1.4611 (99A, 189M, 494A, 520S, 546T NEB) | -14705.1686 | | |
| | M8a | p0 = 0.9265, p = 0.32266, q = 2.51318, (p1 = 0.07350) ω = 1.0 | -14706.7643 | 0.07 | 0.13 |

3.8.3 Signature of episodic positive selection at SASS6 mammalian phylogeny

The site models above estimate selective pressure that varies among sites across the phylogeny, but selective pressure may also vary among the branches of the phylogeny. Darwinian positive selection may take place only at specific evolutionary stages or in specific species of the phylogeny, and may affect only a few sites in a protein coding sequence, with an ω ratio greater than one. The codon substitution branch-site model was applied to detect signatures of transient positive selection from the ancestral primate branch to the human terminal branch (Table 3.20). The significance of the branch-site model was determined by calculating LRTs of a null model with ω2 fixed at 1 against the branch-site model for each prespecified branch (Table 3.20). False positives were controlled by converting the p values to false discovery rate q values. The results indicated that no significant signature of positive selection occurred on any of the analyzed SASS6 branches, from the ancestral primate branch to the modern human terminal branch (Table 3.20). These branch-site results agree with the site model results for the primate dataset of SASS6, which also did not support positive selection in primates (Table 3.19 and Table 3.20).

Table 3.20: Branch-site analysis of SASS6.

| Branch | ω2 | LRT | P value | q value | Positively selected sites |
|---|---|---|---|---|---|
| Human | 1 | 0 | 1 | 0.98 | |
| Hominini | 4.64 | 0.1646 | 0.68 | 0.98 | |
| Hominidae | 1 | 0.0917 | 0.76 | 0.98 | |
| Hominidea | 1 | 0 | 1.0 | 0.98 | |
| Catarrhini | 1 | 0.0689 | 0.79 | 0.98 | |
| Simians | 1 | 0.2159 | 0.64 | 0.98 | |
| Haplorhini | 1 | 0.00008 | 0.99 | 0.98 | |
| Primates | 1 | 0.000002 | 1 | 0.98 | |

3.8.4 Divergent selective constraints between partitions of SASS6 mammalian phylogeny

Positive selection is not necessarily required for functional divergence at specific stages of evolution; variation in site-specific selective pressure between different partitions of the phylogeny may also drive divergence and adaptive function. Clade model C (CmC) was used to detect such complex forms of divergence in selective pressure between clades or partitions of the SASS6 mammalian phylogeny (Table 3.21). The significance of CmC was determined by calculating LRTs of M2a_rel against CmC-primates, simians, catarrhini, greatapes and hominini. The LRTs indicated divergent selective pressure between simians (ω = 0.499) and nonsimian placental mammals (ω = 0.214), and between hominini (ω = 1.082) and nonhominini placental mammals (ω = 0.26) (Table 3.21). Parameter estimates under CmC-simians indicated that a large proportion of sites (58%) evolved under strong negative selection with an ω value of 0.013, 8% of sites evolved neutrally, and 34% of sites evolved under divergent selective pressure between simians (ω = 0.499) and nonsimian placental mammals (ω = 0.214) (Table 3.21). Although the parameter estimates point to a large difference in selective pressure between hominini and nonhominini placental mammals, this difference is not significant after correction of the p value for false discovery rate (Table 3.21).
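The comparison of a CmC partition with the M2a_rel null model can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the thesis's own script: the log likelihoods are copied from Table 3.21, the test is assumed to use a chi-square distribution with one degree of freedom, and the function name cmc_lrt is hypothetical.

```python
import math

def cmc_lrt(lnl_m2a_rel, lnl_cmc):
    """LRT of CmC against M2a_rel, assuming a chi-square with 1 df."""
    stat = 2.0 * (lnl_cmc - lnl_m2a_rel)
    # Upper tail of chi-square(1): erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# SASS6 log likelihoods copied from Table 3.21
LNL_M2A_REL = -14707.0141
lnl_cmc = {
    "primate": -14706.9216,
    "simians": -14697.4440,
    "hominini": -14705.0071,
}

cmc_results = {name: cmc_lrt(LNL_M2A_REL, value)
               for name, value in lnl_cmc.items()}

for name, (stat, p_value) in cmc_results.items():
    print(f"CmC-{name}: 2*dlnL = {stat:.4f}, p = {p_value:.5f}")
```

With these inputs the simian and hominini partitions give p values near 0.00001 and 0.045, in line with the uncorrected p values in Table 3.21; only the correction for multiple testing separates the two outcomes.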

Table 3.21: Divergent selection constraint parameter estimation and likelihood scores for SASS6.

| Model & Partition | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value |
|---|---|---|---|---|
| CmC-Primate | p0 = 0.612, ω0 = 0.017, p1 = 0.077, ω1 = 1.0, p2 = 0.310, ωnp = 0.253, ωp = 0.270 | -14706.9216 | 0.67 | 0.66 |
| CmC-simians | p0 = 0.58, ω0 = 0.013, p1 = 0.082, ω1 = 1.0, p2 = 0.34, ωns = 0.214, ωs = 0.499 | -14697.4440 | 0.00001 | 0.0003 |
| CmC-catarrhini | p0 = 0.31, ω0 = 0.26, p1 = 0.077, ω1 = 1.0, p2 = 0.62, ωnc = 0.017, ωc = 0.024 | -14706.9516 | 0.72 | 0.66 |
| CmC-greatapes | p0 = 0.61, ω0 = 0.17, p1 = 0.078, ω1 = 1.0, p2 = 0.31, ωng = 0.25, ωg = 0.47 | -14706.3050 | 0.23 | 0.57 |
| CmC-hominini | p0 = 0.62, ω0 = 0.017, p1 = 0.077, ω1 = 1.0, p2 = 0.31, ωnh = 0.26, ωh = 1.082 | -14705.0071 | 0.045 | 0.33 |
| M2a_rel | p0 = 0.614, ω0 = 0.017, p1 = 0.077, ω1 = 1.0, p2 = 0.308, ω2 = 0.259 | -14707.0141 | NA | NA |

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.
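The false discovery rate correction applied to these p values can be sketched with the Benjamini-Hochberg step-up procedure. This is an assumption: the section does not state which q value method was used or which set of tests was pooled, so the numbers below are not expected to reproduce the q value column of Table 3.21. The inputs are the five CmC p values from that table, and the function name benjamini_hochberg is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# CmC p values from Table 3.21 (SASS6)
partitions = ["primate", "simians", "catarrhini", "greatapes", "hominini"]
p_values = [0.67, 0.00001, 0.72, 0.23, 0.045]

adjusted = dict(zip(partitions, benjamini_hochberg(p_values)))
for name, value in adjusted.items():
    print(f"CmC-{name}: adjusted p = {value:.5f}")
```

With these inputs only the simian partition stays below 0.05 after adjustment, which agrees qualitatively with the conclusion drawn in the text.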

3.9 Major Facilitator Superfamily Domain Containing 2A (MFSD2A)

3.9.1 Phylogenetic analysis of MFSD2A

A phylogenetic tree for MFSD2A and its putative paralogs was reconstructed using the neighbor joining (NJ) method in order to identify the origin of, and evolutionary relationships among, the major facilitator superfamily domain family paralogs (Figure 3.12). The tree revealed that two duplication events were responsible for the expansion of this family (Figure 3.12). The first duplication event arose during early metazoan history, before the bilaterian-nonbilaterian split, and produced the most ancient member of this family, the MFSD12 gene, and the MFSD2A/MFSD2B ancestral gene (Figure 3.12). The second duplication event separated MFSD2A and MFSD2B; it occurred before the actinopterygii-sarcopterygii split and after the divergence of vertebrates from cephalochordates (Figure 3.12). The tree further showed that a teleost-specific duplication of the MFSD2A gene occurred approximately 310 million years ago (Figure 3.12). The tree topology indicates that MFSD2A and MFSD2B are closely related genes, whereas the MFSD12 gene is only distantly related to this subfamily (MFSD2A and MFSD2B) (Figure 3.12).


Figure 3.12: Phylogenetic tree of the MCPH15 gene MFSD2A.

The phylogenetic tree of MFSD2A was reconstructed using the NJ method based on evolutionary distances computed by the JTT matrix-based method. The values at the branches are bootstrap scores (only values ≥50% are shown) based on 500 replicates. The scale bar denotes amino acid substitutions per site.


3.9.2 Pervasive adaptive evolution of MFSD2A in placental mammals

Signals of positive selection were examined at microcephaly locus 15, which encodes the MFSD2A protein, by fitting the codon substitution site models (M0, M1, M2, M7, M8, and M8a) by maximum likelihood to three data groups of placental mammals: primates, nonprimate placental mammals, and the combined data of primates and nonprimate placental mammals, i.e., all placental mammals (Table 3.22). The ω ratio estimated under the one ratio model (M0) for all three data groups revealed that purifying selection dominated the evolution of eutherian MFSD2A (Table 3.22). A signal of positive selection was accepted if at least two of the three model pairs (M1 & M2, M7 & M8, and M8a & M8) rejected the null model (M1, M7 or M8a) in favor of the alternative positive selection model (M2 or M8). The LRT p values suggested a signal of positive selection only in the combined dataset of primate and nonprimate placental mammals, through two model pairs, M7 & M8 and M8a & M8 (Table 3.22).

Six and four positively selected sites were pinpointed by the NEB and BEB methods, respectively, as implemented in the M8 codon substitution model (Table 3.22). However, after correction of the p values to false discovery rate q values, the signal of positive selection in placental mammals is not significant under the stringent criterion for positive selection used in this study.
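The two-out-of-three criterion, and the effect of the q value correction on it, can be made concrete with a small sketch. The p and q values are the placental mammal entries of Table 3.22; the 0.05 threshold and the function name passes_two_of_three are assumptions introduced for illustration and are not stated in this section.

```python
def passes_two_of_three(values, alpha=0.05):
    """Return (criterion met?, list of model pairs significant at alpha)."""
    significant = [pair for pair, value in values.items() if value < alpha]
    return len(significant) >= 2, significant

# Placental mammal MFSD2A values copied from Table 3.22
p_values = {"M1 vs M2": 0.99, "M7 vs M8": 0.0002, "M8a vs M8": 0.04}
q_values = {"M1 vs M2": 1.0, "M7 vs M8": 0.0004, "M8a vs M8": 0.08}

met_by_p, pairs_by_p = passes_two_of_three(p_values)
met_by_q, pairs_by_q = passes_two_of_three(q_values)

print("uncorrected p values:", met_by_p, pairs_by_p)
print("FDR q values:", met_by_q, pairs_by_q)
```

Under the assumed threshold, two pairs pass on the uncorrected p values but only M7 vs. M8 passes on the q values, which is why the combined dataset does not meet the criterion after correction.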

3.9.3 Episodic adaptive evolution across the MFSD2A mammalian phylogeny

Previous studies have proposed that some microcephaly genes (MCPH1, WDR62, CDK5RAP2 and ASPM) evolved at an accelerated rate at specific stages of primate evolution. To detect signatures of episodic selection in the MFSD2A protein coding gene, the branch-site model was applied by codon substitution maximum likelihood at successive evolutionary time points, from the ancestral primate branch to the modern human terminal branch (Table 3.23). The branch-site model allows ω to vary not only across the branches of the phylogeny but also among the sites of a prespecified lineage of interest. The significance of positive selection was determined by calculating LRTs of a null model (identical to the branch-site model except that ω2 is fixed to one) against the alternative branch-site model for each prespecified lineage of interest (Table 3.23). Parameter estimates suggested that the protein coding sequence was accelerated on the simian, catarrhini and hominini ancestral branches, with ω2 values of 22.54, 11.65, and 3.195, respectively. However, the LRTs did not support this acceleration, indicating that none of the analyzed branches of the MFSD2A protein coding gene evolved significantly under Darwinian positive selection (Table 3.23).

Table 3.22: Parameter estimation and LRT for Mammals MFSD2A.

| Data group | Model | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value |
|---|---|---|---|---|---|
| Primates | M0: One ratio | ω = 0.13249 | -4444.611090 | | |
| | M1: Nearly neutral | ω0 = 0.06104, f0 = 0.90335, ω1 = 1.0, f1 = 0.09665 | -4421.032654 | 1.0 | 1.0 |
| | M2: Positive selection | ω0 = 0.061, f0 = 0.9034, ω1 = 1.0, f1 = 0.0009, ω2 = 1.0, f2 = 0.09579 | -4421.032654 | | |
| | M7 | p = 0.18941, q = 1.12348 | -4421.078982 | 0.61 | 0.55 |
| | M8 | p0 = 0.965, p = 0.3059, q = 2.5129 (p1 = 0.03519) ω = 1.27048 | -4420.582485 | | |
| | M8a | p0 = 0.9308, p = 0.3499, q = 3.4023, (p1 = 0.0692) ω = 1.0 | -4420.593859 | 0.88 | 0.77 |
| Nonprimate mammals | M0: One ratio | ω = 0.10915 | -9498.29415 | | |
| | M1: Nearly neutral | ω0 = 0.0438, f0 = 0.8756, ω1 = 1.0, f1 = 0.1244 | -9291.79433 | 1.0 | 1.0 |
| | M2: Positive selection | ω0 = 0.04384, f0 = 0.8756, ω1 = 1.0, f1 = 0.1244, ω2 = 14.88, f2 = 0 | -9291.79433 | | |
| | M7 | p = 0.1559, q = 1.02 | -9260.06512 | 0.03 | 0.04 |
| | M8 | p0 = 0.944, p = 0.219, q = 2.445, (p1 = 0.056) ω = 1.0 | -9256.61184 | | |
| | M8a | p0 = 0.92897, p = 0.1985, q = 2.161, (p1 = 0.07103) ω = 1.0 | -9257.53496 | 0.17 | 0.21 |
| Mammals | M0: One ratio | ω = 0.10830 | -11628.04642 | | |
| | M1: Nearly neutral | ω0 = 0.0458, f0 = 0.882, ω1 = 1.0, f1 = 0.1179 | -11355.56887 | 0.99 | 1.0 |
| | M2: Positive selection | ω0 = 0.0458, f0 = 0.882, ω1 = 1.0, f1 = 0.1179, ω2 = 25.96, f2 = 0 | -11355.56893 | | |
| | M7 | p = 0.16574, q = 1.0699 | -11307.34092 | 0.0002 | 0.0004 |
| | M8 | p0 = 0.9559, p = 0.2321, q = 2.4611, (p1 = 0.04409) ω = 1.1129 (196S, 198T, 209R, 335V, 437E, 438R NEB; 209R, 335V, 437E, 438R BEB) | -11298.89537 | | |
| | M8a | p0 = 0.9366, p = 0.221, q = 2.311, (p1 = 0.0634) ω = 1.0 | -11301.04443 | 0.04 | 0.08 |


Table 3.23: Branch-site analysis of MFSD2A.

| Branch | ω2 | LRT | P value | q value | Positively selected sites |
|---|---|---|---|---|---|
| Human | 1 | 0 | 1 | 0.98 | |
| Hominini | 3.195 | 0.00003 | 1 | 0.98 | |
| Homininae | 1 | 0.0648 | 0.79 | 0.98 | |
| Hominidae | 1 | 0.0528 | 0.82 | 0.98 | |
| Catarrhini | 11.65 | 0.2656 | 0.61 | 0.98 | |
| Simians | 22.54 | 2.1289 | 0.14 | 0.98 | |
| Primates | 1 | 0.00002 | 1.0 | 0.98 | |

3.9.4 Divergent selective constraint across the MFSD2A mammalian phylogeny

No significant signal of positive selection was found in MFSD2A protein coding sequences by the codon substitution site models or the branch-site model (Table 3.22 and Table 3.23). However, positive selection is not necessary for functional divergence of a protein coding gene between groups of orthologous sequences in a phylogeny. Clade model C (CmC) was used to detect such complex forms of divergent selective pressure between partitions of the MFSD2A mammalian phylogeny (Table 3.24). The significance of divergent selective constraint between partitions was determined by comparing the null model M2a_rel against the alternative CmC-primate, simians, catarrhini, greatapes and hominini models. The parameter estimates and p value indicated divergent selective constraint between simians (ω = 0.38) and nonsimian placental mammals (ω = 0.23) (Table 3.24). However, after false positives were controlled by converting p values to false discovery rate q values, none of the analyzed partitions of the MFSD2A mammalian phylogeny showed significant divergent selective pressure. These observations suggest that the MFSD2A protein coding gene has retained a conserved function across eutherian evolution.

Table 3.24: Divergent selection constraint parameter estimation and likelihood scores for MFSD2A.

| Model & Partition | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value |
|---|---|---|---|---|
| CmC-Primate | p0 = 0.731, ω0 = 0.0152, p1 = 0.062, ω1 = 1.0, p2 = 0.207, ωnp = 0.257, ωp = 0.279 | -11299.7769 | 0.66 | 0.66 |
| CmC-simians | p0 = 0.719, ω0 = 0.014, p1 = 0.066, ω1 = 1.0, p2 = 0.21, ωns = 0.23, ωs = 0.38 | -11297.2803 | 0.02 | 0.30 |
| CmC-catarrhini | p0 = 0.725, ω0 = 0.015, p1 = 0.065, ω1 = 1.0, p2 = 0.21, ωnc = 0.25, ωc = 0.33 | -11299.5559 | 0.43 | 0.66 |
| CmC-greatapes | p0 = 0.73, ω0 = 0.015, p1 = 0.062, ω1 = 1.0, p2 = 0.21, ωng = 0.262, ωg = 0.289 | -11299.8560 | 0.85 | 0.66 |
| CmC-hominini | p0 = 0.73, ω0 = 0.015, p1 = 0.061, ω1 = 1.0, p2 = 0.21, ωnh = 0.266, ωh = 0.196 | -11299.8222 | 0.75 | 0.66 |
| M2a_rel | p0 = 0.732, ω0 = 0.0154, p1 = 0.061, ω1 = 1.0, p2 = 0.206, ω2 = 0.265 | -11299.8738 | NA | NA |

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.

3.10 Citron rho-interacting serine/threonine kinase (CIT)

3.10.1 Evolutionary history of CIT gene

A phylogenetic tree of the CIT gene was constructed from orthologous protein sequences of metazoan species (Figure 3.13). The phylogeny showed that CIT originated at the root of parahoxozoa (placozoa, cnidaria and bilateria) (Figure 3.13). It also revealed that a lineage-specific duplication occurred at the root of teleost fish approximately 310 million years ago (Figure 3.13). The Ensembl genome browser lists five paralogs of CIT (CDC42BPA, CDC42BPB, CDC42BPG, ROCK1 and ROCK2). The evolutionary relationship among these Ensembl-predicted paralogs was estimated and showed that CIT is only distantly related to all five, suggesting that these genes might not be true paralogs of CIT.

3.10.2 Molecular evolution of CIT across eutherian

A statistical approach was used to study the selective pressure acting on the CIT protein coding gene in 37 species of placental mammals, of which 15 are primates and 22 are nonprimate placental mammals. Six site models (M0, M1, M2, M7, M8, and M8a) were fitted by codon-based maximum likelihood separately to three data groups: primates (15 species), nonprimate placental mammals (22 species) and placental mammals (37 species) (Table 3.25). The simplest, one ratio model (M0) estimates a single average ω ratio over all sites of the protein coding gene. The selective pressure estimated under M0 revealed that strong negative selection dominated the evolution of the CIT protein coding gene in all three data groups (primates, nonprimate placental mammals and all placental mammals), with ω values of approximately 0.03 (Table 3.25). The other five site models were used to compute likelihood ratio tests for three model pairs, M1 (neutral model) against M2 (selection model), M7 (beta) against M8 (beta & ω) and M8a (beta & ω = 1) against M8 (beta & ω), in order to test for positive selection on the CIT protein coding gene in all three data groups of eutherian mammals (Table 3.25). Parameter estimates and LRTs indicated that positive selection was detected in primates by only one model pair, M7 vs. M8, and in nonprimate placental mammals by two model pairs, M7 vs. M8 and M8a vs. M8 (Table 3.25). In this study, signals of positive selection were accepted only if at least two of the three model pairs detected adaptive evolution. Under this criterion, positive selection is acting only in nonprimate placental mammals (Table 3.25). The Bayes empirical Bayes (BEB) method detected two sites with greater than 95% probability of being under positive selection in nonprimate placental mammals (Table 3.25).

Figure 3.13: Evolutionary history of human CIT gene.

The phylogenetic tree of CIT was reconstructed using the NJ method based on evolutionary distances computed by the JTT matrix-based method. All positions containing gaps or missing data were removed. Twenty-four amino acid sequences were used in this analysis.


Table 3.25: Parameter estimation and LRT for Mammals CIT.

| Data group | Model | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value |
|---|---|---|---|---|---|
| Primates | M0: One ratio | ω = 0.03313 | -14283.42317 | | |
| | M1: Nearly neutral | ω0 = 0.0165, f0 = 0.981, ω1 = 1.0, f1 = 0.0187 | -14240.91466 | 0.69 | 1.0 |
| | M2: Positive selection | ω0 = 0.0165, f0 = 0.981, ω1 = 1.0, f1 = 0.0187, ω2 = 21.34, f2 = 0 | -14240.91669 | | |
| | M7 | p = 0.0148, q = 0.310 | -14245.90441 | 0.005 | 0.008 |
| | M8 | p0 = 0.987, p = 1.923, q = 99.0, (p1 = 0.013) ω = 1.35 (99I, 253N, 255R, 304T, 308S, 625A, 1834S NEB; 625A BEB) | -14240.57215 | | |
| | M8a | p0 = 0.982, p = 1.77, q = 99.0, (p1 = 0.0185) ω = 1.0 | -14240.95137 | 0.38 | 0.38 |
| Nonprimate mammals | M0: One ratio | ω = 0.03576 | -29309.608609 | | |
| | M1: Nearly neutral | ω0 = 0.1634, f0 = 0.96225, ω1 = 1.0, f1 = 0.03775 | -28964.379310 | 1.0 | 1.0 |
| | M2: Positive selection | ω0 = 0.1634, f0 = 0.96225, ω1 = 1.0, f1 = 0.3775, ω2 = 48.43, f2 = 0 | -28964.373910 | | |
| | M7 | p = 0.06866, q = 1.2419 | -28925.116597 | 4 × 10^-9 | 4 × 10^-8 |
| | M8 | p0 = 0.98525, p = 0.08625, q = 2.5095, (p1 = 0.01475) ω = 1.3 (12S, 227T BEB) | -28905.698238 | | |
| | M8a | p0 = 0.97683, p = 0.03155, q = 0.399, (p1 = 0.02317) ω = 1.0 | -28914.804177 | 2 × 10^-3 | 0.0002 |
| Mammals | M0: One ratio | ω = 0.03559 | -35783.34624 | | |
| | M1: Nearly neutral | ω0 = 0.0165, f0 = 0.9613, ω1 = 1.0, f1 = 0.03874 | -35349.637428 | 1.0 | 1.0 |
| | M2: Positive selection | ω0 = 0.0165, f0 = 0.9613, ω1 = 1.0, f1 = 0.03875, ω2 = 136.65, f2 = 0 | -35349.637429 | | |
| | M7 | p = 0.07159, q = 1.2654 | -35273.812696 | 5 × 10^-4 | 1 × 10^-4 |
| | M8 | p0 = 0.97858, p = 0.03136, q = 0.38527, (p1 = 0.02142) ω = 1.0 | -35261.612543 | | |
| | M8a | p0 = 0.97858, p = 0.02221, q = 0.2331, (p1 = 0.02142) ω = 1.0 | -35261.613592 | 0.96 | 0.80 |

3.10.3 Molecular evolution of CIT protein coding gene by branch-site model

The site models above cannot identify positive selection that acts over a short period of time and affects only a fraction of codons. The branch-site model can detect lineage-specific changes in selective pressure at specific codons. To detect signatures of episodic positive selection, the branch-site model was applied to specific evolutionary stages of the CIT mammalian phylogeny, from the primate ancestral lineage to the modern human terminal branch (Table 3.26). The significance of positive selection was determined by likelihood ratio tests (LRTs) of a null model (identical to the branch-site model except that ω2 is fixed to one in the predefined lineage of interest) against the branch-site model, using the log likelihood score of each model (Table 3.26). Parameter estimates and LRTs revealed that only the simian ancestral branch evolved adaptively (Table 3.26). The Bayes empirical Bayes (BEB) method implemented in the branch-site model pinpointed one positively selected codon site with greater than 99% probability (Table 3.26). However, once false positives were controlled by converting p values to q values, no analyzed branch remained under significant positive selection (Table 3.26).
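The step from a branch-site LRT statistic to its p value can be sketched as follows. This is an illustrative calculation under stated assumptions, not the thesis's own code: it assumes that the LRT column of Table 3.26 holds the statistic 2ΔlnL and that the reference distribution is a chi-square with one degree of freedom; the function name chi2_sf_df1 is hypothetical.

```python
import math

def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

# LRT statistics for selected CIT branches, copied from Table 3.26
lrt_stats = {
    "Human": 0.0,
    "Hominini": 0.0003,
    "Simians": 8.3012,
    "Haplorhini": 0.0056,
}

branch_p = {branch: chi2_sf_df1(stat) for branch, stat in lrt_stats.items()}

for branch, p_value in branch_p.items():
    print(f"{branch}: LRT = {lrt_stats[branch]}, p = {p_value:.4f}")
```

Under these assumptions the simian branch gives a p value of about 0.004 and the haplorhini branch about 0.94, consistent with the P value column of Table 3.26.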

Table 3.26: Branch-site analysis of CIT.

| Branch | ω2 | LRT | P value | q value | Positively selected sites |
|---|---|---|---|---|---|
| Human | 1 | 0 | 1 | 0.98 | |
| Hominini | 1 | 0.0003 | 0.98 | 0.98 | |
| Homininae | 1 | 0.000006 | 1 | 0.98 | |
| Hominidae | 1 | 0.000002 | 1 | 0.98 | |
| Catarrhini | 1 | 0.000004 | 1 | 0.98 | |
| Simians | 999 | 8.3012 | 0.004 | 0.11 | 1897A** |
| Haplorhini | 1.06 | 0.0056 | 0.94 | 0.98 | |
| Primates | 1 | 0.0005 | 0.98 | 0.98 | |

3.10.4 Divergent selective pressure across CIT mammalian phylogeny

To test for divergence in selective pressure between different partitions of the CIT mammalian phylogeny, clade model C (CmC) was applied (Table 3.27). CmC accommodates both heterogeneity and divergence in selective pressure among sites. The significance of divergence in selective pressure between partitions of the CIT mammalian phylogeny was tested by likelihood ratio tests (LRTs) of the null model M2a_rel against CmC-primates, simians, catarrhini, greatapes, and hominini, using the log likelihood score of each model (Table 3.27). Parameter estimates and LRTs revealed no pattern of divergent selective constraint in any partition of the CIT mammalian phylogeny (Table 3.27). This suggests that the CIT protein coding gene performs a similar function throughout eutherian mammals.


Table 3.27: Divergent selection constraint parameter estimation and likelihood scores for CIT.

| Model & Partition | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value |
|---|---|---|---|---|
| CmC-Primate | p0 = 0.89, ω0 = 0.0058, p1 = 0.014, ω1 = 1.0, p2 = 0.095, ωnp = 0.24, ωp = 0.19 | -35246.2438 | 0.18 | 0.57 |
| CmC-simians | p0 = 0.095, ω0 = 0.226, p1 = 0.015, ω1 = 1.0, p2 = 0.89, ωns = 0.0059, ωs = 0.0043 | -35246.9740 | 0.57 | 0.66 |
| CmC-catarrhini | p0 = 0.094, ω0 = 0.227, p1 = 0.015, ω1 = 1.0, p2 = 0.89, ωnc = 0.0057, ωc = 0.0088 | -35246.8939 | 0.49 | 0.66 |
| CmC-greatapes | p0 = 0.89, ω0 = 0.0058, p1 = 0.015, ω1 = 1.0, p2 = 0.094, ωng = 0.2279, ωg = 0.2074 | -35247.1167 | 0.86 | 0.66 |
| CmC-hominini | p0 = 0.89, ω0 = 0.0057, p1 = 0.015, ω1 = 1.0, p2 = 0.095, ωnh = 0.224, ωh = 0.379 | -35246.9556 | 0.55 | 0.66 |
| M2a_rel | p0 = 0.89, ω0 = 0.0058, p1 = 0.015, ω1 = 1.0, p2 = 0.095, ω2 = 0.227 | -35247.1321 | NA | NA |

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.

3.11 Kinesin Family Member 14 (KIF14)

3.11.1 Evolutionary history of KIF14

No putative paralog of KIF14 was detected by similarity search based approaches or in any public database or genome browser. The KIF14 phylogeny was reconstructed with the neighbor joining (NJ) method using orthologous sequences from representative species of the kingdom Animalia (Figure 3.14). The phylogenetic tree revealed that the KIF14 gene originated during the early evolutionary history of metazoans (Figure 3.14). The bidirectional BLAST best hit approach failed to detect any ortholog of KIF14 in Petromyzon marinus, Drosophila melanogaster, Caenorhabditis elegans, and Trichoplax adhaerens.
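The bidirectional best hit criterion mentioned above can be illustrated with a short sketch. It assumes that similarity search results have already been parsed into per-query lists of (subject, bit score) pairs. The sequence identifiers and scores below are invented toy values, not results from the thesis, and the function names are hypothetical.

```python
def best_hits(hits):
    """Map each query ID to the subject ID with the highest bit score."""
    return {
        query: max(subjects, key=lambda pair: pair[1])[0]
        for query, subjects in hits.items()
        if subjects
    }

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return (a, b) pairs that are each other's best hit in both directions."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)

# Toy hit tables (hypothetical IDs and scores): {query: [(subject, bit_score), ...]}
human_vs_mouse = {
    "KIF14_human": [("KIF14_mouse", 2100.0), ("KIF13A_mouse", 410.0)],
    "KIF13A_human": [("KIF13A_mouse", 1900.0)],
}
mouse_vs_human = {
    "KIF14_mouse": [("KIF14_human", 2095.0), ("KIF13A_human", 400.0)],
    "KIF13A_mouse": [("KIF13A_human", 1890.0), ("KIF14_human", 405.0)],
}
# A genome in which the best forward hit is not reciprocated: no ortholog is called.
human_vs_other = {"KIF14_human": [("KLP_other", 300.0)]}
other_vs_human = {"KLP_other": [("KIF13A_human", 500.0)]}

print(reciprocal_best_hits(human_vs_mouse, mouse_vs_human))
print(reciprocal_best_hits(human_vs_other, other_vs_human))
```

The second call returns an empty list, which corresponds to the situation reported for the four species in which no KIF14 ortholog was detected.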

3.11.2 Pervasive adaptive evolution in KIF14 across eutherian mammals

We first checked whether the imprint of pervasive positive selection is present on the KIF14 protein coding gene by fitting codon substitution site models (M0, M1, M2, M7, M8, and M8a) separately to three data groups, i.e., primates, nonprimate placental mammals, and placental mammals (Table 3.28). The selective pressure estimated from the simplest one ratio model revealed that negative selection dominated the evolution of KIF14 overall, with ω values ranging from 0.275 to 0.335 (Table 3.28). The one ratio model estimates the average ω over all sites in the protein coding gene and is therefore unable to identify positive selection on individual codon sites. Three pairs of site models (M1 vs. M2, M7 vs. M8, and M8a vs. M8) were used to detect positive selection acting on individual codons. For this purpose, likelihood ratio tests (LRTs) were calculated for the three model pairs from the log likelihood scores estimated for the five site models (M1, M2, M7, M8, and M8a). The parameter estimates and LRT values revealed evidence for positive selection across primates with all three model pairs (M1 vs. M2, M7 vs. M8, and M8a vs. M8) and across placental mammals with two model pairs (M7 vs. M8 and M8a vs. M8) (Table 3.28). The naïve empirical Bayes (NEB) method implemented in the M2 selection model identified one site with posterior probability > 95% in primates, while the M8 site model identified two sites likely to have evolved under positive selection in primates with both the NEB and Bayes empirical Bayes (BEB) methods (Table 3.28). In placental mammals, three positively selected sites were pinpointed by NEB and one by BEB (Table 3.28).

Figure 3.14: Evolutionary history of human KIF14 gene.

The phylogenetic tree of KIF14 was reconstructed using the NJ method based on evolutionary distances computed by the JTT matrix based method. All positions containing gaps and missing data were eliminated. The numbers at each branch represent bootstrap scores.

Table 3.28: Parameter estimation and LRT for Mammals KIF14.

Data group | Model | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value
Primates | M0: One ratio | ω = 0.33576 | -14591.300046 | |
Primates | M1: Nearly neutral | ω0 = 0.1172, f0 = 0.7345, ω1 = 1.0, f1 = 0.2655 | -14497.808970 | 0.002 | 0.0
Primates | M2: Positive selection | ω0 = 0.1258, f0 = 0.739, ω1 = 1.0, f1 = 0.253, ω2 = 5.59, f2 = 0.0074 (855H NEB) | -14491.077188 | |
Primates | M7 | p = 0.256, q = 0.4773 | -14502.410656 | 3×10^-3 | 0.0006
Primates | M8 | p0 = 0.9868, p = 0.3511, q = 0.715, (p1 = 0.0132) ω = 4.35 (855H, 1477R NEB, 855H, 1477R BEB) | -14492.030684 | |
Primates | M8a | p0 = 0.7359, p = 13.408, q = 99.0, (p1 = 0.264) ω = 1.0 | -14497.848037 | 0.0006 | 0.004
Nonprimate mammals | M0: One ratio | ω = 0.2751 | -24123.222491 | |
Nonprimate mammals | M1: Nearly neutral | ω0 = 0.1393, f0 = 0.72708, ω1 = 1.0, f1 = 0.27292 | -23863.957733 | 1.0 | 1.0
Nonprimate mammals | M2: Positive selection | ω0 = 0.13933, f0 = 0.72708, ω1 = 1.0, f1 = 0.1965, ω2 = 1.0, f2 = 0.0764 | -23863.957733 | |
Nonprimate mammals | M7 | p = 0.512, q = 1.1425 | -23817.981687 | 0.01 | 0.02
Nonprimate mammals | M8 | p0 = 0.98039, p = 0.57517, q = 1.4066, (p1 = 0.01961) ω = 1.756 (239T BEB) | -23813.387527 | |
Nonprimate mammals | M8a | p0 = 0.90368, p = 0.62754, q = 1.926, (p1 = 0.09632) ω = 1.0 | -23815.732058 | 0.03 | 0.07
Mammals | M0: One ratio | ω = 0.28548 | -31622.75422 | |
Mammals | M1: Nearly neutral | ω0 = 0.13362, f0 = 0.70483, ω1 = 1.0, f1 = 0.29517 | -31137.117565 | 1.0 | 1.0
Mammals | M2: Positive selection | ω0 = 0.13362, f0 = 0.70483, ω1 = 1.0, f1 = 0.25082, ω2 = 1.0, f2 = 0.04435 | -31137.117565 | |
Mammals | M7 | p = 0.47224, q = 1.0128 | -31066.759798 | 1×10^-6 | 5×10^-6
Mammals | M8 | p0 = 0.9666, p = 0.5499, q = 1.358, (p1 = 0.03334) ω = 1.6429 (232T, 1414S, 1436S NEB, 1414S BEB) | -31053.422533 | |
Mammals | M8a | p0 = 0.86864, p = 0.63251, q = 2.1528, (p1 = 0.13136) ω = 1.0 | -31058.603827 | 0.001 | 0.005

3.11.3 Episodic positive selection across KIF14 mammalian phylogeny

The transient or episodic imprint of positive selection on the KIF14 protein coding gene, which affects only a subset of lineages and a fraction of sites, cannot be identified by codon substitution site models (M2 and M8). Episodic positive selection at various evolutionary stages of the KIF14 mammalian phylogeny was therefore assessed with the branch-site model (Table 3.29). The significance of the transient imprint of adaptive evolution was determined by likelihood ratio tests (LRTs) of the null model (identical to the branch-site model except that ω2 is fixed to one for the predefined lineage of interest) against the branch-site model, using the log likelihood score of each model (Table 3.29). False positive results from the branch-site model were controlled by applying a false discovery rate (q value) correction to the p values. The ω2 and LRT values for the predefined foreground branches revealed that only the homininae ancestral branch evolved significantly under positive selection, with an ω value of 123.22 (Table 3.29). The Bayes empirical Bayes (BEB) method identified one codon site with posterior probability > 95% of evolving under positive selection in the homininae ancestral branch (Table 3.29).

Table 3.29: Branch-site analysis of KIF14.

Branch | ω2 | LRT | P value | q value | Positive selected sites
Human | 1 | 0 | 1 | 0.98 |
Hominini | 1 | 0 | 1 | 0.98 |
Homininae | 123.22 | 11.0705 | 0.0009 | 0.04 | 619M*
Hominidae | 1 | 0 | 1 | 0.98 |
Hominoidea | 1 | 0 | 1 | 0.98 |
Catarrhini | 3.34 | 0.4272 | 0.51 | 0.98 |
Simians | 1 | 0.000002 | 1 | 0.98 |
Haplorhini | 1 | 0 | 1 | 0.98 |
Primates | 1 | 0.0679 | 0.79 | 0.98 |
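The q value correction applied to the branch-site p values can be sketched as below. The thesis does not state which false discovery rate procedure was used or which set of tests was pooled, so this sketch assumes the Benjamini-Hochberg step-up adjustment over the nine KIF14 branch tests only. Its output is therefore not expected to reproduce the q value column of Table 3.29. The function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# P value column of Table 3.29, in table order.
branches = ["Human", "Hominini", "Homininae", "Hominidae", "Hominoidea",
            "Catarrhini", "Simians", "Haplorhini", "Primates"]
p_branch = [1, 1, 0.0009, 1, 1, 0.51, 1, 1, 0.79]

q_branch = benjamini_hochberg(p_branch)
for name, p, q in zip(branches, p_branch, q_branch):
    print(f"{name}: p = {p}, BH q = {q:.4f}")
```

With only these nine tests, the homininae branch remains the single branch below a 0.05 threshold, in line with the conclusion drawn in the text.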

3.11.4 Site-specific functional divergence among the partitions of KIF14 mammalian phylogeny

Changes in site specific divergent selective pressure between the clades of protein coding genes contribute to adaptive phenotypic diversity. To check for site specific divergence in selective constraint between the different partitions of the KIF14 mammalian phylogeny, clade model C (CmC) was applied using a codon based maximum likelihood approach (Table 3.30). Likelihood ratio tests (LRTs) of the null model M2a_rel against CmC for primates, simians, catarrhini, great apes, and hominini were conducted to check the significance of divergence in selective pressure between them (Table 3.30). Parameter estimates and LRTs indicated that KIF14 evolved under divergent selective pressure in hominini (ω = 0.25) relative to nonhominini placental mammals (ω = 0.042) (p = 0.03; Table 3.30). However, after q value correction of the p values to control for false positives, the divergence in selective constraint between hominini and nonhominini eutherians was not significant (q = 0.30; Table 3.30). This suggests that the KIF14 protein coding gene has a conserved function throughout the evolution of eutherian mammals.

Table 3.30: Divergent selection constraint parameters estimation and likelihood scores for KIF14.

Model & Partition | Parameter estimation (ω) | Log likelihood value (lnL) | P value | q value
CmC-Primate | p0 = 0.43, ω0 = 0.043, p1 = 0.15, ω1 = 1.0, p2 = 0.42, ωnp = 0.36, ωp = 0.39 | -31056.0434 | 0.58 | 0.66
CmC-simians | p0 = 0.43, ω0 = 0.043, p1 = 0.14, ω1 = 1.0, p2 = 0.42, ωns = 0.36, ωs = 0.42 | -31055.4805 | 0.23 | 0.57
CmC-catarrhini | p0 = 0.43, ω0 = 0.042, p1 = 0.15, ω1 = 1.0, p2 = 0.42, ωnc = 0.36, ωc = 0.41 | -31055.9687 | 0.50 | 0.66
CmC-greatapes | p0 = 0.42, ω0 = 0.37, p1 = 0.15, ω1 = 1.0, p2 = 0.43, ωng = 0.042, ωg = 0.13 | -31054.6979 | 0.083 | 0.42
CmC-hominini | p0 = 0.42, ω0 = 0.37, p1 = 0.15, ω1 = 1.0, p2 = 0.43, ωnh = 0.042, ωh = 0.25 | -31053.9072 | 0.03 | 0.30
M2a_rel | p0 = 0.42, ω0 = 0.37, p1 = 0.15, ω1 = 1.0, p2 = 0.43, ω2 = 0.043 | -31056.1959 | NA | NA

np: nonprimate eutherian, p: primates, ns: nonsimians eutherian, s: simians, nc: noncatarrhini, c: catarrhini, ng: nongreatapes eutherian, g: greatapes, nh: nonhominini eutherian, h: hominini.

3.12 Synuclein gene family

3.12.1 Evolutionary history of synuclein family

The evolutionary relationship between α synuclein and its putative paralogs, β synuclein and γ synuclein, was estimated with the maximum likelihood (ML) method using protein sequences from representative members of the subphylum Vertebrata: classes Mammalia, Aves, Reptilia, Amphibia, Osteichthyes, Chondrichthyes, and Agnatha (Figure 3.15). The molecular phylogenetic investigation suggests that the synuclein family diversified through two independent gene duplication events early in vertebrate history (Figure 3.15). γ synuclein was the first gene to diverge, at the root of vertebrates prior to the jawless-jawed vertebrate split (Figure 3.15). α and β synuclein originated through a subsequent duplication event that occurred after the jawless-jawed vertebrate split and prior to the cartilaginous-bony vertebrate divergence (Figure 3.15). Furthermore, the position of lizard in the β synuclein subfamily does not agree with the well-established vertebrate phylogeny (Figure 3.15). The position and branch length of lizard β synuclein indicate rapid sequence evolution of β synuclein in lizard compared with its other orthologs (Figure 3.15). The phylogeny further revealed a species specific duplication in lamprey, which has two copies of γ synuclein (Figure 3.15). The bidirectional BLAST/BLAT best hit strategy was unable to detect any ortholog of the synuclein family in any invertebrate metazoan phylum (Figure 3.15). The vertebrate specific origin of the synucleins and their localization at presynaptic terminals suggest that the synuclein family might have contributed to the differences in synaptic complexity between invertebrates and vertebrates.

Figure 3.15: Evolutionary history of synuclein family.

The evolutionary history of the synuclein gene family was inferred through the maximum likelihood (ML) method. The numbers on the nodes are bootstrap values; only values ≥50% are displayed. Two gene duplications in the early vertebrate lineage diversified this family into three members: α, β, and γ synuclein. The first duplication occurred before the divergence of lamprey from other vertebrates, while the second duplication occurred after the lamprey split from other vertebrates.

3.12.2 Sequence evolution and Coevolutionary relationship

Vertebrate specific gene innovations and duplications facilitate the acquisition of unique biological functions and are hence considered a major driving force behind synaptic evolution (Bayés et al., 2017). Gene duplications provide a substantial raw substrate from which new gene functions may evolve by mutation. Protein sequence alignment of the human synuclein paralogs reveals multiple regions of conservation and divergence, with high sequence identity in the amino terminal domain but low sequence identity (α/γ) in the carboxyl terminal domain. Both α and β synuclein contain two proline rich Ca2+ binding motifs within the carboxyl terminal domain, while γ synuclein lacks these subfamily specific motifs (Figure 3.16) (M. S. Nielsen, Vorum, Lindersson, & Jensen, 2001). Within the NAC domain, the most striking difference among paralogs is the deletion of eleven amino acid residues in β synuclein, a region responsible for the amyloidogenic character of α synuclein (Figure 3.16). As mentioned above, the amino terminal region is highly conserved among paralogs. This finding was further reinforced by the physical positions of six previously reported human specific mutations linked with hereditary Parkinson's disease, i.e., A30P, E46K, H50Q, G51D, A53E, and A53T, on human α synuclein (Figure 3.16) (Kruger et al., 1998; Lesage et al., 2013; Polymeropoulos et al., 1997; Silke et al., 2013; Zarranz et al., 2004). The results revealed that these mutations are confined to the amino terminal domain, which in turn implies that the high conservation of this region is significant not only from a functional perspective but also for the pathogenesis of familial Parkinson's disease (Figure 3.16). SLAC-window analysis indicated that the amino terminal and NAC domains of α synuclein contain 25 negatively constrained sites, which further suggests that strong purifying selection has preserved this region during vertebrate evolution (Table 3.31).

These sequence differences must necessarily underlie the functional and pathogenic differences among paralogs and orthologs. However, the mechanism by which sequence alterations lead to differential phenotypes among paralogs remains unclear. Intriguingly, H50Q in α synuclein is a PD hotspot (PD is caused when histidine is mutated to glutamine). In contrast, the wild type β and γ synucleins contain glutamine at this position (Figure 3.16). This differential phenotypic impact among the synuclein paralogs is likely to have arisen either independently, through the effects of protective alleles, or through coevolution with other proteins.

The mutual information (MI) score can be used to predict coevolutionary relationships between amino acid residues in a protein family or subfamily (Teppa, Wilkins, Nielsen, & Buslje, 2012). Two or more residues are suggested to have coevolved if they have high MI signals. Such residues are likely to carry biological information related to protein structure and function (Teppa, et al., 2012). The extent of the coevolutionary relationship between residues within α, β, and γ synuclein was inferred with the MISTIC server (Figure 3.17) (Simonetti, et al., 2013). Figure 3.17 shows that the most conserved positions in α and β synuclein are proline and glycine residues. In γ synuclein, glycine residues are the most conserved. Furthermore, mutual information is concentrated in the carboxyl terminal domain of α and γ synuclein: residues 101-114 (α synuclein) and 98-126 (γ synuclein) (Figure 3.17). Within these regions, a large number of MI connections (red lines represent MI values in the top 5%) with high values of proximity mutual information (pMI) and cumulative mutual information (cMI) for individual residues were found (Figure 3.17). High MI values suggest a coevolutionary relationship between residues. In β synuclein, by contrast, a large number of MI connections with high cMI and pMI values was observed in three main regions of the amino terminal, NAC, and carboxyl terminal domains: residues 10-18, 47-94, and 98-111. Interestingly, a higher degree of coevolution was observed between the residues of β synuclein (especially NAC domain residues 63-94) than in α and γ synuclein (Figure 3.17). These observations suggest that the high degree of coevolution between the residues of β synuclein is likely to have arisen as a consequence of the deletion of eleven residues in the NAC domain, in order to maintain the structural stability and perhaps the function of β synuclein.
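The idea behind an MI score between two alignment positions can be illustrated with a minimal sketch. This computes plain Shannon mutual information between two alignment columns. It is not the MISTIC procedure, which applies further corrections and normalization to the raw score. The columns below are invented toy data and the function name is hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_i, col_j):
    """Plain mutual information (in bits) between two alignment columns.

    col_i and col_j hold one residue per sequence, in the same sequence order.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi

# Toy columns from a hypothetical four-sequence alignment.
covarying = mutual_information("AAGG", "LLVV")    # residues change together
independent = mutual_information("AAGG", "LVLV")  # no association
conserved = mutual_information("GGGG", "LLVV")    # invariant column carries no MI

print(covarying, independent, conserved)
```

A fully conserved column gives an MI of zero, which is why conservation and coevolution are reported as separate tracks in Figure 3.17.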

Figure 3.16: Sequence alignment of human synuclein paralogs.

Sequence comparison of the human synuclein paralogs revealed that eight substitutions have accumulated in the amino terminal and NAC domains of α synuclein, while four substitutions and an eleven residue deletion have occurred in the amino terminal and NAC domains of β synuclein after the second duplication. * marks the hotspots of neurodegenerative diseases. Paralog specific changes are color coded. ND: neurodegenerative disorders.

Color legend: α-syn specific; β-syn specific; Ca2+ binding motif; amino terminal domain; NAC domain; carboxyl terminal domain. *: mutations linked to ND.

A30P* E46K* H50Q*
Human_αsyn         MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVH
Human_βsyn         MDVFMKGLSMAKEGVVAAAEKTKQGVTEAAEKTKEGVLYVGSKTREGVVQ
Ancestral_α/βsyn   MDVFMKGLSMAKEGVVAAAEKTKQGVTEAAEKTKEGVLYVGSKTKEGVVQ
Human_γsyn         MDVFKKGFSIAKEGVVGAVEKTKQGVTEAAEKTKEGVMYVGAKTKENVVQ

G51D* V70M*
Human_αsyn         GVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVKKDQL
Human_βsyn         GVASVAEKTKEQASHLGGAVFS-----------GAGNIAAATGLVKREEF
Ancestral_α/βsyn   GVASVAEKTKEQASNVGGAVVSGVTAVAQKTVEGAGNIAAATGLVKKEEL
Human_γsyn         SVTSVAEKTKEQANAVSEAVVSSVNTVATKTVEEAENIAVTSGVVRKEDL

A53T/E* P123H*
Human_αsyn         GKN-----EEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDYEPEA
Human_βsyn         PTDLKPEEVAQEAAEEPLIEPLMEPEGESYEDPPQEEYQEYEPEA
Ancestral_α/βsyn   PKQ------EEEAAQEPLIEEMVEPEGESYEDPPQEEYQEYEPEA
Human_γsyn         RP----SAPQQ--------------EGEASKEKEEVAEEAQSGGD

Figure 3.17: Coevolutionary relationship within synuclein genes.

Circular representation of the coevolutionary relationships among the residues within each synuclein family member. The outer circle shows the one letter amino acid code of the human sequences. The colored boxes of the second circle depict the conservation score (from red, the highest conservation score, to cyan, the lowest). The third circle indicates the cumulative mutual information score, whereas the fourth circle shows the proximity mutual information score. Lines in the center of the circle depict connections between residues with a mutual information score greater than 6.5. Red lines represent MI values in the top 5%; black lines represent MI scores between the 95th and 70th percentiles, while gray lines indicate the remaining 70%.

Table 3.31: Sites under negative selection constraint in α synuclein among vertebrates alignment with SLAC analysis.

Index | Residue no | dN-dS | P-value
1 | 15 | -3 | 0.03
2 | 18 | -4 | 0.01
3 | 20 | -4.272426 | 0.01
4 | 23 | -7.122523 | 0.0009
5 | 30 | -3 | 0.03
6 | 37 | -5 | 0.004
7 | 39 | -4.83201 | 0.02
8 | 41 | -3 | 0.03
9 | 47 | -4 | 0.01
10 | 49 | -4 | 0.01
11 | 50 | -4.177328 | 0.05
12 | 52 | -3 | 0.03
13 | 62 | -4.282012 | 0.01
14 | 65 | -4.83201 | 0.01
15 | 67 | -5 | 0.004
16 | 69 | -6 | 0.001
17 | 72 | -4 | 0.01
18 | 73 | -5 | 0.004
19 | 75 | -3 | 0.03
20 | 77 | -4 | 0.01
21 | 78 | -4 | 0.01
22 | 85 | -3 | 0.03
23 | 86 | -3 | 0.03
24 | 88 | -2.894755 | 0.05
25 | 98 | -7.248015 | 0.002

dN: non-synonymous substitutions per non-synonymous site, dS: synonymous substitutions per synonymous site. Only significant negatively selected sites (p value <= 0.05) are presented here.

3.12.3 Structural evolution of α synuclein

To further inspect how sequence differences affect structure, a comparative structural study was conducted. The NMR structure of human α synuclein was retrieved from the PDB (1XQ8) and used as a reference to model the structures of the paralogous proteins and the ancestral orthologous proteins of α synuclein by homology modelling (Figure 3.18, 3.19). RMSD values were used to study the structural deviations (Figure 3.18, 3.19). The results revealed that the β and γ synuclein structures are highly diverged from α synuclein at the amino terminal and NAC domains (Figure 3.18). Comparative structural analysis of the ancestral orthologs suggests that the structure of α synuclein passed through a series of transitions to acquire its favored conformation (Figure 3.19). Superimposition of the ancestral α synuclein models on 1XQ8 revealed a common deviated region spanning residues 32 to 58 of the amino terminal lipid binding domain, suggesting that this region has evolved continuously at the structural level during vertebrate evolution despite its high sequence conservation. Intriguingly, the mutations involved in Parkinson's disease pathogenesis are situated in or immediately adjacent to this continuously evolving region (residues 32-58) of α synuclein, which indicates that any alteration in this region will be deleterious because of the strong selection and functional constraints imposed on it. Superimposition of the mutant models on 1XQ8 identified major shifts in the lipid binding domain for A30P and H50Q, whereas major changes were observed in both the lipid binding and NAC domains for E46K and A53T. G51D alone showed alteration only in the NAC region (Figure 3.20). All five mutant models shared a highly deviated region from residue 32 to 58. This comparative structural analysis suggests that the primary effects and roles of these five α synuclein mutations in Parkinson's disease pathogenesis may differ because of their differential structural morphologies.
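The RMSD measure used for these structural comparisons can be sketched as follows. The sketch assumes that the two structures have already been superimposed and that equivalent atoms (for example Cα atoms) are listed in the same order. The thesis does not state which software performed the superposition, and the coordinates below are invented toy values.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of
    (x, y, z) coordinates that are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy coordinates in angstroms (hypothetical): the model is shifted 1 Å along z.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(rmsd(reference, model))
print(rmsd(reference, reference))
```

Restricting the coordinate lists to a window of residues, such as positions 32 to 58, would give a regional deviation of the kind discussed above.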

Figure 3.18: Structural deviation among synuclein paralogs.

Major structural shifts were observed in the amino terminal lipid binding and NAC domains due to paralog specific substitutions. Residues that deviate from human α synuclein (1XQ8) are color coded. Structural deviations were evaluated by RMSD values. SNCA: α synuclein, SNCB: β synuclein, SNCG: γ synuclein.

Figure 3.19: Structural evolution of α synuclein protein since the split from the last common sarcopterygian ancestor.

Significant structural divergence towards human α synuclein was observed after the split from the common sarcopterygian ancestor. Lineage specific substitutions are written above the arrows. Residues that deviate from human α synuclein (1XQ8) in terms of backbone torsion angles (Φ°, Ψ°) are shown in red. Structural deviations were examined by RMSD values.

Figure 3.20: Structural analysis of mutant models of human α synuclein.

Human specific mutations involved in familial Parkinson's disease (FPD) are color coded in red. The NMR structure of α synuclein was obtained from the PDB (1XQ8). The overall quality factor is expressed as the percentage of the protein for which the calculated error value falls below the 95% rejection limit, as calculated by Errat.

3.12.4 Divergent selective constraint among synuclein genes

Sequence and structure variations, followed by functional changes among synuclein paralogs, might have contributed to the evolution of physiological pathways and lineage specific phenotypic traits (Kaessmann, 2010).

Two patterns of amino acid sequence variation are considered evidence of protein functional divergence (Gu & Vander Velden, 2002). Type 1 functional divergence refers to an evolutionary process in which functional constraints are altered among duplicated genes (Gu, 2001). Type 2 functional divergence refers to an evolutionary process in which functional constraints do not change among duplicated genes, but radical changes occur between them after gene duplication (Gu, 2001). Several statistical methods have been developed to detect differences in site specific selective constraint among duplicated genes (Bielawski & Yang, 2004; Gu & Vander Velden, 2002). The extent of functional divergence among synucleins was assessed with clade model D, which assumes two or three site classes (k = 2 and k = 3) and allows a proportion of sites to evolve under divergent selection pressures in two or more clades (Table 3.32 and Table 3.33) (Bielawski & Yang, 2004; Z. Yang, 2007). These sites may take any ω value, so that selection pressure can differ between clades. Clade model D was compared with the null model, the discrete model (M3 with k = 2 and k = 3). The discrete model together with clade model D indicates significant divergent selective pressure and heterogeneity among sites (Table 3.32 and Table 3.33). The clade model with three site classes (k = 3) proposed that 9%, 15%, and 8% of sites evolve under divergent selective pressure in the α, β, and γ clade analyses, respectively, with strong purifying selection in clades α and β (ωα = 0.187, ωβ = 0.177) and positive selection in clade γ (ωγ = 1.90) (Table 3.32 and Table 3.33). Similarly, the clade model with two site classes (k = 2) suggests divergent selective pressure with ω = 0.206 for the α clade, ω = 0.16 for the β clade, and ω = 0.746 for the γ clade (Table 3.32). Furthermore, eight sites were found to evolve under divergent selective pressure in synuclein paralogs with a posterior probability >= 95% (42S, 72T, 102K, 112I, 113L, 125Y, 128P, and 130E; human α synuclein is used as the reference for position number and residue). These sites are distributed non-randomly within the synuclein domains: six of the eight are located in the carboxyl terminal domain, which is responsible for chaperone like activity.

Type 1 functional divergence (site specific evolutionary rate shift) was also observed among α, β, and γ synuclein using the DIVERGE package (Table 3.34) (Gu & Vander Velden, 2002). The phylogenetic tree, together with the clade models and DIVERGE, suggests that γ synuclein is the most ancient and most variable paralog in the synuclein family.

Table 3.32: Parameter estimation and likelihood score for synuclein family to detect functional divergence.

Model | Parameter estimation (ω) | Log likelihood value
M0: One ratio | ω = 0.079 | -3228.9759
Site-specific models | |
M3: Discrete (k = 2) | ω0 = 0.027, f0 = 0.75, ω1 = 0.309, f1 = 0.246 | -3151.7289
M3: Discrete (k = 3) | ω0 = 0.0226, f0 = 0.69, ω1 = 0.168, f1 = 0.192, ω2 = 0.472, f2 = 0.113 | -3148.9903
Branch-site models | |
Model D (k = 2) | |
sncα | ω0 = 0.028, f0 = 0.756, ω1sncα = 0.206, ω1sncβγ = 0.370, f1 = 0.243 | -3149.7281
sncβ | ω0 = 0.029, f0 = 0.763, ω1sncβ = 0.160, ω1sncαγ = 0.417, f1 = 0.237 | -3146.6509
sncγ | ω0 = 0.032, f0 = 0.778, ω1sncγ = 0.746, ω1sncαβ = 0.191, f1 = 0.222 | -3140.1041
Model D (k = 3) | |
sncα | ω0 = 0.0235, f0 = 0.70, ω1 = 0.1859, f1 = 0.21, ω2sncα = 0.187, ω2sncβγ = 0.71, f2 = 0.094 | -3144.1261
sncβ | ω0 = 0.0225, f0 = 0.67, ω1 = 0.136, f1 = 0.17, ω2sncβ = 0.177, ω2sncαγ = 0.542, f2 = 0.158 | -3144.0289
sncγ | ω0 = 0.0251, f0 = 0.716, ω1 = 0.226, f1 = 0.20, ω2sncγ = 1.90, ω2sncαβ = 0.129, f2 = 0.081 | -3130.30517

f: proportion of sites, k: site categories, ω values in bold show positive selection.

Table 3.33: Statistical significance of functional divergence among synuclein family.

Test | 2δ | df | p value
LRT for sites model | | |
M0 / M3 (K = 2) | 154.5 | 2 | 0
M0 / M3 (K = 3) | 159.9 | 4 | 0
M3 (K = 2) / M3 (K = 3) | 5.48 | 2 | 0.06
Model D (k = 2) / Model D (k = 3) | | |
sncα | 11.204 | 2 | 0.003
sncβ | 5.24 | 2 | 0.07
sncγ | 19.59 | 2 | <0.0001
LRT for sites and branch model | | |
M3 (K = 2) / Model D (K = 2) | | |
sncα | 4.0016 | 2 | 0.1
sncβ | 10.156 | 2 | 0.006
sncγ | 23.25 | 2 | <0.0001
M3 (K = 3) / Model D (K = 3) | | |
sncα | 9.73 | 2 | 0.007
sncβ | 9.92 | 2 | 0.007
sncγ | 37 | 2 | <0.00001
M3 (K = 2) / Model D (K = 3) | | |
sncα | 15.21 | 4 | 0.004
sncβ | 15.4 | 4 | 0.0039
sncγ | 42.85 | 4 | <0.000001

df: degree of freedom, δ: difference between the log likelihood values of the two models, p value ≤ 0.05 shows significant divergence.
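The 2δ and p value columns of Table 3.33 follow from the log likelihood values in Table 3.32. As a rough check, the sketch below recomputes three of the site model comparisons, taking the degrees of freedom as listed in the table and using the closed-form chi-square tail that holds for even degrees of freedom. The function and variable names are hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square tail probability for an even number of degrees of freedom:
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Log likelihood values copied from Table 3.32.
LNL = {"M0": -3228.9759, "M3_k2": -3151.7289, "M3_k3": -3148.9903}

# (null model, alternative model, df as listed in Table 3.33)
COMPARISONS = [("M0", "M3_k2", 2), ("M0", "M3_k3", 4), ("M3_k2", "M3_k3", 2)]

lrt = {}
for null, alt, df in COMPARISONS:
    two_delta = 2.0 * (LNL[alt] - LNL[null])
    lrt[(null, alt)] = (two_delta, chi2_sf_even_df(two_delta, df))
    print(f"{null} / {alt}: 2*delta = {two_delta:.2f}, df = {df}, p = {lrt[(null, alt)][1]:.3g}")
```

The recomputed statistics agree with the tabulated 2δ values to within rounding. The same function applies to the clade model D comparisons if their log likelihood values and listed degrees of freedom are substituted.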

Table 3.34: Type 1 functional divergence of synuclein family.

Comparison | θ±SE | Z score | LRT | P value (Z-score)
SNCα/SNCβ | 0.807998±0.27 | 3.19 | 5.27127 | 0.001
SNCα/SNCγ | 0.6622±0.258 | 2.76 | 2.934 | 0.005
SNCβ/SNCγ | 0.994±0.22 | 5.1067 | 15.325 | <0.00001
SNCαβ/SNCγ | 0.66019±0.23 | 3.13 | 7.6 | 0.001

SE: standard error, θ: coefficient of functional divergence, LRT: likelihood ratio test.

Chapter 4 Discussion

Discussion

Over the last 60-70 million years, evolution has transformed most parts of the brain in both size and complexity, and the hominin neocortex has enlarged significantly in the short period since the divergence of the human lineage from the Pan lineage approximately 6-7 million years ago (McHenry, 1994). The overall expansion of the neocortex occurred before anatomically modern humans split from the archaic hominins, the Neandertals and Denisovans, approximately 550,000-750,000 years ago, as both modern humans and Neandertals had comparably large brains (Florio, Borrell, & Huttner, 2017; Prüfer, et al., 2014). However, evident differences in relative lobe size exist between anatomically modern humans and Neandertals: most prominently, the parieto-temporal lobes of the neocortex are enlarged and the orbitofrontal cortex is wider in modern humans than in Neandertals (Bastir, et al., 2011; Florio, et al., 2017). This indicates that certain neocortical regions evolved after the split between Neandertals and anatomically modern humans. The size of the neocortex is predominantly determined by the magnitude of neurogenesis and cytokinesis during fetal development. During neurogenesis, cortical neurons originate from progenitor cells in the ventricular zone of the developing brain. The progenitor cells undergo successive cycles of proliferative division before switching to neurogenic division and forming the subventricular zone (Bystron, Blakemore, & Rakic, 2008; Pasko Rakic, 1988, 1995; Stancik, Navarro-Quiroga, Sellke, & Haydar, 2010). The massive expansion of the neocortex during evolution has been explained most prominently by the radial unit hypothesis of cortical development. The radial unit hypothesis proposes that the rapid increase in neocortical surface area during evolution is due to a prolonged period of proliferative (symmetric) division, which yields an increased number of radial columnar units that ultimately generate neurons and consequently expand the neocortical surface area (P Rakic, 2000). An alternative hypothesis, the intermediate progenitor model, proposes that the expansion and folding of the neocortical surface during evolution resulted from an increase in the size of the basal progenitor (BP) pool (BPs originate from apical radial glia, the main neural progenitor cells in the ventricular zone) and the subsequent expansion of BPs in the subventricular zone, rather than from an increase in radial units in the ventricular zone (Kriegstein, Noctor, & Martínez-Cerdeño, 2006). More recently, Nonaka-Kinoshita et al. suggested another hypothesis for cortical expansion and folding, proposing that an increased abundance of basal radial glia (bRG, which exhibit stem cell properties) and outer


subventricular zone expansion are primarily responsible for neocortical surface area expansion and gyrencephaly (Nonaka-Kinoshita et al., 2013). During neurogenesis, however, BPs are better suited than apical radial glia to the quantitative expansion of neuron production, because they are not subject to the constraint that limited ventricular space imposes on apical radial glia proliferation. An increase in BP generation and in BP proliferation in the inner and outer subventricular zone is therefore a key determinant of the evolutionary expansion of the neocortex. Although the timing of brain development is conserved across mammals, species-specific differences in the duration of cortical neurogenesis (6 days in mice, 60 days in macaques, and 100 days in humans) most likely contributed to the differences in neocortex size and complexity along the lineage from the primate ancestor to modern humans (Finlay & Darlington, 1995; Geschwind & Rakic, 2013). The differences among species in BP pool size and in the timing of neocorticogenesis presumably have genetic underpinnings based on lineage-specific genomic changes, and these changes need to be deciphered by comparative genomic analysis.
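As a hedged illustration of the radial unit hypothesis described in this section: if every founder progenitor divides symmetrically in each round and no cells are lost (both idealizing assumptions; the symbols N_0 and k are introduced here for illustration and are not taken from the cited works), the number of founder cells, and hence of radial units, grows exponentially with the number of symmetric division rounds:

```latex
N_{\text{units}}(k) = N_0 \cdot 2^{k},
\qquad
\frac{N_{\text{units}}(k+1)}{N_{\text{units}}(k)} = 2
```

Under these assumptions a single additional round of symmetric division doubles the number of radial units, which is why even a modest prolongation of the proliferative phase could produce a large increase in neocortical surface area.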

The availability of whole genome sequences of many extinct and extant species, along with advances in bioinformatics, molecular biology and comparative genomics, has ushered in a new era in the study of human brain evolution (T. M. Preuss, 2012). Despite the rapid growth in our understanding of the evolution of the human genome, our knowledge of the relationship between genetic changes and phenotypic changes, particularly the expansion of brain size, remains limited (O'Bleness, Searles, Varki, Gagneux, & Sikela, 2012; T. M. Preuss, 2012). Three possible genetic mechanisms have been proposed to explain brain/neocortex size differences between humans and nonhuman primates. The first focuses on human-specific gene loss and duplication to explain the enlargement of the human brain during the Pliocene-Pleistocene epochs. The second proposes that human-specific changes in the regulatory regions of genes altered gene expression and ultimately brain size. The third proposes that human-specific accelerated sequence evolution in protein-coding genes involved in nervous system development contributed to the rapid expansion of the human brain over the last 5 million years.

The ARHGAP11B gene encodes a 267-amino-acid Rho GTPase-activating protein and arose by partial duplication of the ARHGAP11A gene after the divergence of the human lineage from the Pan lineage but prior to the modern human and archaic hominin (Neandertals and


Denisovans) split, within a time window of 5 million to 750,000 years ago (Florio et al., 2015). ARHGAP11A is found throughout the metazoans, whereas its truncated paralog is present only in hominins and lost its RhoGAP activity after the duplication but before the split between modern and archaic humans (Florio, et al., 2015). ARHGAP11B promoted BP generation from apical radial glia and BP proliferation in the subventricular zone, and also caused folding of the mouse neocortex, whereas ARHGAP11A had no effect on BPs (Florio, et al., 2015; Florio, Namba, Pääbo, Hiller, & Huttner, 2016). The hominin-specific gene ARHGAP11B has therefore been implicated in increased neural progenitor proliferation and in the evolutionary expansion of the neocortex in both modern humans and Neandertals (Florio, et al., 2015).

Figure 4.1: Human neocortical cell types.

Schematic depiction of the main neural progenitor cell types involved in neuron production in the fetal human neocortex at mid-neurogenesis. Adapted from Florio, et al. (2016).

Another human-specific duplicated gene, SRGAP2 (SLIT-ROBO Rho GTPase activating protein 2), has also been implicated in cortical development. SRGAP2 was duplicated twice in humans after the divergence from chimpanzee, producing SRGAP2B by partial duplication, and SRGAP2C


and SRGAP2D in a subsequent duplication event from SRGAP2B (Dennis et al., 2012). These two human-specific duplications occurred within a window of 3.5-1 million years ago. Expression of the human-specific SRGAP2C gene in the mouse brain induced human-specific features of neuronal development, including an increased density of longer spines and neoteny during spine maturation (Charrier et al., 2012). Taken together, these data suggest that SRGAP2C and ARHGAP11B, both of which arose by human-specific duplications, contributed to human brain development and to the evolutionary expansion of the human neocortex, ultimately yielding phenotypic differences between humans and nonhuman primates. Furthermore, human-specific substitutions in the noncoding human accelerated region 5 (HARE5) have been shown experimentally to increase the number of neural progenitor cells, thereby enhancing neocorticogenesis and producing a marked difference in mouse brain size (Boyd et al., 2015). This noncoding regulatory region serves as an enhancer of FZD8 and has accumulated sixteen human-specific substitutions since the divergence from the lineage leading to chimpanzee and bonobo approximately 6-7 million years ago. FZD8 was more abundant in the developing human neocortex than in that of macaque, suggesting that the human-specific substitutions are likely to have enhanced FZD8 expression in cortical areas of the neonatal human brain (Boyd, et al., 2015).

Human-specific evolutionary changes in protein-coding genes might also contribute to phenotypic differences between humans and nonhuman primates. A study of nervous system development genes and housekeeping genes revealed that protein-coding genes involved in nervous system development underwent accelerated evolution along the lineage from the primate ancestor to humans (Dorus et al., 2004). Furthermore, the ADCYAP1 (adenylate-cyclase-activating polypeptide 1) gene is highly conserved in primates and is involved in neural precursor amplification and in regulating the transition from the proliferative to the differentiated state during neurogenesis (Y. Wang et al., 2005). ADCYAP1 has been shown to exhibit a signature of positive selection in the human lineage after the divergence from our closest extant relative, the chimpanzee, and is likely to have contributed to evolutionary changes in neocorticogenesis and possibly to the expansion of the human neocortex (Y. Wang, et al., 2005). Positive selection is inferred when nonsynonymous substitutions accumulate faster than synonymous substitutions in a protein-coding gene.
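The criterion just stated is usually expressed as a ratio of nonsynonymous to synonymous change greater than 1. The sketch below is only a simplified counting illustration of that ratio and is not the codon substitution model analysis used in this study: it computes uncorrected proportions (pN and pS) for two aligned coding sequences, skips codons that differ at more than one position, and treats changes to stop codons as nonsynonymous. The function names `syn_sites` and `pn_ps` and the toy sequences are assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Count synonymous sites in a codon: each of the nine possible
    single-base changes that keeps the amino acid contributes 1/3."""
    total = 0.0
    for i in range(3):
        for b in BASES:
            if b != codon[i]:
                mutant = codon[:i] + b + codon[i + 1:]
                if CODE[mutant] == CODE[codon]:
                    total += 1 / 3
    return total


def pn_ps(seq1, seq2):
    """Return (pN, pS) for two aligned, ungapped, in-frame sequences
    written in uppercase A/C/G/T (simplified; see assumptions above)."""
    if not seq1 or len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("need two non-empty in-frame sequences of equal length")
    S = N = syn_diff = nonsyn_diff = 0.0
    for k in range(0, len(seq1), 3):
        c1, c2 = seq1[k:k + 3], seq2[k:k + 3]
        s = (syn_sites(c1) + syn_sites(c2)) / 2  # average over both codons
        S += s
        N += 3 - s
        # Only codons differing at exactly one position are classified.
        if sum(a != b for a, b in zip(c1, c2)) == 1:
            if CODE[c1] == CODE[c2]:
                syn_diff += 1
            else:
                nonsyn_diff += 1
    pn = nonsyn_diff / N
    ps = syn_diff / S if S else float("nan")
    return pn, ps


pn, ps = pn_ps("ATGGCTAAA", "ATGGCCAGA")  # toy sequences, not real genes
print(round(pn, 3), round(ps, 3), round(pn / ps, 3))
```

For this toy pair the ratio is below 1 (one synonymous and one nonsynonymous difference, but far more nonsynonymous than synonymous sites), which would be read as constraint rather than positive selection; a ratio above 1 would point the other way.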


Primary microcephaly protein-coding genes are considered a key group of candidate genes for understanding the evolution of brain size, because mutations in the coding sequences of these genes cause a severe reduction in brain size, particularly of the cerebral cortex. This seems to be an atavistic process, because the brain size of primary microcephaly patients is similar to that of nonhuman apes and early hominids. All primary microcephaly genes are expressed in neuroprogenitor or neuroepithelial cells during early brain development and perform multiple, seemingly unrelated functions, including DNA damage repair, centriole biogenesis, spindle organization, neuronal differentiation and migration, chromosomal alignment and segregation, transport of DHA across the blood-brain barrier, cytokinesis, regulation of gene transcription and control of progenitor amplification (Faheem et al., 2015). Turning to evolutionary patterns, previous studies highlight that genes controlling the duration and mode of cell division were targeted by positive selection during evolution (Bond, et al., 2002; Evans, Vallender, & Lahn, 2006; S. Montgomery & Mundy, 2012; Y.-q. Wang et al., 2005). Initial evolutionary studies revealed that four MCPH genes (MCPH1, CDK5RAP2, ASPM and CENPJ) seem to have evolved adaptively in the human lineage since the divergence from chimpanzee (Bond, et al., 2005). Later, as more eutherian species were incorporated into the evolutionary analysis, the signature of episodic positive selection was extended beyond humans to the eutherian mammals as a whole, with six MCPH genes (ASPM, CDK5RAP2, MCPH1, CENPJ, CEP152, and WDR62) showing pervasive positive selection (S. H. Montgomery & Mundy, 2014). In contrast, the results of the current study are not consistent with those of Montgomery and Mundy, as none of the analyzed genes shows such a pattern of pervasive positive selection across the eutherian mammals. An adaptive phenotype need not result only from positive selection acting on protein-coding genes; site-specific changes in selective pressure between clades (divergent selection) can also contribute to adaptive phenotypic diversity. In the current analysis, however, signatures of divergent selective constraints between simian and non-simian mammals are significant for only two loci, STIL and SASS6. There is thus ample evidence that the majority of the MCPH loci have maintained conserved functions throughout the placental mammals. Additionally, no significant signatures of episodic selection were found for the MCPH loci on any of the ancestral branches analyzed, from the primate to the hominini branch, except for KIF14 on the homininae ancestral branch. However, STIL and WDR62 exhibit a pattern of adaptive evolution in humans, but these patterns are not


significant by the codon substitution based method, although they are significant by the frequency based method in human populations. Protein alignment also showed that, among all analyzed MCPH genes, WDR62 and STIL have accumulated the greatest number of human-specific amino acid replacements since the divergence from chimpanzee (Table 3.1, 3.4 and Table 4.1).

Table 4.1: Chimpanzee-, hominin- and human-specific amino acid replacements in MCPH genes since the divergence from the hominini ancestor.

Gene     Residue number   Hominini ancestor   Chimpanzee   Denisovans   Neandertals   Human
CEP135   581              I                   V            I            I             I
CEP135   691              R                   K            R            R             R
CEP135   844              A                   S            A            A             A
CEP135   936              I                   L            I            I             I
ZNF335   83               G                   G            S            S             S
ZNF335   294              T                   T            S            S             S
ZNF335   359              R                   R            P            R             R
ZNF335   384              P                   P            R            P             P
ZNF335   403              M                   L            M            M             M
ZNF335   770              P                   P            S            S             S
ZNF335   856              A                   V            A            A             A
ZNF335   1317             E                   E            D            D             D
PHC1     103              I                   M            I            I             I
PHC1     518              T                   T            A            A             A
SASS6    443              V                   A            V            V             V
MFSD2A   276              A                   A            S            S             S
MFSD2A   290              S                   R            S            S             S
MFSD2A   415              Q                   L            Q            Q             Q
CIT      13               D                   D            E            D             D
CIT      78               R                   R            W            W             R
CIT      229              I                   I            V            V             V
CIT      331              T                   S            T            T             T
CIT      332              S                   G            S            S             S
CIT      338              I                   V            I            I             I
KIF14    73               K                   K            R            R             R
KIF14    204              S                   N            S            S             S
KIF14    208              E                   Q            E            E             E
KIF14    289              P                   P            R            R             R
KIF14    321              F                   L            F            F             F
KIF14    330              A                   A            T            A             A
KIF14    339              E                   Q            E            E             E
KIF14    395              M                   M            T            T             T
KIF14    605              A                   A            S            A             A
KIF14    637              I                   V            I            I             I
KIF14    733              N                   N            S            S             S
KIF14    1081             M                   M            V            V             V
KIF14    1165             V                   V            A            A             A
KIF14    1315             E                   E            E            G             E
KIF14    1361             S                   S            L            L             L
KIF14    1363             I                   T            I            I             I
KIF14    1391             L                   Q            L            L             L
KIF14    1403             N                   N            N            H             N
KIF14    1408             S                   G            S            S             S
KIF14    1543             S                   N            S            S             S
KIF14    1624             R                   R            R            H             R
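The way Table 4.1 is read, by comparing each lineage's residue with the inferred hominini ancestral residue, can be sketched in a few lines. This is a minimal illustration rather than the procedure used to build the table: the function names are assumptions, the five example rows are copied from Table 4.1, and the "shared by hominins" label only checks that Denisovans, Neandertals and humans all differ from the ancestor, not that they carry the same derived residue.

```python
LINEAGES = ("chimpanzee", "Denisovan", "Neandertal", "human")


def derived_lineages(ancestor, chimpanzee, denisovan, neandertal, human):
    """Return the lineages whose residue differs from the ancestral residue."""
    observed = (chimpanzee, denisovan, neandertal, human)
    return [name for name, residue in zip(LINEAGES, observed) if residue != ancestor]


def classify(row):
    """Label one table row: (gene, residue number, ancestor, chimpanzee,
    Denisovan, Neandertal, human)."""
    gene, position, *residues = row
    carriers = derived_lineages(*residues)
    if carriers == ["Denisovan", "Neandertal", "human"]:
        label = "shared by hominins"
    elif carriers:
        label = " and ".join(carriers) + " specific"
    else:
        label = "unchanged"
    return f"{gene} {position}: {label}"


# Example rows taken from Table 4.1.
ROWS = [
    ("CEP135", 581, "I", "V", "I", "I", "I"),
    ("ZNF335", 83, "G", "G", "S", "S", "S"),
    ("ZNF335", 359, "R", "R", "P", "R", "R"),
    ("CIT", 78, "R", "R", "W", "W", "R"),
    ("KIF14", 1315, "E", "E", "E", "G", "E"),
]

for row in ROWS:
    print(classify(row))
```

Applied to these rows, the sketch separates a chimpanzee-specific change (CEP135 581), a change shared by all three hominins (ZNF335 83), and changes restricted to one or both archaic genomes (ZNF335 359, CIT 78, KIF14 1315).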

Most of these changes are shared with the archaic humans, Neandertals and Denisovans. Indeed, three MCPH genes (CDK6, CEP135 and SASS6) have no human-specific amino acid substitutions since the divergence from the hominini ancestor (Table 4.1). Among all MCPH genes, CDK6 is the most conserved in mammals. The current and previous evolutionary studies of MCPH genes have shown that the coding sequences of the majority of the genes containing human-specific substitutions experienced positive selection at different time periods during eutherian evolution, rather than specifically in humans (S. H. Montgomery & Mundy, 2014; S. Xu et al., 2017). The majority of primary microcephaly-causing mutations in MCPH genes are truncating mutations that perhaps lower the expression level of the normal protein, suggesting that the level of expression of MCPH genes might have been important for the expansion of brain size. The data indicate that the evolutionary enlargement of the human brain during the last two million years might not be related to the coding sequences of human microcephaly genes alone. Rather, transcriptional and posttranslational changes in combination with human-specific changes in MCPH genes might have been responsible for the evolutionary expansion of human brain size after the divergence from Australopithecus, as coding and noncoding regulatory changes modify each other's functional impact (Dimas et al., 2008). The complex conditional effects of human-specific coding and noncoding changes in MCPH loci


may therefore have had major consequences for the expansion of human brain size during the Pliocene-Pleistocene.

The evolutionary expansion of human brain size is accompanied by a functional trade-off between higher cognitive capabilities and susceptibility to neurodegenerative disorders. Parkinson's disease is the second most common neurodegenerative disorder after Alzheimer's disease. Both disorders are considered specific to humans, as no other animal species is naturally affected by either Parkinson's or Alzheimer's disease. Neurodegenerative disorders seem to affect those regions of the brain that evolved during the recent history of human evolution. Human neurodegenerative disease genes are evolutionarily conserved and under stronger selective constraint than non-neurodegenerative genes (Panda, Begum, & Ghosh, 2012). Alpha synuclein is a highly penetrant gene in early-onset hereditary Parkinson's disease, whereas the paralogs of human α synuclein (β and γ) are not associated with Parkinson's disease. All synuclein members are under strong purifying selection in sarcopterygians. Gamma synuclein is the most ancient and most functionally diverged of the three synucleins. Expression data provide evidence of functional divergence among the synuclein paralogs, as α synuclein is abundant in catecholaminergic regions, whereas β synuclein expression is weak or absent in these regions and abundant in somatic cholinergic regions (J.-Y. Li, Jensen, & Dahlström, 2002). The most ancient paralog of this family, γ synuclein, appears to be localized in both catecholaminergic and somatic cholinergic regions but also has a differential expression pattern in selected populations of peripheral and central neurons (J.-Y. Li, et al., 2002; Ninkina et al., 2003). However, α and β synuclein are able to substitute for each other in the auditory system, indicating that their role in this system is ancestrally derived (Mooney, 2009). Furthermore, researchers have identified distinct functions of α and γ synuclein at neuronal synapses in rescuing the phenotype caused by ablation of the CSPα gene (Chandra, Gallardo, Fernández-Chacón, Schlüter, & Südhof, 2005; Ninkina et al., 2012). This indicates that protein diversification between α and γ synuclein (especially in the carboxyl-terminal domain) could have arisen after the first duplication event. Despite these observations, a high degree of functional redundancy among the paralogs was observed for synaptic structure and terminal size, as well as for age-dependent neuronal dysfunction, in αβγ-synuclein triple knockout mice (Greten-Harrison et al., 2010). These phenotypic changes did not


appear in mice when only one or two paralogs were deleted (Greten-Harrison, et al., 2010). This suggests that, in addition to the ancestral function, each paralog has acquired paralog-specific functions that cannot be compensated by the other members, which would explain why the functions of mutated α synuclein are not compensated by the other two paralogs in PD patients. Mutated α synuclein induces pathogenesis through multiple pathways, such as self-aggregation and acting as a chaperone protein, which are known to affect the viability of substantia nigra neurons (Figure 4.2a, b and e).

Figure 4.2: Schematic overview of the neurodegenerative (a, b, e) and neuroprotective (c, d) roles of alpha synuclein.

α synuclein plays a dual role in the nervous system. a) Aberrant α synuclein interacts with 14-3-3 proteins and forms aggregates. As a result of this interaction, pro-apoptotic proteins are translocated into mitochondria, inhibit the anti-apoptotic activity of Bcl2 and Bcl-xL, and initiate caspase-9 dependent neuronal death by releasing cytochrome c. b) Binding of α synuclein to dopamine generates ROS, which increase mitochondrial membrane permeability and activate caspase dependent neuronal apoptosis by releasing cytochrome c. c) Wild type α synuclein protects dopaminergic neurons from apoptosis by binding pro-apoptotic proteins and inhibiting their translocation into mitochondria. d) Wild type cytoplasmic α synuclein inhibits p300 and NFkB-p65 acetylation by inhibiting the histone acetyltransferase activity of p300, which interrupts their binding to the promoter region and ultimately inhibits transcription of PKCδ and other pro-apoptotic proteins. e) Mutant α synuclein induces ER stress, which promotes neuronal apoptosis either through Ca2+ ions or through pro-apoptotic proteins. ROS: reactive oxygen species, M/P/O: mutations/post-translational modifications/overexpression, ER: endoplasmic reticulum, P: phosphorylation, Ac: acetylation.

Alpha synuclein is traditionally considered an intrinsically disordered protein; however, on binding to its target it undergoes a transition to a more ordered alpha-helical conformation (Dikiy & Eliezer, 2012; Siddiqui, et al., 2016). Evolutionary studies


suggest that α synuclein attained its intrinsically disordered conformation through a series of transitions in the NAC and carboxyl-terminal acidic domains, which regulate the structural dynamics of a small region of the amino-terminal lipid-binding domain, residues 32-58 (the critical region), through epistatic effects (Siddiqui, et al., 2016). Generally, intrinsically disordered proteins have specific amino acid sequences that form long-range interactions within the protein to prevent aggregation (Kokhan, Van'kin, Bachurin, & Shamakina, 2013). In α synuclein, the GAV motif in the NAC domain is considered the most aggregation-prone region, but it is partially protected by long-range interactions between domains, mediated by the positively and negatively charged residues of the amino and carboxyl termini (Du, et al., 2003; Lashuel, Overk, Oueslati, & Masliah, 2012; Siddiqui, et al., 2016; Uverskya & Finka, 2002). A recent study revealed that disease-associated mutations (all located in the critical region, residues 32-58) alter the dynamics of the lipid-binding and NAC domains (Siddiqui, et al., 2016). It can therefore be speculated that these pathogenic mutations disturb the long-range interactions between synuclein domains and thus increase the propensity of the protein to self-assemble into neurotoxic oligomers. These neurotoxic oligomers induce endoplasmic reticulum (ER) stress, which initiates neuronal cell death in three distinct ways. First, ER stress releases Ca2+, which increases the permeability of the mitochondrial membrane and activates the mitochondrial neuronal death pathway by releasing cytochrome c (Figure 4.2e) (Hald & Lotharius, 2005; W. W. Smith, 2005). Second, ER stress increases the maturation of the pro-apoptotic protein Bad by cleavage; Bad is then translocated into mitochondria, where it inhibits the anti-apoptotic activity of Bcl-xL and Bcl2, ultimately activating the cytochrome c dependent mitochondrial apoptotic pathway (Figure 4.2e) (W. W. Smith, 2005). The third pathway is independent of mitochondria: caspase-12 is activated and directly activates caspase-3 to induce neuronal death (Figure 4.2e) (W. W. Smith, 2005).
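The statement that disease-associated substitutions fall within residues 32-58 can be checked mechanically for any substitution label. The following is a minimal sketch under stated assumptions: the helper names are illustrative, A53T is the only variant taken from this chapter, and G100A is a made-up control label rather than a known variant.

```python
import re

# Residues 32-58 (inclusive) of the amino-terminal lipid-binding domain,
# referred to in the text as the critical region.
CRITICAL_START, CRITICAL_END = 32, 58


def parse_substitution(label):
    """Split a label such as 'A53T' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def in_critical_region(label):
    """True if the substituted residue lies within residues 32-58."""
    _, position, _ = parse_substitution(label)
    return CRITICAL_START <= position <= CRITICAL_END


for label in ("A53T", "G100A"):  # G100A is a hypothetical control label
    print(label, in_critical_region(label))
```

The same check could be run over any curated list of Parkinson's disease variants to see which of them map to the critical region.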

It has recently been reported that a 54-84 kDa protein complex of α synuclein and the 14-3-3 chaperone protein is present in the substantia nigra of PD patients (Binolfi et al., 2008). α synuclein interacts with 14-3-3 protein through its critical region, suggesting that the pathogenic mechanism in PD is mediated by the interaction between α synuclein and 14-3-3. This complex reduces the anti-apoptotic activity of 14-3-3 protein and promotes neuronal apoptosis by inhibiting the interaction of 14-3-3 with pro-apoptotic proteins such as Bad and Bax (Figure 4.2a) (Berg, Holzmann, & Riess, 2003). Mutations


and post-translational modifications might increase the propensity of α synuclein to interact with 14-3-3 protein, which might explain the presence of this complex in PD. Additionally, 14-3-3 and α synuclein are also important in dopamine synthesis (Berg, et al., 2003; Sidhu, Wersinger, Moussa, & Vernier, 2004). Normally, 14-3-3 binds to phosphorylated tyrosine hydroxylase and enhances its activity and subsequent dopamine production. In contrast, α synuclein reduces tyrosine hydroxylase activity and dopamine production by binding to dephosphorylated tyrosine hydroxylase (Berg, et al., 2003).

The interaction between α synuclein and dopamine plays a very important role in the production of cytotoxic species, and hence in the pathogenesis of PD, because auto-oxidation of dopamine is required for this interaction (Bisaglia et al., 2010; Chan et al., 2012). The amino-terminal critical region of α synuclein forms the interface for dopamine-dependent oligomerisation of α synuclein (Leong et al., 2015). Dopamine stabilizes the neurotoxic oligomeric form of α synuclein (Cookson et al., 2009; Lee et al., 2011). Additionally, ROS can cause oxidative stress and alter the function of proteins, DNA and lipids, resulting in mitochondrial impairment and ultimately increasing neuronal vulnerability. Mutated α synuclein (especially A53T) has a greater tendency than the wild type to trigger dopamine-dependent neurotoxicity at low concentrations (Pan, Bruening, Giasson, Lee, & Godwin, 2002). As all these mutations reside in the critical region, which is important for the interaction with dopamine, they might disrupt its structural integrity and perhaps increase its propensity to react with dopamine, forming the neurotoxic α synuclein:dopamine quinone adduct (Leong, et al., 2015; Siddiqui, et al., 2016). This suggests that dopamine induces pathogenicity in two ways: first, by forming a neurotoxic adduct with α synuclein, and second, by producing cytotoxic ROS, which promote neuronal death through cytochrome c dependent caspase activation in the cytosol (Figure 4.2b).

Under physiological conditions, wild type α synuclein is considered to have an anti-apoptotic and/or neuroprotective function (da Costa, Paitel, Vincent, & Checler, 2002). Physiological concentrations of wild type α synuclein were found to protect non-differentiated brain dopaminergic cells and cortical and hippocampal neurons against neurotoxicity induced by oxidative stress, rotenone or 1-methyl-4-phenylpyridinium (MPP+) (da Costa, Ancolio, & Checler, 2000; Sidhu, et al., 2004). Recently, the neuroprotective role of native wild type α synuclein has been reported


against MPP+ and rotenone toxicity through modulation of the expression of protein kinase C delta (PKCδ) in dopaminergic neurons (Kaul, Anantharam, Kanthasamy, & Kanthasamy, 2005). Wild type α synuclein diminishes PKCδ transcription and thereby inhibits apoptosis by downregulating the enzymatic activity of the p300 protein, with a parallel loss of its histone acetyltransferase (HAT) activity, which in turn inhibits p300-mediated acetylation of NFkB-p65 (Arumugam et al., 2014; Jin et al., 2011; Lanzillotta et al., 2015). As a consequence, NFkB and p300 do not bind to the PKCδ promoter region and the general transcription machinery, eventually inhibiting PKCδ transcription (Jin, et al., 2011). Downregulation of PKCδ by α synuclein therefore confers neuroprotection through reduced proteolytic activation of PKCδ (Figure 4.2d). However, a recent study shows that PD-associated mutations in α synuclein not only alter its physicochemical properties but also modify its regulatory functions (Segura-Ulate, Yang, Vargas-Medrano, & Perez, 2017). Furthermore, physiological concentrations of native α synuclein and the anti-apoptotic 14-3-3 protein together prevent degeneration of dopaminergic neurons by inhibiting the translocation of pro-apoptotic proteins into mitochondria, thereby blocking apoptosis of dopaminergic neuronal cells, whereas mutations disturb the interaction of α synuclein with pro-apoptotic proteins (Figure 4.2c) (Berg, et al., 2003). Wild type α synuclein drastically inhibits p53-dependent caspase-3 activation and apoptosis by reducing both p53 expression and transcriptional activity in non-dopaminergic neurons (da Costa, et al., 2002). These two proteins regulate each other's expression through a feedback mechanism, a functional interplay that drives their cellular homeostasis in neurons (Duplan, Giordano, Checler, & Alves da Costa, 2016). Mutations in α synuclein, however, disturb the homeostasis of both proteins and might explain the neuronal death in the PD brain.

Chapter 5 Conclusion and Future Prospects

Conclusion and Future Prospects

This study demonstrates that almost all the analyzed primary microcephaly genes, with the exception of STIL and SASS6, have maintained conserved functions throughout the placental mammals. Collectively, the data indicate that the dramatic evolutionary expansion of human brain size during the Pliocene-Pleistocene period might not be attributable to human-specific substitutions in the coding sequences of human microcephaly genes alone. Rather, transcriptional and posttranslational changes in combination with human-specific changes in MCPH genes might have been responsible for the evolutionary expansion of human brain size during the Pliocene-Pleistocene period, as coding and noncoding regulatory changes modify each other's functional impact.

In future, cis-regulatory elements of MCPH loci will be identified by a comparative genomics approach, because their evolutionary patterns might provide the underpinnings of the dramatic expansion of human brain size during the upper Pliocene and Pleistocene. Functional testing in model organisms will then be performed on those noncoding sequences that have evolved adaptively in the human lineage, in an attempt to elucidate the complex conditional effects of human-specific coding and noncoding changes in MCPH loci.

The amino-terminal lipid-binding domain region (amino acids 32-58) of α synuclein is the most critical region, not only from an evolutionary perspective but also for the normal cellular function of α synuclein and for Parkinson's disease pathogenesis. Through this critical region, alpha synuclein interacts with a variety of proteins involved in apoptosis and transcriptional regulation. Mutations in α synuclein cause drastic structural shifts in the amino-terminal and NAC domains and might alter its propensity to interact with its partner proteins and with dopamine, which ultimately induces pathogenesis. In future, further evolutionary analyses will be conducted on all identified Parkinson's-associated genes to understand fully whether and why Parkinsonism is specific to humans.


References

Abdullah, U., Farooq, M., Mang, Y., Bakhtiar, S. M., Fatima, A., Hansen, L., et al.

(2017). A novel mutation in CDK5RAP2 gene causes primary microcephaly

with speech impairment and sparse eyebrows in a consanguineous Pakistani

family. European journal of medical genetics, 60(12), 627-630.

Abecasis, G. R., Auton, A., Brooks, L. D., DePristo, M. A., Durbin, R. M., Handsaker,

R. E., et al. (2012). An integrated map of genetic variation from 1,092 human

genomes. Nature, 491(7422), 56-65.

Al-Dosari, M. S., Shaheen, R., Colak, D., & Alkuraya, F. S. (2010). Novel CENPJ

mutation causes Seckel syndrome. Journal of medical genetics, 47(6), 411-414.

Alakbarzade, V., Hameed, A., Quek, D. Q., Chioza, B. A., Baple, E. L., Cazenave-

Gassiot, A., et al. (2015). A partially inactivating mutation in the sodium-

dependent lysophosphatidylcholine transporter MFSD2A causes a non-lethal

microcephaly syndrome. Nature genetics, 47(7), 814.

Alkema, M. J., Bronk, M., Verhoeven, E., Otte, A., van't Veer, L. J., Berns, A., et al.

(1997). Identification of Bmi1-interacting proteins as constituents of a

multimeric mammalian polycomb complex. Genes & development, 11(2), 226-

240.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic

Local Alignment Search Tool. Journal of Molecular Biology, 215(3), 403-410.

Amigo, J., Salas, A., Phillips, C., & Carracedo, A. (2008). SPSmart: adapting

population based SNP genotype databases for fast and comprehensive web

access. BMC bioinformatics, 9, 428.

Anisimova, M., & Kosiol, C. (2008). Investigating protein-coding sequence evolution

with probabilistic codon substitution models. Molecular biology and evolution,

26(2), 255-271.

Aplan, P. D., Lombardi, D. P., & Kirsch, I. R. (1991). Structural characterization of

SIL, a gene frequently disrupted in T-cell acute lymphoblastic leukemia.

Molecular and cellular biology, 11(11), 5462-5469.

Arquint, C., & Nigg, E. A. (2016). The PLK4–STIL–SAS-6 module at the core of

centriole duplication. Biochemical Society Transactions, 44(5), 1253-1263.

Arumugam, T. V., Liang, J., Luan, Y., Lu, B., Zhang, H., Luo, Y.-n., et al. (2014).

Protection of Ischemic Postconditioning against Neuronal Apoptosis Induced

by Transient Focal Ischemia Is Associated with Attenuation of NF-κB/p65

Activation. PloS one, 9(5), e96734.

Awad, S., Al-Dosari, M. S., Al-Yacoub, N., Colak, D., Salih, M. A., Alkuraya, F. S., et

al. (2013). Mutation in PHC1 implicates chromatin remodeling in primary

microcephaly pathogenesis. Human molecular genetics, 22(11), 2200-2213.

Barr, A. R., Kilmartin, J. V., & Gergely, F. (2010). CDK5RAP2 functions in

centrosome to spindle pole attachment and DNA damage response. The

Journal of cell biology, 189(1), 23-39.

Barrera, J. A., Kao, L.-R., Hammer, R. E., Seemann, J., Fuchs, J. L., & Megraw, T. L.

(2010). CDK5RAP2 regulates centriole engagement and cohesion in mice.

Developmental cell, 18(6), 913-926.

Basit, S., Al-Harbi, K. M., Alhijji, S. A., Albalawi, A. M., Alharby, E., Eldardear, A.,

et al. (2016). CIT, a gene involved in neurogenic cytokinesis, is mutated in

human primary microcephaly. Human genetics, 135(10), 1199-1207.

Bastir, M., Rosas, A., Gunz, P., Peña-Melian, A., Manzi, G., Harvati, K., et al. (2011).

Evolution of the base of the brain in highly encephalized human species.

Nature Communications, 2, 588.

Basto, R., Lau, J., Vinogradova, T., Gardiol, A., Woods, C. G., Khodjakov, A., et al.

(2006). Flies without centrioles. Cell, 125(7), 1375-1386.

Bayés, À., Collins, M. O., Reig-Viader, R., Gou, G., Goulding, D., Izquierdo, A., et al.

(2017). Evolution of complexity in the zebrafish synapse proteome.

Nature Communications, 8, 14613.

Ben-Zvi, A., Lacoste, B., Kur, E., Andreone, B. J., Mayshar, Y., Yan, H., et al. (2014).

Mfsd2a is critical for the formation and function of the blood–brain barrier.

Nature, 509(7501), 507.

Berg, D., Holzmann, C., & Riess, O. (2003). 14-3-3 proteins in the nervous system.

Nature Reviews Neuroscience, 4(9), 752-762.

Berger, J. H., Charron, M. J., & Silver, D. L. (2012). Major facilitator superfamily

domain-containing protein 2a (MFSD2A) has roles in body growth, motor

function, and lipid metabolism. PloS one, 7(11), e50629.

Betts, M. J., & Russell, R. B. (2003). Amino acid properties and consequences of

substitutions. Bioinformatics for geneticists, 317, 289-298.

Beukelaers, P., Vandenbosch, R., Caron, N., Nguyen, L., Belachew, S., Moonen, G., et

al. (2011). Cdk6‐Dependent Regulation of G1 Length Controls Adult

Neurogenesis. Stem cells, 29(4), 713-724.

Bielawski, J. P., & Yang, Z. (2004). A maximum likelihood method for detecting

functional divergence at individual codon sites, with application to gene family

evolution. Journal of Molecular Evolution, 59(1), 121-132.

Bilguvar, K., Ozturk, A. K., Louvi, A., Kwan, K. Y., Choi, M., Tatli, B., et al. (2010).

Whole-exome sequencing identifies recessive WDR62 mutations in severe

brain malformations. Nature, 467(7312), 207-210.

Binolfi, A., Lamberto, G. R., Duran, R., Quintanar, L., Bertoncini, C. W., Souza, J. M.,

et al. (2008). Site-specific interactions of Cu(II) with α and β-synuclein:

bridging the molecular gap between metal binding and aggregation. Journal of

the American Chemical Society, 130(35), 11801-11812.

Bisaglia, M., Greggio, E., Maric, D., Miller, D. W., Cookson, M. R., & Bubacco, L.

(2010). α-Synuclein overexpression increases dopamine toxicity in BE(2)-M17

cells. BMC Neuroscience, 11(1), 41.

Bisaglia, M., Mammi, S., & Bubacco, L. (2009). Structural insights on physiological

functions and pathological effects of α-synuclein. The FASEB Journal, 23(2),

329-340.

Blachon, S., Gopalakrishnan, J., Omori, Y., Polyanovsky, A., Church, A., Nicastro, D.,

et al. (2008). Drosophila Asterless the ortholog of vertebrate Cep152 is

essential for centriole duplication. Genetics.

Bogoyevitch, M. A., Yeap, Y. Y. C., Qu, Z. D., Ngoei, K. R., Yip, Y. Y., Zhao, T. T.,

et al. (2012). WD40-repeat protein 62 is a JNK-phosphorylated spindle pole

protein required for spindle maintenance and timely mitotic progression.

Journal of Cell Science, 125(21), 5096-5109.

Bond, J., Roberts, E., Mochida, G. H., Hampshire, D. J., Scott, S., Askham, J. M., et

al. (2002). ASPM is a major determinant of cerebral cortical size. Nature

genetics, 32(2), 316.

Bond, J., Roberts, E., Springell, K., Lizarraga, S., Scott, S., Higgins, J., et al. (2005). A

centrosomal mechanism involving CDK5RAP2 and CENPJ controls brain size.

Nature genetics, 37(4), 353.

Boyd, J. L., Skove, S. L., Rouanet, J. P., Pilaz, L.-J., Bepler, T., Gordân, R., et al.

(2015). Human-chimpanzee differences in a FZD8 enhancer alter cell-cycle

dynamics in the developing neocortex. Current Biology, 25(6), 772-779.

Bruner, E. (2010). Morphological differences in the parietal lobes within the human

genus: a neurofunctional perspective. Current Anthropology, 51(S1), S77-S88.

Brunet, M., Guy, F., Pilbeam, D., Mackaye, H. T., Likius, A., Ahounta, D., et al.

(2002). A new hominid from the Upper Miocene of Chad, Central Africa.

Nature, 418(6894), 145.

Bryant, K. L., & Preuss, T. M. (2018). A Comparative Perspective on the Human

Temporal Lobe. In E. Bruner, N. Ogihara & H. C. Tanabe (Eds.), Digital

Endocasts: From Skulls to Brains (pp. 239-258). Tokyo: Springer Japan.

Buslje, C. M., Santos, J., Delfino, J. M., & Nielsen, M. (2009). Correction for

phylogeny, small number of observations and data redundancy improves the

identification of coevolving amino acid pairs using mutual information.

Bioinformatics, 25(9), 1125-1131.

Buslje, C. M., Teppa, E., Di Doménico, T., Delfino, J. M., & Nielsen, M. (2010).

Networks of high mutual information define the structural proximity of

catalytic sites: implications for catalytic residue identification. PLoS

computational biology, 6(11), e1000978.

Bystron, I., Blakemore, C., & Rakic, P. (2008). Development of the human cerebral

cortex: Boulder Committee revisited. Nature Reviews Neuroscience, 9(2), 110.

Campaner, S., Kaldis, P., Izraeli, S., & Kirsch, I. R. (2005). Sil phosphorylation in a

Pin1 binding domain affects the duration of the spindle checkpoint. Molecular

and cellular biology, 25(15), 6660-6672.

Campion, D., Martin, C., Heilig, R., Charbonnier, F., Moreau, V., Flaman, J. M., et al.

(1995). The NACP/synuclein gene: chromosomal assignment and screening for

alterations in Alzheimer disease. Genomics, 26(2), 254-257.

Carroll, S. B. (2003). Genetics and the making of Homo sapiens. Nature, 422(6934),

849-857.

Castiel, A., Danieli, M. M., David, A., Moshkovitz, S., Aplan, P. D., Kirsch, I. R., et

al. (2011). The Stil protein regulates centrosome integrity and mitosis through

suppression of Chfr. Journal of Cell Science, jcs.079731.

Chan, T., Chow, A. M., Cheng, X. R., Tang, D. W. F., Brown, I. R., & Kerman, K.

(2012). Oxidative Stress Effect of Dopamine on α-Synuclein: Electroanalysis

of Solvent Interactions. ACS Chemical Neuroscience, 3(7), 569-574.

Chandra, S., Gallardo, G., Fernández-Chacón, R., Schlüter, O. M., & Südhof, T. C.

(2005). α-Synuclein cooperates with CSPα in preventing neurodegeneration.

Cell, 123(3), 383-396.

Charrier, C., Joshi, K., Coutinho-Budd, J., Kim, J.-E., Lambert, N., De Marchena, J., et

al. (2012). Inhibition of SRGAP2 function by its human-specific paralogs

induces neoteny during spine maturation. Cell, 149(4), 923-935.

Chen, C.-Y., Olayioye, M. A., Lindeman, G. J., & Tang, T. K. (2006). CPAP interacts

with 14-3-3 in a cell cycle-dependent manner. Biochemical and biophysical

research communications, 342(4), 1203-1210.

Cho, J.-H., Chang, C.-J., Chen, C.-Y., & Tang, T. K. (2006). Depletion of CPAP by

RNAi disrupts centrosome integrity and induces multipolar spindles.

Biochemical and biophysical research communications, 339(3), 742-747.

Cmarko, D., Verschure, P. J., Otte, A. P., van Driel, R., & Fakan, S. (2003). Polycomb

group gene silencing proteins are concentrated in the perichromatin

compartment of the mammalian nucleus. Journal of Cell Science, 116(2), 335-

343.

Cohen-Katsenelson, K., Wasserman, T., Khateb, S., Whitmarsh, A. J., & Aronheim, A.

(2011). Docking interactions of the JNK scaffold protein WDR62. Biochem J,

439(3), 381-390.

Cookson, M. R., Outeiro, T. F., Klucken, J., Bercury, K., Tetzlaff, J., Putcha, P., et al.

(2009). Dopamine-Induced Conformational Changes in Alpha-Synuclein. PloS

one, 4(9), e6906.

da Costa, C. A., Ancolio, K., & Checler, F. (2000). Wild-type but not Parkinson's

disease-related ala-53→Thr mutant α-synuclein protects neuronal cells from

apoptotic stimuli. Journal of Biological Chemistry, 275(31), 24065-24069.

da Costa, C. A., Paitel, E., Vincent, B., & Checler, F. (2002). α-Synuclein lowers

p53-dependent apoptotic response of neuronal cells: abolishment by

6-hydroxydopamine and implication for Parkinson's disease. Journal of

Biological Chemistry, 277(52), 50980-50984.

Dennis, M. Y., Nuttle, X., Sudmant, P. H., Antonacci, F., Graves, T. A., Nefedov, M.,

et al. (2012). Evolution of human-specific neural SRGAP2 genes by

incomplete segmental duplication. Cell, 149(4), 912-922.

Desir, J., Cassart, M., David, P., Van Bogaert, P., & Abramowicz, M. (2008). Primary

microcephaly with ASPM mutation shows simplified cortical gyration with

antero‐posterior gradient pre‐ and post‐natally. American journal of medical

genetics Part A, 146(11), 1439-1443.

Di Cunto, F., Calautti, E., Hsiao, J., Ong, L., Topley, G., Turco, E., et al. (1998).

Citron rho-interacting kinase, a novel tissue-specific ser/thr kinase

encompassing the Rho-Rac-binding protein Citron. Journal of Biological

Chemistry, 273(45), 29706-29711.

Di Cunto, F., Imarisio, S., Hirsch, E., Broccoli, V., Bulfone, A., Migheli, A., et al.

(2000). Defective neurogenesis in citron kinase knockout mice by altered

cytokinesis and massive apoptosis. Neuron, 28(1), 115-127.

Dikiy, I., & Eliezer, D. (2012). Folding and misfolding of alpha-synuclein on

membranes. Biochimica et Biophysica Acta (BBA)-Biomembranes, 1818(4),

1013-1018.

Dimas, A. S., Stranger, B. E., Beazley, C., Finn, R. D., Ingle, C. E., Forrest, M. S., et

al. (2008). Modifier effects between regulatory and protein-coding variation.

PLoS genetics, 4(10), e1000244.

Donahue, C. J., Glasser, M. F., Preuss, T. M., Rilling, J. K., & Van Essen, D. C.

(2018). Quantitative assessment of prefrontal cortex in humans relative to

nonhuman primates. Proceedings of the National Academy of Sciences,

201721653.

Dorus, S., Vallender, E. J., Evans, P. D., Anderson, J. R., Gilbert, S. L., Mahowald,

M., et al. (2004). Accelerated evolution of nervous system genes in the origin

of Homo sapiens. Cell, 119(7), 1027-1040.

Du, H. N., Tang, L., Luo, X. Y., Li, H. T., Hu, J., Zhou, J. W., et al. (2003). A peptide

motif consisting of glycine, alanine, and valine is required for the fibrillization

and cytotoxicity of human α-synuclein. Biochemistry, 42(29), 8870-8878.

Duplan, E., Giordano, C., Checler, F., & Alves da Costa, C. (2016). Direct α-synuclein

promoter transactivation by the tumor suppressor p53. Molecular

Neurodegeneration, 11(1).

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and

high throughput. Nucleic Acids Research, 32(5), 1792-1797.

Ericson, K. K., Krull, D., Slomiany, P., & Grossel, M. J. (2003). Expression of Cyclin-

Dependent Kinase 6, but not Cyclin-Dependent Kinase 4, Alters Morphology

of Cultured Mouse Astrocytes. Molecular Cancer Research, 1(9), 654-664.

Evans, P. D., Anderson, J. R., Vallender, E. J., Gilbert, S. L., Malcom, C. M., Dorus,

S., et al. (2004). Adaptive evolution of ASPM, a major determinant of cerebral

cortical size in humans. Human molecular genetics, 13(5), 489-494.

Evans, P. D., Gilbert, S. L., Mekel-Bobrov, N., Vallender, E. J., Anderson, J. R., Vaez-

Azizi, L. M., et al. (2005). Microcephalin, a gene regulating brain size,

continues to evolve adaptively in humans. Science, 309(5741), 1717-1720.

Evans, P. D., Vallender, E. J., & Lahn, B. T. (2006). Molecular evolution of the brain

size regulator genes CDK5RAP2 and CENPJ. Gene, 375, 75-79.

Faheem, M., Naseer, M. I., Rasool, M., Chaudhary, A. G., Kumosani, T. A., Ilyas, A.

M., et al. (2015). Molecular genetics of human primary microcephaly: an

overview. BMC medical genomics, 8(1), S4.

Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the

bootstrap. Evolution, 39(4), 783-791.

Fietz, S. A., Lachmann, R., Brandl, H., Kircher, M., Samusik, N., Schröder, R., et al.

(2012). Transcriptomes of germinal zones of human and mouse fetal neocortex

suggest a role of extracellular matrix in progenitor self-renewal. Proceedings of

the National Academy of Sciences, 109(29), 11836-11841.

Filges, I., Nosova, E., Bruder, E., Tercanli, S., Townsend, K., Gibson, W., et al.

(2014). Exome sequencing identifies mutations in KIF14 as a novel cause of an

autosomal recessive lethal fetal ciliopathy phenotype. Clinical genetics, 86(3),

220-228.

Finlay, B. L., & Darlington, R. B. (1995). Linked regularities in the development and

evolution of mammalian brains. Science, 268(5217), 1578-1584.

Fish, J. L., Kosodo, Y., Enard, W., Pääbo, S., & Huttner, W. B. (2006). Aspm

specifically maintains symmetric proliferative divisions of neuroepithelial

cells. Proceedings of the National Academy of Sciences, 103(27), 10438-

10443.

Fletcher, W., & Yang, Z. (2010). The effect of insertions, deletions, and alignment

errors on the branch-site test of positive selection. Molecular biology and

evolution, 27(10), 2257-2267.

Florio, M., Albert, M., Taverna, E., Namba, T., Brandl, H., Lewitus, E., et al. (2015).

Human-specific gene ARHGAP11B promotes basal progenitor amplification

and neocortex expansion. Science, 347(6229), 1465-1470.

Florio, M., Borrell, V., & Huttner, W. B. (2017). Human-specific genomic signatures

of neocortical expansion. Current opinion in neurobiology, 42, 33-44.

Florio, M., Namba, T., Pääbo, S., Hiller, M., & Huttner, W. B. (2016). A single splice

site mutation in human-specific ARHGAP11B causes basal progenitor

amplification. Science advances, 2(12), e1601941.

Fu, Y. X., & Li, W. H. (1993). Statistical tests of neutrality of mutations. Genetics,

133(3), 693-709.

Fujikura, K., Setsu, T., Tanigaki, K., Abe, T., Kiyonari, H., Terashima, T., et al.

(2013). Kif14 mutation causes severe brain malformation and

hypomyelination. PloS one, 8(1), e53490.

Gagneux, P., & Varki, A. (2001). Genetic differences between humans and great apes.

Molecular Phylogenetics and Evolution, 18(1), 2-13.

Gai, M., Bianchi, F. T., Vagnoni, C., Vernì, F., Bonaccorsi, S., Pasquero, S., et al.

(2016). ASPM and CITK regulate spindle orientation by affecting the

dynamics of astral microtubules. EMBO reports, e201541823.

Gao, H.-M., & Hong, J.-S. (2008). Why neurodegenerative diseases are progressive:

uncontrolled inflammation drives disease progression. Trends in immunology,

29(8), 357-365.

Genin, A., Desir, J., Lambert, N., Biervliet, M., Van Der Aa, N., Pierquin, G., et al.

(2012). Kinetochore KMN network gene CASC5 mutated in primary

microcephaly. Human molecular genetics, 21(24), 5306-5317.

George, J. M. (2002). The synucleins. Genome Biol, 3(1), reviews3002.1-reviews3002.6.

Geschwind, D. H., & Rakic, P. (2013). Cortical evolution: judge the brain by its cover.

Neuron, 80(3), 633-647.

Ghika, J. (2008). Paleoneurology: neurodegenerative diseases are age-related diseases

of specific brain regions recently developed by homo sapiens. Medical

hypotheses, 71(5), 788-801.

Goldman, N., & Yang, Z. (1994). A codon-based model of nucleotide substitution for

protein-coding DNA sequences. Molecular biology and evolution, 11(5), 725-

736.

Gonzalez, C., Saunders, R., Casal, J., Molina, I., Carmena, M., Ripoll, P., et al. (1990).

Mutations at the asp locus of Drosophila lead to multiple free centrosomes in

syncytial embryos, but restrict centrosome duplication in larval neuroblasts.

Journal of Cell Science, 96(4), 605-616.

Grantham, R. (1974). Amino acid difference formula to help explain protein evolution.

Science, 185(4154), 862-864.

Greten-Harrison, B., Polydoro, M., Morimoto-Tomita, M., Diao, L., Williams, A. M.,

Nie, E. H., et al. (2010). αβγ-Synuclein triple knockout mice reveal age-

dependent neuronal dysfunction. Proceedings of the National Academy of

Sciences, 107(45), 19573-19578.

Groussin, M., Hobbs, J. K., Szöllősi, G. J., Gribaldo, S., Arcus, V. L., & Gouy, M.

(2014). Toward more accurate ancestral protein genotype–phenotype

reconstructions with the use of species tree-aware gene trees. Molecular

biology and evolution, 32(1), 13-22.

Gruneberg, U., Neef, R., Li, X., Chan, E. H., Chalamalasetty, R. B., Nigg, E. A., et al.

(2006). KIF14 and citron kinase act together to promote efficient cytokinesis.

The Journal of cell biology, 172(3), 363-372.

Gu, X. (2001). Maximum-likelihood approach for gene family evolution under

functional divergence. Molecular biology and evolution, 18(4), 453-464.

Gu, X., & Vander Velden, K. (2002). DIVERGE: phylogeny-based analysis for

functional–structural divergence of a protein family. Bioinformatics, 18(3),

500-501.

Guemez-Gamboa, A., Nguyen, L. N., Yang, H., Zaki, M. S., Kara, M., Ben-Omran, T.,

et al. (2015). Inactivating mutations in MFSD2A, required for omega-3 fatty

acid transport in brain, cause a lethal microcephaly syndrome. Nature genetics,

47(7), 809.

Guernsey, D. L., Jiang, H., Hussin, J., Arnold, M., Bouyakdan, K., Perry, S., et al.

(2010). Mutations in centrosomal protein CEP152 in primary microcephaly

families linked to MCPH4. The American Journal of Human Genetics, 87(1),

40-51.

Gul, A., Hassan, M. J., Hussain, S., Raza, S. I., Chishti, M. S., & Ahmad, W. (2006).

A novel deletion mutation in CENPJ gene in a Pakistani family with autosomal

recessive primary microcephaly. Journal of human genetics, 51(9), 760-764.

Gunz, P., Neubauer, S., Golovanova, L., Doronichev, V., Maureille, B., & Hublin, J.-J.

(2012). A uniquely modern human pattern of endocranial development.

Insights from a new cranial reconstruction of the Neandertal newborn from

Mezmaiskaya. Journal of human evolution, 62(2), 300-313.

Hald, A., & Lotharius, J. (2005). Oxidative stress and inflammation in Parkinson's

disease: is there a causal link? Experimental neurology, 193(2), 279-290.

Hashimoto, M., & Masliah, E. (1999). Alpha synuclein in Lewy Body Disease and

Alzheimer's Disease. Brain pathology, 9(4), 707-720.

Hashimoto, N., Brock, H., Nomura, M., Kyba, M., Hodgson, J., Fujita, Y., et al.

(1998). RAE28, BMI1, and M33 are members of heterogeneous multimeric

mammalian Polycomb group complexes. Biochemical and biophysical

research communications, 245(2), 356-365.

Hubbard, T., Barker, D., Birney, E., Cameron, G., Chen, Y., Clark, L., et al. (2002).

The Ensembl genome database project. Nucleic Acids Research, 30(1), 38-41.

Hung, L.-Y., Chen, H.-L., Chang, C.-W., Li, B.-R., & Tang, T. K. (2004).

Identification of a novel microtubule-destabilizing motif in CPAP that binds to

tubulin heterodimers and inhibits microtubule assembly. Molecular biology of

the cell, 15(6), 2697-2706.

Hussain, M. S., Baig, S. M., Neumann, S., Nürnberg, G., Farooq, M., Ahmad, I., et al.

(2012). A truncating mutation of CEP135 causes primary microcephaly and

disturbed centrosomal function. The American Journal of Human Genetics,

90(5), 871-878.

Hussain, M. S., Baig, S. M., Neumann, S., Peche, V. S., Szczepanski, S., Nürnberg, G.,

et al. (2013). CDK6 associates with the centrosome during mitosis and is

mutated in a large Pakistani family with primary microcephaly. Human

molecular genetics, 22(25), 5199-5214.

Huyton, T., Bates, P. A., Zhang, X., Sternberg, M. J., & Freemont, P. S. (2000). The

BRCA1 C-terminal domain: structure and function. Mutation Research/DNA

Repair, 460(3-4), 319-332.

International HapMap 3 Consortium, Altshuler, D. M., Gibbs, R. A., Peltonen, L., et al.

(2010). Integrating common and rare genetic variation

in diverse human populations. Nature, 467(7311), 52-58.

Isono, K.-i., Fujimura, Y.-i., Shinga, J., Yamaki, M., Jiyang, O., Takihara, Y., et al.

(2005). Mammalian polyhomeotic homologues Phc2 and Phc1 act in synergy

to mediate polycomb repression of Hox genes. Molecular and cellular biology,

25(15), 6694-6706.

Izraeli, S., & Colaizzo-Anas, T. (1997). Expression of the SIL gene is correlated with

growth induction and cellular proliferation. leukemia, 3, 4.

Izraeli, S., Lowe, L. A., Bertness, V. L., Good, D. J., Dorward, D. W., Kirsch, I. R., et

al. (1999). The SIL gene is required for mouse embryonic axial development

and left–right specification. Nature, 399(6737), 691.

Jackson, A. P., Eastwood, H., Bell, S. M., Adu, J., Toomes, C., Carr, I. M., et al.

(2002). Identification of microcephalin, a protein implicated in determining the

size of the human brain. The American Journal of Human Genetics, 71(1), 136-

142.

Jamieson, C. R., Govaerts, C., & Abramowicz, M. J. (1999). Primary autosomal

recessive microcephaly: homozygosity mapping of MCPH4 to chromosome 15.

American journal of human genetics, 65(5), 1465.

Jayaraman, D., Bae, B.-I., & Walsh, C. A. (2018). The Genetics of Primary

Microcephaly. Annual review of genomics and human genetics.

Jayaraman, D., Kodani, A., Gonzalez, D. M., Mancias, J. D., Mochida, G. H.,

Vagnoni, C., et al. (2016). Microcephaly proteins Wdr62 and Aspm define a

mother centriole complex regulating centriole biogenesis, apical complex, and

cell fate. Neuron, 92(4), 813-828.

Jin, H., Kanthasamy, A., Ghosh, A., Yang, Y., Anantharam, V., & Kanthasamy, A. G.

(2011). α-Synuclein Negatively Regulates Protein Kinase Cδ Expression to

Suppress Apoptosis in Dopaminergic Neurons by Reducing p300 Histone

Acetyltransferase Activity. Journal of Neuroscience, 31(6), 2035-2051.

Jones, D. T., Taylor, W. R., & Thornton, J. M. (1992). The rapid generation of

mutation data matrices from protein sequences. Bioinformatics, 8(3), 275-282.

Jouan, L., Bencheikh, B. O. A., Daoud, H., Dionne-Laporte, A., Dobrzeniecka, S.,

Spiegelman, D., et al. (2016). Exome sequencing identifies recessive

CDK5RAP2 variants in patients with isolated agenesis of corpus callosum.

European Journal of Human Genetics, 24(4), 607.

Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes.

Genome research, 20(10), 1313-1326.

Kaindl, A. M., Passemard, S., Kumar, P., Kraemer, N., Issa, L., Zwirner, A., et al.

(2010). Many roads lead to primary autosomal recessive microcephaly.

Progress in neurobiology, 90(3), 363-383.

Kakar, N., Ahmad, J., Morris-Rosendahl, D. J., Altmüller, J., Friedrich, K., Barbi, G.,

et al. (2015). STIL mutation causes autosomal recessive microcephalic lobar

holoprosencephaly. Human genetics, 134(1), 45-51.

Kalay, E., Yigit, G., Aslan, Y., Brown, K. E., Pohl, E., Bicknell, L. S., et al. (2011).

CEP152 is a genome maintenance protein disrupted in Seckel syndrome.

Nature genetics, 43(1), 23.

Kaul, S., Anantharam, V., Kanthasamy, A., & Kanthasamy, A. G. (2005). Wild-type

α-synuclein interacts with pro-apoptotic proteins PKCδ and BAD to protect

dopaminergic neuronal cells against MPP+-induced apoptotic cell death.

Molecular Brain Research, 139(1), 137-152.

Khan, M. A., Rupp, V. M., Orpinell, M., Hussain, M. S., Altmüller, J., Steinmetz, M.

O., et al. (2014). A missense mutation in the PISA domain of HsSAS-6 causes

autosomal recessive primary microcephaly in a large consanguineous Pakistani

family. Human molecular genetics, 23(22), 5940-5949.

Kim, D. S., & Hahn, Y. (2011). Identification of novel phosphorylation modification

sites in human proteins that originated after the human–chimpanzee

divergence. Bioinformatics, 27(18), 2494-2501.

Kim, K., Lee, S., Chang, J., & Rhee, K. (2008). A novel function of CEP135 as a

platform protein of C-NAP1 for its centriolar localization. Experimental cell

research, 314(20), 3692-3700.

Kirkham, M., Müller-Reichert, T., Oegema, K., Grill, S., & Hyman, A. A. (2003).

SAS-4 is a C. elegans centriolar protein that controls centrosome size. Cell,

112(4), 575-587.

Kiyomitsu, T., Obuse, C., & Yanagida, M. (2007). Human Blinkin/AF15q14 is

required for chromosome alignment and the mitotic checkpoint through direct

interaction with Bub1 and BubR1. Developmental cell, 13(5), 663-676.

Kochiyama, T., Ogihara, N., Tanabe, H. C., Kondo, O., Amano, H., Hasegawa, K., et

al. (2018). Reconstructing the Neanderthal brain using computational anatomy.

Scientific Reports, 8(1), 6296.

Kokhan, V. S., Van'kin, G. I., Bachurin, S. O., & Shamakina, I. Y. (2013). Differential

involvement of the gamma-synuclein in cognitive abilities on the model of

knockout mice. BMC Neuroscience, 14(1), 1.

Kontopoulos, E., Parvin, J. D., & Feany, M. B. (2006). α-synuclein acts in the nucleus

to inhibit histone acetylation and promote neurotoxicity. Human Molecular

Genetics, 15(20), 3012-3023.

Kouprina, N., Pavlicek, A., Collins, N. K., Nakano, M., Noskov, V. N., Ohzeki, J.-I.,

et al. (2005). The microcephaly ASPM gene is expressed in proliferating

tissues and encodes for a mitotic spindle protein. Human molecular genetics,

14(15), 2155-2165.

Kousar, R., Hassan, M. J., Khan, B., Basit, S., Mahmood, S., Mir, A., et al. (2011).

Mutations in WDR62 gene in Pakistani families with autosomal recessive

primary microcephaly. BMC Neurology, 11.

Kriegstein, A., Noctor, S., & Martínez-Cerdeño, V. (2006). Patterns of neural stem and

progenitor cell division may underlie evolutionary cortical expansion. Nature

Reviews Neuroscience, 7(11), 883.

Kruger, R., Kuhn, W., Muller, T., Woitalla, D., Graeber, M., Kosel, S., et al. (1998).

Ala30Pro mutation in the gene encoding α-synuclein in Parkinson's disease.

Nat Genet, 18(2), 106-108.

Kumar, A., Girimaji, S. C., Duvvari, M. R., & Blanton, S. H. (2009). Mutations in

STIL, encoding a pericentriolar and centrosomal protein, cause primary

microcephaly. The American Journal of Human Genetics, 84(2), 286-290.

Lanzillotta, A., Porrini, V., Bellucci, A., Benarese, M., Branca, C., Parrella, E., et al.

(2015). NF-κB in Innate Neuroprotection and Age-Related Neurodegenerative

Diseases. Frontiers in Neurology, 6.

Lashuel, H. A., Overk, C. R., Oueslati, A., & Masliah, E. (2012). The many faces of α-

synuclein: from structure and toxicity to therapeutic target. Nature Reviews

Neuroscience, 14(1), 38-48.

Lavedan, C., Leroy, E., Dehejia, A., Buchholtz, S., Dutra, A., Nussbaum, R. L., et al.

(1998). Identification, localization and characterization of the human γ-

synuclein gene. Human genetics, 103(1), 106-112.

Lee, H.-J., Baek, S. M., Ho, D.-H., Suk, J.-E., Cho, E.-D., & Lee, S.-J. (2011).

Dopamine promotes formation and secretion of non-fibrillar alpha-synuclein

oligomers. Experimental and Molecular Medicine, 43(4), 216.

Leidel, S., Delattre, M., Cerutti, L., Baumer, K., & Gönczy, P. (2005). SAS-6 defines a

protein family required for centrosome duplication in C. elegans and in human

cells. Nature cell biology, 7(2), 115.

Leidel, S., & Gönczy, P. (2005). Centrosome duplication and nematodes: recent

insights from an old relationship. Developmental cell, 9(3), 317-325.

Leong, S. L., Hinds, M. G., Connor, A. R., Smith, D. P., Toth, E. I., Pham, C., et al.

(2015). The N-Terminal Residues 43 to 60 Form the Interface for Dopamine

Mediated α-Synuclein Dimerisation. PloS one, 10(2), e0116497.

Lesage, S., Anheim, M., Letournel, F., Bousset, L., Honoré, A., Rozas, N., et al.

(2013). G51D α-synuclein mutation causes a novel Parkinsonian–pyramidal

syndrome. Annals of neurology, 73(4), 459-471.

Létard, P., Drunat, S., Vial, Y., Duerinckx, S., Ernault, A., Amram, D., et al. (2018).

Autosomal recessive primary microcephaly due to ASPM mutations: An

update. Human mutation, 39(3), 319-332.

Li, H., Bielas, S. L., Zaki, M. S., Ismail, S., Farfara, D., Um, K., et al. (2016). Biallelic

mutations in citron kinase link mitotic cytokinesis to human primary

microcephaly. The American Journal of Human Genetics, 99(2), 501-510.

Li, J.-Y., Jensen, P. H., & Dahlström, A. (2002). Differential localization of α-, β- and

γ-synucleins in the rat CNS. Neuroscience, 113(2), 463-478.

Li, W. H. (1993). Unbiased Estimation of the Rates of Synonymous and

Nonsynonymous Substitution. Journal of Molecular Evolution, 36(1), 96-99.

Liang, H., Zhou, W., & Landweber, L. F. (2006). SWAKK: a web server for detecting

positive selection in proteins using a sliding window substitution rate analysis.

Nucleic Acids Research, 34(Web Server issue), W382-384.

Librado, P., & Rozas, J. (2009). DnaSP v5: a software for comprehensive analysis of

DNA polymorphism data. Bioinformatics, 25(11), 1451-1452.

Lin, Y. C., Chang, C. W., Hsu, W. B., Tang, C. J. C., Lin, Y. N., Chou, E. J., et al.

(2013). Human microcephaly protein CEP135 binds to hSAS‐6 and CPAP, and

is required for centriole assembly. The EMBO journal, 32(8), 1141-1154.

Lizarraga, S. B., Margossian, S. P., Harris, M. H., Campagna, D. R., Han, A.-P.,

Blevins, S., et al. (2010). Cdk5rap2 regulates centrosome function and

chromosome segregation in neuronal progenitors. Development, 137(11), 1907-

1917.

Löytynoja, A. (2014). Phylogeny-aware alignment with PRANK. In Multiple sequence

alignment methods (pp. 155-170). Springer.

Löytynoja, A., & Goldman, N. (2008). Phylogeny-aware gap placement prevents

errors in sequence alignment and evolutionary analysis. Science, 320(5883),

1632-1635.

Lücking, C., & Brice, A. (2000). Alpha-synuclein and Parkinson's disease. Cellular

and Molecular Life Sciences CMLS, 57(13-14), 1894-1908.

Mahony, D., Parry, D. A., & Lees, E. (1998). Active cdk6 complexes are

predominantly nuclear and represent only a minority of the cdk6 in T cells.

Oncogene, 16(5), 603.

McEvoy, B. P., Powell, J. E., Goddard, M. E., & Visscher, P. M. (2011). Human

population dispersal "Out of Africa" estimated from linkage disequilibrium and

allele frequencies of SNPs. Genome research, 21(6), 821-829.

McHenry, H. M. (1994). Tempo and mode in human evolution. Proceedings of the

National Academy of Sciences, 91(15), 6780-6786.

Memon, M. M., Raza, S. I., Basit, S., Kousar, R., Ahmad, W., & Ansar, M. (2013). A

novel WDR62 mutation causes primary microcephaly in a Pakistani family.

Mol Biol Rep, 40(1), 591-595.

Moawia, A., Shaheen, R., Rasool, S., Waseem, S. S., Ewida, N., Budde, B., et al.

(2017). Mutations of KIF14 cause primary microcephaly by impairing

cytokinesis. Annals of neurology, 82(4), 562-577.

Mochida, G. H., & Walsh, C. A. (2001). Molecular genetics of human microcephaly.

Curr Opin Neurol, 14(2), 151-156.

Montgomery, S., & Mundy, N. (2012). Positive selection on NIN, a gene involved in

neurogenesis, and primate brain evolution. Genes, Brain and Behavior, 11(8),

903-910.

Montgomery, S. H., & Mundy, N. I. (2014). Microcephaly genes evolved adaptively

throughout the evolution of eutherian mammals. BMC evolutionary biology,

14(1), 120.

Mooney, M. (2009). Role of alpha synuclein in noise induced hearing loss.

Moynihan, L., Jackson, A. P., Roberts, E., Karbani, G., Lewis, I., Corry, P., et al.

(2000). A third novel locus for primary autosomal recessive microcephaly

maps to chromosome 9q34. The American Journal of Human Genetics, 66(2),

724-727.

Murdock, D. R., Clark, G. D., Bainbridge, M. N., Newsham, I., Wu, Y. Q., Muzny, D.

M., et al. (2011). Whole-Exome Sequencing Identifies Compound

Heterozygous Mutations in WDR62 in Siblings With Recurrent

Polymicrogyria. American journal of medical genetics Part A, 155A(9), 2071-

2077.

Nakazawa, Y., Hiraki, M., Kamiya, R., & Hirono, M. (2007). SAS-6 is a cartwheel

protein that establishes the 9-fold symmetry of the centriole. Current Biology,

17(24), 2169-2174.

Neubauer, S., Hublin, J.-J., & Gunz, P. (2018). The evolution of modern human brain

shape. Science advances, 4(1), eaao5961.

Nguyen, L. N., Ma, D., Shui, G., Wong, P., Cazenave-Gassiot, A., Zhang, X., et al.

(2014). Mfsd2a is a transporter for the essential omega-3 fatty acid

docosahexaenoic acid. Nature, 509(7501), 503.

Nicholas, A. K., Khurshid, M., Désir, J., Carvalho, O. P., Cox, J. J., Thornton, G., et

al. (2010). WDR62 is associated with the spindle pole and is mutated in human

microcephaly. Nature genetics, 42(11), 1010.

Nielsen, M. S., Vorum, H., Lindersson, E., & Jensen, P. H. (2001). Ca2+ binding to α-

synuclein regulates ligand binding and oligomerization. Journal of Biological

Chemistry, 276(25), 22680-22684.

Nielsen, R., & Yang, Z. (1998). Likelihood models for detecting positively selected

amino acid sites and applications to the HIV-1 envelope gene. Genetics,

148(3), 929-936.

Ninkina, N., Papachroni, K., Robertson, D. C., Schmidt, O., Delaney, L., O'Neill, F., et

al. (2003). Neurons expressing the highest levels of γ-synuclein are unaffected

by targeted inactivation of the gene. Molecular and cellular biology, 23(22),

8233-8245.

Ninkina, N., Peters, O. M., Connor-Robson, N., Lytkina, O., Sharfeddin, E., &

Buchman, V. L. (2012). Contrasting effects of α-synuclein and γ-synuclein on

the phenotype of cysteine string protein α (CSPα) null mutant mice suggest

distinct function of these proteins in neuronal synapses. Journal of Biological

Chemistry, 287(53), 44471-44477.

Nonaka‐Kinoshita, M., Reillo, I., Artegiani, B., Martínez‐Martínez, M. Á., Nelson, M.,

Borrell, V., et al. (2013). Regulation of cerebral cortex size and folding by

expansion of basal progenitors. The EMBO journal, 32(13), 1817-1828.

O'Bleness, M., Searles, V. B., Varki, A., Gagneux, P., & Sikela, J. M. (2012).

Evolution of genetic and genomic features unique to the human lineage. Nature

Reviews Genetics, 13(12), 853-866.

Ohta, T., Essner, R., Ryu, J.-H., Palazzo, R. E., Uetake, Y., & Kuriyama, R. (2002).

Characterization of Cep135, a novel coiled-coil centrosomal protein involved

in microtubule organization in mammalian cells. J Cell Biol, 156(1), 87-100.

Olson, M. V., & Varki, A. (2003). Sequencing the chimpanzee genome: insights into

human evolution and disease. Nat Rev Genet, 4(1), 20-28.

Pan, Z.-Z., Bruening, W., Giasson, B. I., Lee, V. M.-Y., & Godwin, A. K. (2002). γ-

Synuclein promotes cancer cell survival and inhibits stress- and chemotherapy

drug-induced apoptosis by modulating MAPK pathways. Journal of Biological

Chemistry, 277(38), 35050-35060.

Panda, A., Begum, T., & Ghosh, T. C. (2012). Insights into the evolutionary features

of human neurodegenerative diseases. PloS one, 7(10), e48336.

Paramasivam, M., Chang, Y., & LoTurco, J. J. (2007). ASPM and citron kinase co-

localize to the midbody ring during cytokinesis. Cell cycle, 6(13), 1605-1612.

Park, J. S., Lee, M.-K., Kang, S., Jin, Y., Fu, S., Rosales, J. L., et al. (2015). Species-

specific expression of full-length and alternatively spliced variant forms of

CDK5RAP2. PloS one, 10(11), e0142577.

Pattison, L., Crow, Y. J., Deeble, V. J., Jackson, A. P., Jafri, H., Rashid, Y., et al.

(2000). A fifth locus for primary autosomal recessive microcephaly maps to

chromosome 1q31. The American Journal of Human Genetics, 67(6), 1578-

1580.

Pervaiz, N., & Abbasi, A. A. (2016). Molecular evolution of WDR62, a gene that

regulates neocorticogenesis. Meta gene, 9, 1-9.

Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng,

E. C., et al. (2004). UCSF Chimera—a visualization system for exploratory

research and analysis. Journal of computational chemistry, 25(13), 1605-1612.

Pfaff, K. L., Straub, C. T., Chiang, K., Bear, D. M., Zhou, Y., & Zon, L. I. (2007). The

zebra fish cassiopeia mutant reveals that SIL is required for mitotic spindle

organization. Molecular and cellular biology, 27(16), 5887-5897.

Polymeropoulos, M. H., Lavedan, C., Leroy, E., Ide, S. E., Dehejia, A., Dutra, A., et

al. (1997). Mutation in the α-synuclein gene identified in families with

Parkinson's disease. Science, 276(5321), 2045-2047.

Preuss, T. M. (2011). The human brain: rewired and running hot. Annals of the New

York Academy of Sciences, 1225(1).

Preuss, T. M. (2012). Human brain evolution: from gene discovery to phenotype

discovery. Proc Natl Acad Sci U S A, 109 Suppl 1, 10709-10716.

Prüfer, K., Racimo, F., Patterson, N., Jay, F., Sankararaman, S., Sawyer, S., et al.

(2014). The complete genome sequence of a Neanderthal from the Altai

Mountains. Nature, 505(7481), 43.

Pruitt, K. D., Tatusova, T., & Maglott, D. R. (2007). NCBI reference sequences

(RefSeq): a curated non-redundant sequence database of genomes, transcripts

and proteins. Nucleic Acids Research, 35, D61-D65.

Pulvers, J. N., Bryk, J., Fish, J. L., Wilsch-Bräuninger, M., Arai, Y., Schreier, D., et al.

(2010). Mutations in mouse Aspm (abnormal spindle-like microcephaly

associated) cause not only microcephaly but also major defects in the germline.

Proceedings of the National Academy of Sciences, 107(38), 16595-16600.

Rakic, P. (1988). Specification of cerebral cortical areas. Science, 241(4862), 170-176.

Rakic, P. (1995). A small step for the cell, a giant leap for mankind: a hypothesis of

neocortical expansion during evolution. Trends in neurosciences, 18(9), 383-

388.

Rakic, P. (2000). Radial unit hypothesis of neocortical expansion. Paper presented at

the Novartis Foundation Symposium.

Reich, D., Green, R. E., Kircher, M., Krause, J., Patterson, N., Durand, E. Y., et al.

(2010). Genetic history of an archaic hominin group from Denisova Cave in

Siberia. Nature, 468(7327), 1053.

Rhoads, A., & Kenguele, H. (2005). Expression of IQ-motif genes in human cells and

ASPM domain structure. Ethnicity and Disease, 15(4), S5.

Richardson, J., Shaaban, A. M., Kamal, M., Alisary, R., Walker, C., Ellis, I. O., et al.

(2011). Microcephalin is a new novel prognostic indicator in breast cancer

associated with BRCA1 inactivation. Breast cancer research and treatment,

127(3), 639-648.

Rightmire, G. P. (2004). Brain size and encephalization in early to mid‐Pleistocene

Homo. American Journal of Physical Anthropology: The Official Publication

of the American Association of Physical Anthropologists, 124(2), 109-123.

Rightmire, G. P. (2013). Homo erectus and Middle Pleistocene hominins: brain size,

skull form, and species recognition. Journal of human evolution, 65(3), 223-

252.

Rodrigues-Martins, A., Bettencourt-Dias, M., Riparbelli, M., Ferreira, C., Ferreira, I.,

Callaini, G., et al. (2007). DSAS-6 organizes a tube-like centriole precursor,

and its absence suggests modularity in centriole assembly. Current Biology,

17(17), 1465-1472.

Rodrigues-Martins, A., Riparbelli, M., Callaini, G., Glover, D. M., & Bettencourt-

Dias, M. (2008). From centriole biogenesis to cellular function: centrioles are

essential for cell division at critical developmental stages. Cell cycle, 7(1), 11-

16.

Russo, A. A., Tong, L., Lee, J.-O., Jeffrey, P. D., & Pavletich, N. P. (1998). Structural

basis for inhibition of the cyclin-dependent kinase Cdk6 by the tumour

suppressor p16 INK4a. Nature, 395(6699), 237.

Saadi, A., Borck, G., Boddaert, N., Chekkour, M. C., Imessaoudene, B., Munnich, A.,

et al. (2009). Compound heterozygous ASPM mutations associated with

microcephaly and simplified cortical gyration in a consanguineous Algerian

family. European journal of medical genetics, 52(4), 180-184.

Sachidanandam, R., Weissman, D., Schmidt, S. C., Kakol, J. M., Stein, L. D., Marth,

G., et al. (2001). A map of human genome sequence variation containing 1.42

million single nucleotide polymorphisms. Nature, 409(6822), 928-933.

Saitou, N., & Nei, M. (1987). The Neighbor-Joining Method - a New Method for

Reconstructing Phylogenetic Trees. Molecular biology and evolution, 4(4),

406-425.

Sarkisian, M. R., Li, W., Di Cunto, F., D'Mello, S. R., & LoTurco, J. J. (2002). Citron-

Kinase, A Protein Essential to Cytokinesis in Neuronal Progenitors, Is Deleted

in the Flathead Mutant Rat. Journal of Neuroscience, 22(8), RC217-RC217.

Sarkisian, M. R., Rattan, S., D'Mello, S. R., & LoTurco, J. J. (1999). Characterization

of seizures in the flathead rat: a new genetic model of epilepsy in early

postnatal development. Epilepsia, 40(4), 394-400.

Sato, R., Takanashi, J.-i., Tsuyusaki, Y., Kato, M., Saitsu, H., Matsumoto, N., et al.

(2016). Association between invisible basal ganglia and ZNF335 mutations: a

case report. Pediatrics, e20160897.

Schiebel, E. (2000). γ-tubulin complexes: binding to the centrosome, regulation and

microtubule nucleation. Current opinion in cell biology, 12(1), 113-118.

Schoenemann, P. T., Sheehan, M. J., & Glotzer, L. D. (2005). Prefrontal white matter

volume is disproportionately larger in humans than in other primates. Nature

neuroscience, 8(2), 242.

Segura-Ulate, I., Yang, B., Vargas-Medrano, J., & Perez, R. G. (2017). FTY720

(Fingolimod) reverses α-synuclein-induced downregulation of brain-derived

neurotrophic factor mRNA in OLN-93 oligodendroglial cells.

Neuropharmacology, 117, 149-157.

Semendeferi, K., & Damasio, H. (2000). The brain and its main anatomical

subdivisions in living hominoids using magnetic resonance imaging. Journal of

human evolution, 38(2), 317-332.

Shaheen, R., Hashem, A., Abdel-Salam, G. M., Al-Fadhli, F., Ewida, N., & Alkuraya,

F. S. (2016). Mutations in CIT, encoding citron rho-interacting serine/threonine

kinase, cause severe primary microcephaly in humans. Human genetics,

135(10), 1191-1197.

Sheik, S., Sundararajan, P., Hussain, A., & Sekar, K. (2002). Ramachandran plot on

the web. Bioinformatics, 18(11), 1548-1549.

Sherry, S. T., Ward, M. H., Kholodov, M., Baker, J., Phan, L., Smigielski, E. M., et al.

(2001). dbSNP: the NCBI database of genetic variation. Nucleic Acids

Research, 29(1), 308-311.

Siddiqui, I. J., Pervaiz, N., & Abbasi, A. A. (2016). The Parkinson Disease gene

SNCA: Evolutionary and structural insights with pathological implication.

Scientific Reports, 6, 24475.

Sidhu, A., Wersinger, C., Moussa, C. E. H., & Vernier, P. (2004). The role of α‐

synuclein in both neuroprotection and neurodegeneration. Annals of the New

York Academy of Sciences, 1035(1), 250-270.

Silke, A. C., Carles, V. G., Encarnacion, M., Sherman, H., Yu, I., Shah, B., et al.

(2013). Alpha synuclein p.H50Q, a novel pathogenic mutation for Parkinson's

disease. Movement Disorders, 28(6), 811-813.

Simonetti, F. L., Teppa, E., Chernomoretz, A., Nielsen, M., & Marino Buslje, C.

(2013). MISTIC: mutual information server to infer coevolution. Nucleic Acids

Research, 41(W1), W8-W14.

Singleton, A., & Hardy, J. (2016). The evolution of genetics: Alzheimer's and

Parkinson's diseases. Neuron, 90(6), 1154-1163.

Smith, C. M., Finger, J. H., Hayamizu, T. F., McCright, I. J., Eppig, J. T., Kadin, J. A.,

et al. (2006). The mouse gene expression database (GXD): 2007 update.

Nucleic Acids Research, 35(suppl_1), D618-D623.

Smith, W. W. (2005). Endoplasmic reticulum stress and mitochondrial cell death

pathways mediate A53T mutant alpha-synuclein-induced toxicity. Human

Molecular Genetics, 14(24), 3801-3811.

Solano, S. M., Miller, D. W., Augood, S. J., Young, A. B., & Penney, J. B. (2000).

Expression of α‐synuclein, parkin, and ubiquitin carboxy‐terminal hydrolase

L1 mRNA in human brain: genes associated with familial Parkinson's disease.

Annals of neurology, 47(2), 201-210.

Spillantini, M. G., Divane, A., & Goedert, M. (1995). Assignment of human α-

synuclein (SNCA) and β-synuclein (SNCB) genes to chromosomes 4q21 and

5q35. Genomics, 27(2), 379-381.

Stancik, E. K., Navarro-Quiroga, I., Sellke, R., & Haydar, T. F. (2010). Heterogeneity

in ventricular zone neural precursors contributes to neuronal fate diversity in

the postnatal neocortex. Journal of Neuroscience, 30(20), 7028-7036.

Stephan, H., Frahm, H., & Baron, G. (1981). New and revised data on volumes of

brain structures in insectivores and primates. Folia primatologica, 35(1), 1-29.

Storey, J. D., Taylor, J. E., & Siegmund, D. (2004). Strong control, conservative point

estimation and simultaneous conservative consistency of false discovery rates:

a unified approach. Journal of the Royal Statistical Society: Series B

(Statistical Methodology), 66(1), 187-205.

Storey, J. D., & Tibshirani, R. (2003). Statistical significance for genomewide studies.

Proceedings of the National Academy of Sciences, 100(16), 9440-9445.

Stouffs, K., Stergachis, A., Vanderhasselt, T., Dica, A., Janssens, S., Vandervore, L.,

et al. (2018). Expanding the clinical spectrum of biallelic ZNF335 variants.

Clinical genetics.

Strnad, P., Leidel, S., Vinogradova, T., Euteneuer, U., Khodjakov, A., & Gönczy, P.

(2007). Regulated HsSAS-6 levels ensure formation of a single procentriole per

centriole during the centrosome duplication cycle. Developmental cell, 13(2),

203-213.

Sukumaran, S. K., Stumpf, M., Salamon, S., Ahmad, I., Bhattacharya, K., Fischer, S.,

et al. (2017). CDK5RAP2 interaction with components of the Hippo signaling

pathway may play a role in primary microcephaly. Molecular Genetics and

Genomics, 292(2), 365-383.

Surguchov, A., McMahan, B., Masliah, E., & Surgucheva, I. (2001). Synucleins in

ocular tissues. Journal of neuroscience research, 65(1), 68-77.

Swanson, W. J., Nielsen, R., & Yang, Q. (2003). Pervasive adaptive evolution in

mammalian fertilization proteins. Molecular biology and evolution, 20(1), 18-

20.

Szczepanski, S., Hussain, M. S., Sur, I., Altmüller, J., Thiele, H., Abdullah, U., et al.

(2016). A novel homozygous splicing mutation of CASC5 causes primary

microcephaly in a large Pakistani family. Human genetics, 135(2), 157-170.

Tajima, F. (1989). Statistical method for testing the neutral mutation hypothesis by

DNA polymorphism. Genetics, 123(3), 585-595.

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., & Kumar, S. (2011).

MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum

Likelihood, Evolutionary Distance, and Maximum Parsimony Methods.

Molecular biology and evolution, 28(10), 2731-2739.

Tang, C. J. C., Lin, S. Y., Hsu, W. B., Lin, Y. N., Wu, C. T., Lin, Y. C., et al. (2011).

The human microcephaly protein STIL interacts with CPAP and is required for

procentriole formation. The EMBO journal, 30(23), 4790-4804.

R Core Team. (2018). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Teppa, E., Wilkins, A. D., Nielsen, M., & Buslje, C. M. (2012). Disentangling

evolutionary signals: conservation, specificity determining positions and

coevolution. Implication for catalytic residue prediction. BMC bioinformatics,

13(1), 235.

Thompson, J. D., Higgins, D. G., & Gibson, T. J. (1994). Clustal-W - Improving the

Sensitivity of Progressive Multiple Sequence Alignment through Sequence

Weighting, Position-Specific Gap Penalties and Weight Matrix Choice. Nucleic

Acids Research, 22(22), 4673-4680.

Trimborn, M., Bell, S. M., Felix, C., Rashid, Y., Jafri, H., Griffiths, P. D., et al. (2004).

Mutations in microcephalin cause aberrant regulation of chromosome

condensation. The American Journal of Human Genetics, 75(2), 261-266.

Ulmer, T. S., Bax, A., Cole, N. B., & Nussbaum, R. L. (2005). Structure and dynamics

of micelle-bound human α-synuclein. Journal of Biological Chemistry,

280(10), 9595-9603.

Uversky, V. N., & Fink, A. L. (2002). Amino acid determinants of alpha synuclein

aggregation: putting together pieces of the puzzle. FEBS Letters, 522, 9-13.

Vallender, E. J., Mekel-Bobrov, N., & Lahn, B. T. (2008). Genetic basis of human

brain evolution. Trends in Neurosciences, 31(12), 637-644.

van Breugel, M., Hirono, M., Andreeva, A., Yanagisawa, H.-a., Yamaguchi, S.,

Nakazawa, Y., et al. (2011). Structures of SAS-6 suggest its organization in

centrioles. Science, 331(6021), 1196-1199.

Vargas, K. J., Makani, S., Davis, T., Westphal, C. H., Castillo, P. E., & Chandra, S. S.

(2014). Synucleins regulate the kinetics of synaptic vesicle endocytosis.

Journal of Neuroscience, 34(28), 9364-9376.

Vargas, K. J., Schrod, N., Davis, T., Fernandez-Busnadiego, R., Taguchi, Y. V.,

Laugks, U., et al. (2017). Synucleins have multiple effects on presynaptic

architecture. Cell reports, 18(1), 161-173.

Vekrellis, K., Xilouri, M., Emmanouilidou, E., Rideout, H. J., & Stefanis, L. (2011).

Pathological roles of α-synuclein in neurological disorders. The Lancet

Neurology, 10(11), 1015-1025.

Venkatesh, B., Lee, A. P., Ravi, V., Maurya, A. K., Lian, M. M., Swann, J. B., et al.

(2014). Elephant shark genome provides unique insights into gnathostome

evolution. Nature, 505(7482), 174.

Wang, Y.-q., Qian, Y.-p., Yang, S., Shi, H., Liao, C.-h., Zheng, H.-K., et al. (2005).

Accelerated evolution of the pituitary adenylate cyclase-activating polypeptide

precursor gene during human origin. Genetics, 170(2), 801-806.

Wang, Y.-q., & Su, B. (2004). Molecular evolution of microcephalin, a gene

determining human brain size. Human molecular genetics, 13(11), 1131-1137.

Weadick, C. J., & Chang, B. S. (2011). An improved likelihood ratio test for detecting

site-specific functional divergence among clades of protein-coding genes.

Molecular biology and evolution, 29(5), 1297-1300.

Webb, B., & Sali, A. (2014). Comparative protein structure modeling using Modeller.

Current protocols in bioinformatics.

Whelan, S., & Goldman, N. (2001). A general empirical model of protein evolution

derived from multiple protein families using a maximum-likelihood approach.

Molecular biology and evolution, 18(5), 691-699.

Wollnik, B. (2010). A common mechanism for microcephaly. Nature genetics, 42(11),

923-924.

Wong, W. S., Yang, Z., Goldman, N., & Nielsen, R. (2004). Accuracy and power of

statistical methods for detecting adaptive evolution in protein coding sequences

and for identifying positively selected sites. Genetics, 168(2), 1041-1051.

Woods, C. G., Bond, J., & Enard, W. (2005). Autosomal recessive primary

microcephaly (MCPH): a review of clinical, molecular, and evolutionary

findings. Am J Hum Genet, 76(5), 717-728.

Xu, B., & Yang, Z. (2013). PAMLX: a graphical user interface for PAML. Molecular

biology and evolution, 30(12), 2723-2724.

Xu, S., Sun, X., Niu, X., Zhang, Z., Tian, R., Ren, W., et al. (2017). Genetic basis of

brain size evolution in cetaceans: insights from adaptive evolution of seven

primary microcephaly (MCPH) genes. BMC evolutionary biology, 17(1), 206.

Yang, Y. J., Baltus, A. E., Mathew, R. S., Murphy, E. A., Evrony, G. D., Gonzalez, D.

M., et al. (2012). Microcephaly gene links trithorax and REST/NRSF to control

neural stem cell proliferation and differentiation. Cell, 151(5), 1097-1112.

Yang, Z. (2007). PAML 4: phylogenetic analysis by maximum likelihood. Molecular

biology and evolution, 24(8), 1586-1591.

Yang, Z., Kumar, S., & Nei, M. (1995). A new method of inference of ancestral

nucleotide and amino acid sequences. Genetics, 141(4), 1641-1650.

Yang, Z., Nielsen, R., Goldman, N., & Pedersen, A.-M. K. (2000). Codon-substitution

models for heterogeneous selection pressure at amino acid sites. Genetics,

155(1), 431-449.

Yang, Z., & Swanson, W. J. (2002). Codon-substitution models to detect adaptive

evolution that account for heterogeneous selective pressures among site

classes. Molecular biology and evolution, 19(1), 49-57.

Yang, Z., Wong, W. S., & Nielsen, R. (2005). Bayes empirical Bayes inference of

amino acid sites under positive selection. Molecular biology and evolution,

22(4), 1107-1118.

Yu, X., Chini, C. C. S., He, M., Mer, G., & Chen, J. (2003). The BRCT domain is a

phospho-protein binding domain. Science, 302(5645), 639-642.

Zarranz, J. J., Alegre, J., Gómez‐Esteban, J. C., Lezcano, E., Ros, R., Ampuero, I., et

al. (2004). The new mutation, E46K, of α-synuclein causes Parkinson and

Lewy body dementia. Annals of neurology, 55(2), 164-173.

Zhang, J. (2000). Rates of conservative and radical nonsynonymous nucleotide

substitutions in mammalian nuclear genes. Journal of Molecular Evolution,

50(1), 56-68.

Zhang, J., Nielsen, R., & Yang, Z. (2005). Evaluation of an improved branch-site

likelihood method for detecting positive selection at the molecular level.

Molecular biology and evolution, 22(12), 2472-2479.

Zhang, X., Liu, D., Lv, S., Wang, H., Zhong, X., Liu, B., et al. (2009). CDK5RAP2 is

required for spindle checkpoint function. Cell cycle, 8(8), 1206-1216.

Zhong, X., Liu, L., Zhao, A., Pfeifer, G. P., & Xu, X. (2005). The abnormal spindle-

like, microcephaly-associated (ASPM) gene encodes a centrosomal protein.

Cell cycle, 4(9), 1227-1229.

Zollikofer, C. P., de León, M. S. P., Lieberman, D. E., Guy, F., Pilbeam, D., Likius,

A., et al. (2005). Virtual cranial reconstruction of Sahelanthropus tchadensis.

Nature, 434(7034), 755.

Zuckerkandl, E., & Pauling, L. (1965). Evolutionary divergence and convergence in

proteins. In Evolving genes and proteins (pp. 97-166): Elsevier.

PUBLICATIONS

Pervaiz, N., & Abbasi, A. A. (2016). Molecular evolution of WDR62, a gene

that regulates neocorticogenesis. Meta gene, 9, 1-9.

Pervaiz, N., Shakeel, N., Qasim, A., Zehra, R., Anwar, S., Rana, N., ... &

Abbasi, A. A. (2019). Evolutionary history of the human multigene families

reveals widespread gene duplications throughout the history of animals. BMC

Evolutionary Biology, 19(1), 128.

Siddiqui, I. J., Pervaiz, N., & Abbasi, A. A. (2016). The Parkinson Disease

gene SNCA: Evolutionary and structural insights with pathological

implication. Scientific reports, 6, 24475.

Seemab, S., Pervaiz, N., Zehra, R., Anwar, S., Bao, Y., & Abbasi, A. A.

(2019). Molecular evolutionary and structural analysis of familial exudative

vitreoretinopathy associated FZD4 gene. BMC evolutionary biology, 19(1), 72.

Ma, L., Cao, J., Liu, L., Li, Z., Shireen, H., Pervaiz, N., ... & Abbasi, A. A.

(2019). Community Curation and Expert Curation of Human Long Noncoding

RNAs with LncRNAWiki and LncBook. Current Protocols in

Bioinformatics, 67(1), e82.
