Post on 02-Jul-2018
1
A contribution of novel CNVs to schizophrenia from a genome-wide study of 41,321 subjects
CNV Analysis Group and the Schizophrenia Working Group of the Psychiatric Genomics Consortium
Christian R. Marshall1*, Daniel P. Howrigan2,3*, Daniele Merico1*, Bhooma Thiruvahindrapuram1, Wenting Wu4,5, Douglas S. Greer4,5, Danny Antaki4,5, Aniket Shetty4,5, Peter A. Holmans6,7, Dalila Pinto8,9, Madhusudan Gujral4,5, William M. Brandler4,5, Dheeraj Malhotra4,5,10, Zhouzhi Wang1, Karin V. Fuentes Fajarado4,5, Stephan Ripke2,3, Ingrid Agartz11,12,13, Esben Agerbo14,15,16, Margot Albus17, Madeline Alexander18, Farooq Amin19,20, Joshua Atkins21,22, Silviu A. Bacanu23, Richard A. Belliveau Jr3, Sarah E. Bergen3,24, Marcelo Bertalan16,25, Elizabeth Bevilacqua3, Tim B. Bigdeli23, Donald W. Black26, Richard Bruggeman27, Nancy G. Buccola28, Randy L. Buckner29,30,31, Brendan Bulik-Sullivan2,3, William Byerley32, Wiepke Cahn33, Guiqing Cai8,34, Murray J. Cairns21,35,36, Dominique Campion37, Rita M. Cantor38, Vaughan J. Carr35,39, Noa Carrera6, Stanley V. Catts35,40, Kimberley D. Chambert3, Wei Cheng41, C. Robert Cloninger42, David Cohen43, Paul Cormican44, Nick Craddock6,7, Benedicto Crespo-Facorro45,46, James J. Crowley47, David Curtis48,49, Michael Davidson50, Kenneth L. Davis8, Franziska Degenhardt51,52, Jurgen Del Favero53, Lynn E. DeLisi54,55, Ditte Demontis16,56,57, Dimitris Dikeos58, Timothy Dinan59, Srdjan Djurovic11,60, Gary Donohoe44,61, Elodie Drapeau8, Jubao Duan62,63, Frank Dudbridge64, Peter Eichhammer65, Johan Eriksson66,67,68, Valentina Escott-Price6, Laurent Essioux69, Ayman H. Fanous70,71,72,73, Kai-How Farh2, Martilias S. Farrell47, Josef Frank74, Lude Franke75, Robert Freedman76, Nelson B. Freimer77, Joseph I. Friedman8, Andreas J. Forstner51,52, Menachem Fromer2,3,78,79, Giulio Genovese3, Lyudmila Georgieva6, Elliot S. Gershon80, Ina Giegling81,82, Paola Giusti-Rodríguez47, Stephanie Godard83, Jacqueline I. Goldstein2,84, Jacob Gratten85, Lieuwe de Haan86, Marian L. Hamshere6, Mark Hansen87, Thomas Hansen16,25, Vahram Haroutunian8,88,89, Annette M. Hartmann81, Frans A. Henskens35,36,90, Stefan Herms51,52,91, Joel N. Hirschhorn84,92,93, Per Hoffmann51,52,91, Andrea Hofman51,52, Mads V. Hollegaard94, David M. Hougaard94, Hailiang Huang2,84, Masashi Ikeda95, Inge Joa96, Anna K Kähler24, René S Kahn33, Luba Kalaydjieva97,167, Juha Karjalainen75, David Kavanagh6, Matthew C. Keller99, Brian J. Kelly36, James L. Kennedy100,101,102, Yunjung Kim47, James A. Knowles103, Bettina Konte81, Claudine Laurent18,104, Phil Lee2,3,79, S. Hong Lee85, Sophie E. Legge6, Bernard Lerer105, Deborah L. Levy55,106, Kung-Yee Liang107, Jeffrey Lieberman108, Jouko Lönnqvist109, Carmel M. Loughland35,36, Patrik K. E. Magnusson24, Brion S. Maher110, Wolfgang Maier111, Jacques Mallet112, Manuel Mattheisen16,56,57,113, Morten Mattingsdal11,114, Robert W McCarley54,55, Colm McDonald115, Andrew M. McIntosh116,117, Sandra Meier74, Carin J. Meijer86, Ingrid Melle11,118, Raquelle I. Mesholam-Gately55,119, Andres Metspalu120, Patricia T. Michie35,121, Lili Milani120, Vihra Milanova122, Younes Mokrab123, Derek W. Morris44,61, Ole Mors16,57,124, Bertram Müller-Myhsok125,126,127, Kieran C. Murphy128, Robin M. Murray129, Inez Myin-Germeys130, Igor Nenadic131, Deborah A. Nertney132, Gerald Nestadt133, Kristin K. Nicodemus134, Laura Nisenbaum135, Annelie Nordin136, Eadbhard O'Callaghan137, Colm O'Dushlaine3, Sang-Yun Oh138, Ann Olincy76, Line Olsen16,25, F. Anthony O'Neill139, Jim Van Os130,140, Christos Pantelis35,141, George N. Papadimitriou58, Elena Parkhomenko8, Michele T. Pato103, Tiina Paunio142, Psychosis Endophenotypes International Consortium, Diana O. Perkins143, Tune H. Pers84,93,144, Olli Pietiläinen142,145, Jonathan Pimm49, Andrew J. Pocklington6, John Powell129, Alkes Price84,146, Ann E. Pulver133, Shaun M. Purcell78, Digby Quested147, Henrik B. Rasmussen16,25, Abraham Reichenberg8,89, Mark A. Reimers23, Alexander L. Richards6,7, Joshua L. Roffman30,31, Panos Roussos78,148, Douglas M. Ruderfer6,78, Veikko Salomaa67, Alan R. Sanders62,63, Adam Savitz149, Ulrich Schall35,36, Thomas G. Schulze74,150, Sibylle G. Schwab151, Edward M. Scolnick3, Rodney J. Scott21,35,152, Larry J. Seidman55,119, Jianxin Shi153, Jeremy M. Silverman8,154, Jordan W. Smoller3,79, Erik Söderman13, Chris C. A. Spencer155, Eli A. Stahl78,84, Eric Strengman33,156, Jana Strohmaier74, T. Scott Stroup108, Jaana Suvisaari109, Dragan M. Svrakic42, Jin P. Szatkiewicz47, Srinivas Thirumalai157, Paul A. Tooney21,35,36, Juha Veijola158,159, Peter M. Visscher85, John Waddington160, Dermot Walsh161, Bradley T. Webb23, Mark Weiser50, Dieter B. Wildenauer98, Nigel M. Williams6, Stephanie Williams47, Stephanie H. Witt74, Aaron R. Wolen23, Brandon K. Wormley23, Naomi R Wray85, Jing Qin Wu21,35, Clement C. Zai100,101, Wellcome Trust Case-Control Consortium 2, Rolf Adolfsson136, Ole A. Andreassen11,118, Douglas H. R. Blackwood116, Anders D. Børglum16,56,57,124, Elvira Bramon162, Joseph D. Buxbaum8,34,89,163, Sven Cichon51,52,91,164, David A. Collier123,165, Aiden Corvin44, Mark J. Daly2,3,84, Ariel Darvasi166, Enrico Domenici10, Tõnu Esko84,92,93,120, Pablo V. Gejman62,63, Michael Gill44, Hugh Gurling49, Christina M. Hultman24, Nakao Iwata95, Assen V. Jablensky35,98,167,168, Erik G Jönsson11,13, Kenneth S Kendler23, George Kirov6, Jo Knight100,101,102, Douglas F. Levinson18, Qingqin S Li149, Steven A McCarroll3,92, Andrew McQuillin49, Jennifer L. Moran3, Preben B. Mortensen14,15,16, Bryan J. Mowry85,132, Markus M. Nöthen51,52, Roel A. Ophoff33,38,77, Michael J. Owen6,7, Aarno Palotie3,79,145, Carlos N. Pato103, Tracey L. Petryshen3,55,169, Danielle Posthuma170,171,172, Marcella Rietschel74, Brien P. Riley23, Dan Rujescu81,82, Pamela Sklar78,89,148, David St. Clair173, James T. R. Walters6, Thomas Werge16,25,174, Patrick F. Sullivan24,47,143, Michael C O’Donovan6,7†, Stephen W. Scherer1,175†, Benjamin M. Neale2,3,79,84†, Jonathan Sebat4,5,176†
* these authors contributed equally
† these authors co-supervised the study
Correspondence: jsebat@ucsd.edu
1 The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
2 Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
3 Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
4 Beyster Center for Psychiatric Genomics, University of California, San Diego, La Jolla, CA 92093, USA
5 Department of Psychiatry, University of California, San Diego, La Jolla, CA 92093, USA
6 MRC Centre for Neuropsychiatric Genetics and Genomics, Institute of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff, CF24 4HQ, UK
7 National Centre for Mental Health, Cardiff University, Cardiff, CF24 4HQ, UK
8 Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
9 Department of Genetics and Genomic Sciences, Seaver Autism Center, The Mindich Child Health & Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
10 Neuroscience Discovery and Translational Area, Pharma Research & Early Development, F. Hoffmann-La Roche Ltd, CH-4070 Basel, Switzerland
11 NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, 0424 Oslo, Norway
12 Department of Psychiatry, Diakonhjemmet Hospital, 0319 Oslo, Norway
13 Department of Clinical Neuroscience, Psychiatry Section, Karolinska Institutet, SE-17176 Stockholm, Sweden
14 National Centre for Register-based Research, Aarhus University, DK-8210 Aarhus, Denmark
15 Centre for Integrative Register-based Research, CIRRAU, Aarhus University, DK-8210 Aarhus, Denmark
16 The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Denmark
17 State Mental Hospital, 85540 Haar, Germany
18 Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, California 94305, USA
19 Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, Georgia 30322, USA
20 Department of Psychiatry and Behavioral Sciences, Atlanta Veterans Affairs Medical Center, Atlanta, Georgia 30033, USA
21 School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan NSW 2308, Australia
22 Hunter Medical Research Institute, New Lambton, New South Wales, Australia
23 Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia 23298, USA
24 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm SE-17177, Sweden
25 Institute of Biological Psychiatry, Mental Health Centre Sct. Hans, Mental Health Services Copenhagen, DK-4000, Denmark
26 Department of Psychiatry, University of Iowa Carver College of Medicine, Iowa City, Iowa 52242, USA
27 University Medical Center Groningen, Department of Psychiatry, University of Groningen, NL-9700 RB, The Netherlands
28 School of Nursing, Louisiana State University Health Sciences Center, New Orleans, Louisiana 70112, USA
29 Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138, USA
30 Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
31 Athinoula A. Martinos Center, Massachusetts General Hospital, Boston, Massachusetts 02129, USA
32 Department of Psychiatry, University of California at San Francisco, San Francisco, California, 94143 USA
33 University Medical Center Utrecht, Department of Psychiatry, Rudolf Magnus Institute of Neuroscience, 3584 Utrecht, The Netherlands
34 Department of Human Genetics, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
35 Schizophrenia Research Institute, Sydney NSW 2010, Australia
36 Priority Centre for Translational Neuroscience and Mental Health, University of Newcastle, Newcastle NSW 2300, Australia
37 Centre Hospitalier du Rouvray and INSERM U1079 Faculty of Medicine, 76301 Rouen, France
38 Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
39 School of Psychiatry, University of New South Wales, Sydney NSW 2031, Australia
40 Royal Brisbane and Women's Hospital, University of Queensland, Brisbane QLD 4072, Australia
41 Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina 27514, USA
42 Department of Psychiatry, Washington University, St. Louis, Missouri 63110, USA
43 Department of Child and Adolescent Psychiatry, Assistance Publique Hospitaux de Paris, Pierre and Marie Curie Faculty of Medicine and Institute for Intelligent Systems and Robotics, Paris, 75013, France
44 Neuropsychiatric Genetics Research Group, Department of Psychiatry, Trinity College Dublin, Dublin 8, Ireland
45 University Hospital Marqués de Valdecilla, Instituto de Formación e Investigación Marqués de Valdecilla, University of Cantabria, E-39008 Santander, Spain
46 Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
47 Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27599-7264, USA
48 Department of Psychological Medicine, Queen Mary University of London, London E1 1BB, UK
49 Molecular Psychiatry Laboratory, Division of Psychiatry, University College London, London WC1E 6JJ, UK
50 Sheba Medical Center, Tel Hashomer 52621, Israel
51 Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany
52 Department of Genomics, Life and Brain Center, D-53127 Bonn, Germany
53 Applied Molecular Genomics Unit, VIB Department of Molecular Genetics, University of Antwerp, B-2610 Antwerp, Belgium
54 VA Boston Health Care System, Brockton, Massachusetts 02301, USA
55 Department of Psychiatry, Harvard Medical School, Boston, Massachusetts 02115, USA
56 Department of Biomedicine, Aarhus University, DK-8000 Aarhus C, Denmark
57 Centre for Integrative Sequencing, iSEQ, Aarhus University, DK-8000 Aarhus C, Denmark
58 First Department of Psychiatry, University of Athens Medical School, Athens 11528, Greece
59 Department of Psychiatry, University College Cork, Co. Cork, Ireland
60 Department of Medical Genetics, Oslo University Hospital, 0424 Oslo, Norway
61 Cognitive Genetics and Therapy Group, School of Psychology and Discipline of Biochemistry, National University of Ireland Galway, Co. Galway, Ireland
62 Department of Psychiatry and Behavioral Sciences, NorthShore University HealthSystem, Evanston, Illinois 60201, USA
63 Department of Psychiatry and Behavioral Neuroscience, University of Chicago, Chicago, Illinois 60637, USA
64 Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
65 Department of Psychiatry, University of Regensburg, 93053 Regensburg, Germany
66 Folkhälsan Research Center, Helsinki, Finland, Biomedicum Helsinki 1, Haartmaninkatu 8, FI-00290, Helsinki, Finland
67 National Institute for Health and Welfare, P.O. BOX 30, FI-00271 Helsinki, Finland
68 Department of General Practice, Helsinki University Central Hospital, University of Helsinki P.O. BOX 20, Tukholmankatu 8 B, FI-00014, Helsinki, Finland
69 Translational Technologies and Bioinformatics, Pharma Research and Early Development, F. Hoffmann-La Roche, CH-4070 Basel, Switzerland
70 Mental Health Service Line, Washington VA Medical Center, Washington DC 20422, USA
71 Department of Psychiatry, Georgetown University, Washington DC 20057, USA
72 Department of Psychiatry, Virginia Commonwealth University, Richmond, Virginia 23298, USA
73 Department of Psychiatry, Keck School of Medicine at University of Southern California, Los Angeles, California 90033, USA
74 Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Heidelberg, D-68159 Mannheim, Germany
75 Department of Genetics, University of Groningen, University Medical Centre Groningen, 9700 RB Groningen, The Netherlands
76 Department of Psychiatry, University of Colorado Denver, Aurora, Colorado 80045, USA
77 Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, California 90095, USA
78 Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
79 Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
80 Departments of Psychiatry and Human Genetics, University of Chicago, Chicago, Illinois 60637 USA
81 Department of Psychiatry, University of Halle, 06112 Halle, Germany
82 Department of Psychiatry, University of Munich, 80336, Munich, Germany
83 Departments of Psychiatry and Human and Molecular Genetics, INSERM, Institut de Myologie, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France
84 Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
85 Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
86 Academic Medical Centre University of Amsterdam, Department of Psychiatry, 1105 AZ Amsterdam, The Netherlands
87 Illumina, La Jolla, California 92122, USA
88 J. J. Peters VA Medical Center, Bronx, New York, New York 10468, USA
89 Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
90 School of Electrical Engineering and Computer Science, University of Newcastle, Newcastle NSW 2308, Australia
91 Division of Medical Genetics, Department of Biomedicine, University of Basel, Basel, CH-4058, Switzerland
92 Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
93 Division of Endocrinology and Center for Basic and Translational Obesity Research, Boston Children's Hospital, Boston, Massachusetts 02115, USA
94 Section of Neonatal Screening and Hormones, Department of Clinical Biochemistry, Immunology and Genetics, Statens Serum Institut, Copenhagen, DK-2300, Denmark
95 Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Aichi, 470-1192, Japan
96 Regional Centre for Clinical Research in Psychosis, Department of Psychiatry, Stavanger University Hospital, 4011 Stavanger, Norway
97 Centre for Medical Research, The University of Western Australia, Perth, WA 6009, Australia
98 School of Psychiatry and Clinical Neurosciences, The University of Western Australia, Perth, WA 6009, Australia
99 Department of Psychology, University of Colorado Boulder, Boulder, Colorado 80309, USA
100 Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, Ontario, M5T 1R8, Canada
101 Department of Psychiatry, University of Toronto, Toronto, Ontario, M5T 1R8, Canada
102 Institute of Medical Science, University of Toronto, Toronto, Ontario, M5S 1A8, Canada
103 Department of Psychiatry and Zilkha Neurogenetics Institute, Keck School of Medicine at University of Southern California, Los Angeles, California 90089, USA
104 Department of Child and Adolescent Psychiatry, Pierre and Marie Curie Faculty of Medicine, Paris 75013, France
105 Department of Psychiatry, Hadassah-Hebrew University Medical Center, Jerusalem 91120, Israel
106 Psychology Research Laboratory, McLean Hospital, Belmont, MA
107 Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland 21205, USA
108 Department of Psychiatry, Columbia University, New York, New York 10032, USA
109 Department of Mental Health and Substance Abuse Services, National Institute for Health and Welfare, P.O. BOX 30, FI-00271 Helsinki, Finland
110 Department of Mental Health, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21205, USA
111 Department of Psychiatry, University of Bonn, D-53127 Bonn, Germany
112 Centre National de la Recherche Scientifique, Laboratoire de Génétique Moléculaire de la Neurotransmission et des Processus Neurodégénératifs, Hôpital de la Pitié Salpêtrière, 75013, Paris, France
113 Department of Genomics Mathematics, University of Bonn, D-53127 Bonn, Germany
114 Research Unit, Sørlandet Hospital, 4604 Kristiansand, Norway
115 Department of Psychiatry, National University of Ireland Galway, Co. Galway, Ireland
116 Division of Psychiatry, University of Edinburgh, Edinburgh EH16 4SB, UK
117 Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh EH16 4SB, UK
118 Division of Mental Health and Addiction, Oslo University Hospital, 0424 Oslo, Norway
119 Massachusetts Mental Health Center Public Psychiatry Division of the Beth Israel Deaconess Medical Center, Boston, Massachusetts 02114, USA
120 Estonian Genome Center, University of Tartu, Tartu 50090, Estonia
121 School of Psychology, University of Newcastle, Newcastle NSW 2308, Australia
122 First Psychiatric Clinic, Medical University, Sofia 1431, Bulgaria
123 Eli Lilly and Company Limited, Erl Wood Manor, Sunninghill Road, Windlesham, Surrey, GU20 6PH UK
124 Department P, Aarhus University Hospital, DK-8240 Risskov, Denmark
125 Max Planck Institute of Psychiatry, 80336 Munich, Germany
126 Institute of Translational Medicine, University of Liverpool, Liverpool L69 3BX, UK
127 Munich Cluster for Systems Neurology (SyNergy), 80336 Munich, Germany
128 Department of Psychiatry, Royal College of Surgeons in Ireland, Dublin 2, Ireland
129 King's College London, London SE5 8AF, UK
130 Maastricht University Medical Centre, South Limburg Mental Health Research and Teaching Network, EURON, 6229 HX Maastricht, The Netherlands
131 Department of Psychiatry and Psychotherapy, Jena University Hospital, 07743 Jena, Germany
132 Queensland Centre for Mental Health Research, University of Queensland, Brisbane QLD 4076, Australia
133 Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
134 Department of Psychiatry, Trinity College Dublin, Dublin 2, Ireland
135 Eli Lilly and Company, Lilly Corporate Center, Indianapolis, 46285 Indiana, USA
136 Department of Clinical Sciences, Psychiatry, Umeå University, SE-90187 Umeå, Sweden
137 DETECT Early Intervention Service for Psychosis, Blackrock, Co. Dublin, Ireland
138 Lawrence Berkeley National Laboratory, University of California at Berkeley, Berkeley, California 94720, USA
139 Centre for Public Health, Institute of Clinical Sciences, Queen's University Belfast, Belfast BT12 6AB, UK
140 Institute of Psychiatry, King's College London, London SE5 8AF, UK
141 Melbourne Neuropsychiatry Centre, University of Melbourne & Melbourne Health, Melbourne VIC 3053, Australia
142 Public Health Genomics Unit, National Institute for Health and Welfare, P.O. BOX 30, FI-00271 Helsinki, Finland
143 Department of Psychiatry, University of North Carolina, Chapel Hill, North Carolina 27599-7160, USA
144 Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800, Denmark
145 Institute for Molecular Medicine Finland, FIMM, University of Helsinki, P.O. BOX 20 FI-00014, Helsinki, Finland
146 Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts 02115, USA
147 Department of Psychiatry, University of Oxford, Oxford, OX3 7JX, UK
148 Institute for Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
149 Neuroscience Therapeutic Area, Janssen Research and Development, Raritan, New Jersey 08869, USA
150 Department of Psychiatry and Psychotherapy, University of Göttingen, 37073 Göttingen, Germany
151 Psychiatry and Psychotherapy Clinic, University of Erlangen, 91054 Erlangen, Germany
152 Hunter New England Health Service, Newcastle NSW 2308, Australia
153 Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland 20892, USA
154 Research and Development, Bronx Veterans Affairs Medical Center, New York, New York 10468, USA
155 Wellcome Trust Centre for Human Genetics, Oxford, OX3 7BN, UK
156 Department of Medical Genetics, University Medical Centre Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands
157 Berkshire Healthcare NHS Foundation Trust, Bracknell RG12 1BQ, UK
158 Department of Psychiatry, University of Oulu, P.O. BOX 5000, 90014, Finland
159 University Hospital of Oulu, P.O. BOX 20, 90029 OYS, Finland
160 Molecular and Cellular Therapeutics, Royal College of Surgeons in Ireland, Dublin 2, Ireland
161 Health Research Board, Dublin 2, Ireland
162 University College London, London WC1E 6BT, UK
163 Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
164 Institute of Neuroscience and Medicine (INM-1), Research Center Juelich, 52428 Juelich, Germany
165 Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, London, SE5 8AF, UK
166 Department of Genetics, The Hebrew University of Jerusalem, 91905 Jerusalem, Israel
167 The Perkins Institute for Medical Research, The University of Western Australia, Perth, WA 6009, Australia
168 Centre for Clinical Research in Neuropsychiatry, School of Psychiatry and Clinical Neurosciences, The University of Western Australia, Medical Research Foundation Building, Perth WA 6000, Australia
169 Center for Human Genetic Research and Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
170 Department of Functional Genomics, Center for Neurogenomics and Cognitive Research, Neuroscience Campus Amsterdam, VU University, Amsterdam 1081, The Netherlands
171 Department of Complex Trait Genetics, Neuroscience Campus Amsterdam, VU University Medical Center Amsterdam, Amsterdam 1081, The Netherlands
172 Department of Child and Adolescent Psychiatry, Erasmus University Medical Centre, Rotterdam 3000, The Netherlands
173 University of Aberdeen, Institute of Medical Sciences, Aberdeen, AB25 2ZD, UK
174 Department of Clinical Medicine, University of Copenhagen, Copenhagen 2200, Denmark
175 Department of Molecular Genetics and McLaughlin Centre, University of Toronto, Toronto, Ontario, Canada
176 Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
Abstract
Genomic copy number variants (CNVs) have been strongly implicated in the etiology of
schizophrenia (SCZ). However, apart from a small number of risk variants, elucidation of the
CNV contribution to risk has been difficult due to the rarity of risk alleles, all occurring in less
than 1% of cases. We sought to address this obstacle through a collaborative effort in which we
applied a centralized analysis pipeline to a SCZ cohort of 21,094 cases and 20,227 controls. We
observed a global enrichment of CNV burden in cases (OR=1.11, P=5.7e-15), which persisted after
excluding loci implicated in previous studies (OR=1.07, P=1.7e-6). CNV burden is also enriched for
genes associated with synaptic function (OR=1.68, P=2.8e-11) and neurobehavioral phenotypes
in mouse (OR=1.18, P=7.3e-5). We identified genome-wide significant support for eight loci,
including 1q21.1, 2p16.3 (NRXN1), 3q29, 7q11.2, 15q13.3, distal 16p11.2, proximal 16p11.2 and
22q11.2. We find support at a suggestive level for nine additional candidate susceptibility and
protective loci, which consist predominantly of CNVs mediated by non-allelic homologous
recombination (NAHR).
Introduction
Studies of genomic copy number variation (CNV) have established a role for rare genetic
variants in the etiology of SCZ [1]. There are three lines of evidence that CNVs contribute to risk
for SCZ: genome-wide enrichment of rare deletions and duplications in SCZ cases relative to
controls [2,3], a higher rate of de novo CNVs in cases relative to controls [4-6], and association
evidence implicating a small number of specific loci (Extended data table 1). All CNVs that have
been implicated in SCZ are rare in the population, but confer significant risk (odds ratios 2-60).
To date, CNVs associated with SCZ have largely emerged from mergers of summary data
for specific candidate loci [7-9]; yet even the largest genome-wide scans (sample sizes typically
<10,000) remain under-powered to robustly confirm genetic association for the majority of
pathogenic CNVs reported so far, particularly for those with low frequencies (<0.5% in cases) or
intermediate effect sizes (odds ratios 2-10). It is important to address the low power of
systematic CNV studies with larger samples given that this type of mutation has already proven
useful for highlighting some aspects of SCZ-related biology [6,10-13].
The limited statistical power provided by small samples is a significant obstacle in
studies of rare and common genetic variation. In response, global collaborations have been
formed in order to attain large sample sizes, as exemplified by the study of the Schizophrenia
Working Group of the Psychiatric Genomics Consortium (PGC) in which 108 independent
schizophrenia-associated loci were identified [14]. Recognizing the need for similarly large
samples in studies of CNVs for psychiatric disorders, we formed the PGC CNV Analysis Group.
Our goal was to enable large-scale analyses of CNVs in psychiatry using centralized and uniform
methodologies for CNV calling, quality control, and statistical analysis. Here, we report the
largest genome-wide analysis of CNVs for any psychiatric disorder to date, using datasets
assembled by the Schizophrenia Working Group of the PGC.
Data processing and meta-analytic methods
Raw intensity data were obtained from 57,577 subjects from 43 separate datasets
(Extended data table 2). After CNV calling and quality control (QC), 41,321 subjects were
retained for analysis. In large datasets derived from multiple studies, variability in CNV
detection between studies and array platforms presents a significant challenge. To minimize
the technical variability across different studies, we developed a centralized pipeline for
systematic calling of CNVs for Affymetrix and Illumina platforms (Methods and Extended data
figure 1). The pipeline included multiple CNV callers run in parallel. Data from Illumina
platforms were processed using PennCNV [15] and iPattern [16]. Data from Affymetrix platforms
were analyzed using PennCNV and Birdsuite [17]. Two additional methods, iPattern and C-score [18],
were applied to data from the Affymetrix 6.0 platform. The CNV calls from each program were
converted to a standardized format and a consensus call set was constructed by merging CNV
outputs at the sample level. Only CNV segments that were detected by all algorithms were
retained. We performed rigorous QC at the platform level to exclude samples with poor probe
intensity and/or an excessive CNV load (number and length). Larger CNVs that appeared to be
fragmented were merged and retained. CNVs spanning centromeres or those with >50%
overlap with segmental duplications or regions prone to VDJ recombination (e.g.,
immunoglobulin or T cell receptor loci) were excluded. A final set of rare, high quality CNVs was
defined as those >20 kb in length, spanning at least 10 probes, and with <1% MAF.
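
The consensus and filtering steps described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes that "detected by all algorithms" can be approximated by intersecting same-type segments across callers, and the `Cnv` record, function names, and example coordinates are hypothetical. Only the thresholds (>20 kb, at least 10 probes, <1% MAF, at most 50% segmental duplication overlap) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cnv:
    chrom: str
    start: int  # bp, inclusive
    end: int    # bp, exclusive
    kind: str   # "loss" or "gain"


def intersect_calls(a, b):
    """Segments present in both call lists (same chromosome and CNV type)."""
    shared = []
    for x in a:
        for y in b:
            if x.chrom == y.chrom and x.kind == y.kind:
                start, end = max(x.start, y.start), min(x.end, y.end)
                if start < end:
                    shared.append(Cnv(x.chrom, start, end, x.kind))
    return shared


def consensus(call_sets):
    """Keep only the segments that every caller reported for one sample."""
    result = list(call_sets[0])
    for calls in call_sets[1:]:
        result = intersect_calls(result, calls)
    return result


def passes_final_filter(cnv, n_probes, frequency, segdup_fraction):
    """Thresholds quoted in the text; the argument names are assumptions."""
    return (
        cnv.end - cnv.start > 20_000   # longer than 20 kb
        and n_probes >= 10             # at least 10 probes
        and frequency < 0.01           # rarer than 1%
        and segdup_fraction <= 0.5     # not >50% segmental duplication
    )


# Hypothetical calls for one sample from two callers.
caller_a = [Cnv("1", 100_000, 200_000, "loss")]
caller_b = [Cnv("1", 150_000, 260_000, "loss"), Cnv("2", 5_000, 9_000, "gain")]
shared = consensus([caller_a, caller_b])
print(shared)
```

Intersecting calls is the most conservative reading of a consensus; a production pipeline would also need the fragment-merging and sample-level QC steps, which are not modeled here.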
Genetic associations were investigated by case-control tests of CNV burden at four
levels: (1) genome-wide, (2) pathways, (3) genes, and (4) probes. Analyses controlled for SNP-derived
principal components, sex, genotyping platform, and individual-level probe intensity.
Multiple-testing thresholds for genome-wide significance were estimated from family-wise
error rates drawn from permutation.
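
One common way to derive a family-wise threshold from permutations is the minimum-p approach, sketched below. The text does not specify the exact procedure used, so this is an assumption for illustration: case/control labels are permuted, the smallest p-value across all tests is recorded for each permutation, and the threshold is set so that at most a fraction `alpha` of permutations produce any p-value at or below it. The function name and the toy p-values are hypothetical.

```python
import math


def fwer_threshold(permuted_pvalues, alpha=0.05):
    """permuted_pvalues holds one list of per-test p-values per permutation."""
    # Smallest p-value observed anywhere in each permuted data set.
    minima = sorted(min(pvals) for pvals in permuted_pvalues)
    # Number of permutations allowed to reach the threshold by chance.
    k = math.floor(alpha * len(minima) + 1e-9)
    if k < 1:
        raise ValueError("need at least 1/alpha permutations")
    return minima[k - 1]


# Toy input: 20 permutations, 3 tests each (values are made up).
permutations = [[0.001 * i, 0.5, 0.9] for i in range(1, 21)]
threshold = fwer_threshold(permutations)
print(threshold)
```

An observed p-value at or below `threshold` would then be declared genome-wide significant at the chosen family-wise error rate, ignoring ties among the permutation minima.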
Genome-wide analysis of CNV burden reveals an enrichment of ultra-rare variants
An elevated burden of rare CNVs has been well established among SCZ cases [2]. We
applied our meta-analytic framework to measure the consistency of overall CNV burden across
the genotyping platforms, and to test whether a measurable amount of CNV burden persists outside of
previously implicated CNV regions. Consistent with previous estimates, the overall CNV burden
is significantly greater among SCZ cases when measured as total Kb covered (OR=1.12, p=5.7e-15),
genes affected (OR=1.21, p=6.6e-21), or CNV number (OR=1.03, p=1e-3). Focusing on genes
affected by CNV, our strongest signal of enrichment, the effect size is consistent across all
genotyping platforms (Figure 1A). When we split by CNV type, the effect size for copy number
losses (OR=1.40, p=4e-16) is greater than for gains (OR=1.12, p=2e-7) (Extended data figures 2-3).
Partitioning by CNV frequency (based on 50% reciprocal overlap with the full call set,
Methods), CNV burden is enriched among cases across a range of frequencies, up to counts of
80 (MAF=0.2%) in the combined sample (Figure 1B).
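
The frequency partition relies on a 50% reciprocal overlap criterion, which can be sketched as below. The function names and the tuple layout `(chrom, start, end)` are assumptions for illustration. The frequency computed here is the count divided by the number of subjects, which matches the correspondence quoted in the text between a count of 80 and roughly 0.2% in 41,321 subjects.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end, threshold=0.5):
    """True if the shared span covers at least `threshold` of BOTH intervals."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return False
    return (overlap >= threshold * (a_end - a_start)
            and overlap >= threshold * (b_end - b_start))


def cnv_count(query, callset, threshold=0.5):
    """Number of calls in `callset` that match `query` by reciprocal overlap."""
    chrom, start, end = query
    return sum(
        1
        for c_chrom, c_start, c_end in callset
        if c_chrom == chrom
        and reciprocal_overlap(start, end, c_start, c_end, threshold)
    )


# Hypothetical call set: the first two calls match the query, the last two do not.
calls = [("1", 0, 100), ("1", 40, 120), ("1", 90, 400), ("2", 0, 100)]
count = cnv_count(("1", 0, 100), calls)
frequency = 80 / 41321  # count of 80 in the combined sample
print(count, round(frequency * 100, 2))
```

Requiring the overlap to hold in both directions prevents a small CNV nested inside a much larger one from being counted as the same event.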
A primary question in this study is the contribution of novel loci to the excess CNV
burden in cases. After removing nine previously implicated CNV loci (where reported p-values
exceed our designated multiple testing threshold, Extended data table 1), excess CNV burden
in SCZ remains significantly enriched (genes affected OR=1.11, p=1.3e-7, Figure 1B). CNV
burden also remained significantly enriched after removal of all reported loci from Extended
data table 1, but the effect size was greatly reduced (OR=1.08) compared to the enrichment
overall (OR=1.21). When we partition CNV burden by frequency, we find that much of the
previously unexplained signal is restricted to comparatively rare events (i.e., MAF<0.1%, Figure
1B).
Gene-set (pathway) burden
We assessed whether CNV burden was concentrated within defined sets of genes involved in
neurodevelopment or neurological function. A total of 36 gene-sets were evaluated (for a
description see Extended data table 3), consisting of gene-sets representing neuronal function,
synaptic components and neurological and neurodevelopmental phenotypes in human (19
sets), gene-sets based on brain expression patterns (7 sets), and human orthologs of mouse
genes whose disruption causes phenotypic abnormalities, including neurobehavioral and
nervous system abnormality (10 sets). Some gene-sets can be considered “negative controls”,
including genes not expressed in brain (1 set) or associated with abnormal phenotypes in
mouse organ systems unrelated to brain (7 sets). We mapped CNVs to genes if they overlapped
by at least one exonic bp.
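
The mapping rule (a CNV hits a gene if it overlaps at least one exonic base pair) can be sketched as follows. The gene names, coordinates, and the half-open interval convention are hypothetical and chosen only to illustrate the rule; a real analysis would read exon coordinates from a gene annotation file.

```python
def genes_hit(cnv, gene_exons):
    """Return the genes with at least one exonic bp inside the CNV.

    cnv is (chrom, start, end); gene_exons maps a gene name to a list of
    (chrom, start, end) exons. All intervals are half-open: [start, end).
    """
    chrom, start, end = cnv
    hit = set()
    for gene, exons in gene_exons.items():
        for ex_chrom, ex_start, ex_end in exons:
            shared_bp = min(end, ex_end) - max(start, ex_start)
            if ex_chrom == chrom and shared_bp >= 1:
                hit.add(gene)
                break  # one overlapping exon is enough for this gene
    return hit


# Hypothetical annotation with two genes on chromosome 2.
annotation = {
    "GENE_A": [("2", 1_000, 1_200), ("2", 5_000, 5_300)],
    "GENE_B": [("2", 9_000, 9_100)],
}
exonic = genes_hit(("2", 1_150, 4_000), annotation)    # clips the first exon
intronic = genes_hit(("2", 1_200, 5_000), annotation)  # lies between exons
print(exonic, intronic)
```

Under this rule a purely intronic CNV is not assigned to the gene, which is the behavior the second example call demonstrates.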
Gene-set burden was tested using a logistic regression deviance test [6]. In addition to using
the same covariates included in the genome-wide burden analysis, we controlled for the total
number of genes per subject spanned by rare CNVs to account for signal that merely reflects
the global enrichment of CNV burden in cases [19]. Multiple-testing correction (Benjamini-Hochberg
False Discovery Rate, BH-FDR) was performed separately for each gene-set group and
CNV type (gains, losses). After multiple test correction (Benjamini-Hochberg FDR ≤ 10%), 15
gene-sets were enriched for rare loss burden in cases and 4 for rare gains in cases, all of which
are brain-related gene sets (Figure 2).
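
The Benjamini-Hochberg adjustment used for the gene-set tests can be written in a few lines. The sketch below is a generic implementation of the standard step-up procedure, applied to made-up p-values; in the analysis described above it would be run separately for each gene-set group and CNV type.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (q-values) in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is never larger than the one ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four gene-sets in one group.
pvals = [0.001, 0.04, 0.03, 0.2]
qvals = benjamini_hochberg(pvals)
significant = [q <= 0.10 for q in qvals]  # FDR threshold of 10%, as in the text
print(qvals, significant)
```

With these inputs the first three gene-sets pass the 10% threshold and the fourth does not.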
Of the 15 sets significant for losses, the majority consist of synaptic or other neuronal
components (9 sets) from gene-set group (a); in particular, “GO synaptic” (GO:0045202) and
“ARC complex” rank first based on statistical significance and effect size respectively (“GO
synaptic” deviance test p-value = 2.8e-11, “ARC complex” regression odds-ratio > 1.8, Figure
2a). Losses in cases were also significantly enriched for genes involved in nervous system or
behavioral phenotypes in mouse but not for gene-sets related to other organ system
phenotypes (Figure 2c). To account for dependency between synaptic and neuronal gene-sets,
we re-tested loss burden following a step-down logistic regression approach, ranking gene-sets
based on significance or effect size (Extended data table 4). Only GO synaptic and ARC complex
were significant in at least one of the two step-down analyses, suggesting that burden
enrichment in the other neuronal categories is mostly accounted for by the overlap with synaptic
genes. Following the same approach, the mouse neurological/neurobehavioral phenotype set
remained nominally significant, pointing to the existence of additional signal not captured by
the synaptic set. Pathway enrichment was less pronounced for duplications, consistent with the
smaller burden effects for this class of CNV. Duplication burden was significantly enriched for
NMDA receptor complex, highly brain-expressed genes, medium/low brain-expressed genes
and prenatally expressed brain genes (Figure 2b).
Given that synaptic gene sets were robustly enriched for deletions in cases, and with an
appreciable contribution from loci that have not been strongly associated with SCZ previously,
pathway-level interactions of these sets were further investigated. A protein-interaction
network was seeded using the synaptic and ARC complex genes that were intersected by rare
deletions in this study (Figure 3). A graph of the network highlights multiple subnetworks of
synaptic proteins including pre-synaptic adhesion molecules (NRXN1, NRXN3), post-synaptic
scaffolding proteins (DLG1, DLG2, DLGAP1, SHANK1, SHANK2), glutamatergic ionotropic
receptors (GRID1, GRID2, GRIN1, GRIA4), and complexes such as Dystrophin and its synaptic
interacting proteins (DMD, DTNB, SNTB1, UTRN). A subsequent test of the Dystrophin
glycoprotein complex (DGC) revealed that deletion burden of the synaptic DGC proteins
(intersection of “GO DGC” GO:0016010 and “GO synapse” GO:0045202) was enriched in cases
(Deviance test P=0.05), but deletion burden of the full DGC was not significant (P=0.69).
Gene CNV burden
To define specific loci that confer risk for SCZ, we tested CNV burden at the level of individual
genes, using a logistic regression deviance test and the same covariates included in the genome-wide
burden analysis. To correctly account for large CNVs that affect multiple genes, we aggregated
adjacent genes into single loci if their copy number was highly correlated across subjects. CNVs
were mapped to genes if they overlapped one or more exons. The criterion for genome-wide
significance used the Family-Wise Error Rate (FWER) < 0.05. The criterion for suggestive
evidence used a Benjamini-Hochberg False Discovery Rate (BH-FDR) < 0.05.
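
A deviance (likelihood ratio) test for a single gene can be illustrated in closed form when the only predictor is binary carrier status. The sketch below makes that simplifying assumption and therefore omits the covariates that the study adjusted for (principal components, sex, platform, probe intensity); with covariates an iterative logistic regression fit would be needed. The function names and the counts are hypothetical.

```python
import math


def _loglik(cases, controls):
    """Binomial log-likelihood at the maximum-likelihood case proportion."""
    n = cases + controls
    ll = 0.0
    if cases:
        ll += cases * math.log(cases / n)
    if controls:
        ll += controls * math.log(controls / n)
    return ll


def deviance_test(carrier_cases, carrier_controls,
                  noncarrier_cases, noncarrier_controls):
    """Compare an intercept-only model with a model including carrier status."""
    ll_full = (_loglik(carrier_cases, carrier_controls)
               + _loglik(noncarrier_cases, noncarrier_controls))
    ll_null = _loglik(carrier_cases + noncarrier_cases,
                      carrier_controls + noncarrier_controls)
    deviance = max(0.0, 2.0 * (ll_full - ll_null))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    return deviance, p_value


def odds_ratio(carrier_cases, carrier_controls,
               noncarrier_cases, noncarrier_controls):
    return (carrier_cases * noncarrier_controls) / (carrier_controls * noncarrier_cases)


# Made-up counts: 30 of 1,000 cases and 10 of 1,000 controls carry a deletion.
deviance, p_value = deviance_test(30, 10, 970, 990)
effect = odds_ratio(30, 10, 970, 990)
print(deviance, p_value, effect)
```

For a saturated model with one binary predictor the fitted probabilities equal the observed case proportions in carriers and non-carriers, which is why no iterative fitting is needed in this reduced setting.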
Of nineteen independent CNV loci with gene-based BH-FDR < 0.05, two were excluded
based on CNV calling accuracy or evidence of a batch effect (Supplementary Information). The
seventeen loci that remained after these additional QC steps are listed in Table 1. P-values for
this summary table were obtained by re-running our statistical model across the entire region
(Supplementary Results). These seventeen loci represent a set of novel (n=6), previously
reported (n=4), and previously implicated (n=7) loci. Manhattan plots of the gene association
for losses and gains are provided in Figure 4. A permutation-based false discovery rate yielded
similar estimates to BH-FDR.
Eight loci attain genome-wide significance, including copy number losses at 1q21.1,
2p16.3 (NRXN1), 3q29, 15q13.3, 16p11.2 (distal) and 22q11.2 along with gains at 7q11.23 and
16p11.2 (proximal). An additional nine loci meet the criterion for suggestive association. Based on
our estimation of False Discovery Rates (BH and permutations), we expect to observe fewer than
two associations meeting suggestive criteria by chance.
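As a rough check on that expectation, a back-of-envelope sketch can be written down. It assumes the BH threshold q = 0.05 bounds the expected share of false discoveries among the R = 19 loci that initially passed it; the permutation-based estimate mentioned in the text is not reproduced here.

```latex
\mathbb{E}[\text{false discoveries}] \;\lesssim\; q \times R \;=\; 0.05 \times 19 \;=\; 0.95 \;<\; 2
```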
Probe level CNV burden
With our current sample size and uniform CNV calling, many individual CNV loci can be
tested with adequate power at the probe level, potentially facilitating discovery at a finer grain
than locus-wide tests. Tests for association were performed at each CNV breakpoint using the
residuals of case-control status after controlling for analysis covariates, with significance
determined through permutation. Results for losses and gains are shown in Extended data
figure 4. Four independent CNV loci surpass genome-wide significance, all of which were also
identified in the gene-based test, including the 15q13.2-13.3 and 22q11.21 deletions, 16p11.2
duplication, and 1q21.1 deletion and duplication. While these loci represent less than half of
the previously implicated SCZ loci, we do find support for all loci where the association
originally reported meets the criteria for genome-wide correction in this study. We examined
association among all previously reported loci showing association to SCZ, including 12 CNV
losses and 20 CNV gains (Extended data table 5), and 14 of the 33 loci were associated with SCZ
at p < 0.05.
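The residual-based permutation test described above can be sketched as follows. This is a minimal illustration on hypothetical toy data, assuming the test statistic is the summed, mean-centered residual among CNV carriers at a breakpoint. The study's actual statistic and its genome-wide correction are not specified in this section.

```python
import random


def permutation_pvalue(residuals, carrier, n_perm=2000, seed=0):
    """Two-sided empirical p-value for carriers having shifted residuals."""
    rng = random.Random(seed)
    mean_all = sum(residuals) / len(residuals)
    centered = [r - mean_all for r in residuals]
    n_carriers = sum(carrier)
    observed = sum(x for x, c in zip(centered, carrier) if c)
    pool = list(centered)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        # After shuffling, the first n_carriers values act as a random carrier set.
        if abs(sum(pool[:n_carriers])) >= abs(observed):
            hits += 1
    # The add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 192 non-carriers with balanced residuals and
# 8 carriers whose residuals all lean toward case status.
residuals = [0.5, -0.5] * 96 + [0.5] * 8
carrier = [0] * 192 + [1] * 8
p_value = permutation_pvalue(residuals, carrier)
```

Permuting the residual phenotype rather than raw case-control labels keeps the covariate adjustment fixed while breaking any link between carrier status and disease.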
When a probe-level test is applied, associations at some loci become better delineated.
For instance, the NRXN1 gene at 2p16.3 is a CNV hotspot, and exonic deletions of this gene are
significantly enriched in SCZ [9,20]. In this large sample, we observe a high density of
“non-recurrent” deletion breakpoints in cases and controls. The probe-level Manhattan plot reveals a
sawtooth pattern of association, where peaks correspond to transcriptional start sites and
exons of NRXN1 (Figure 5). This example highlights how, with high diversity of alleles at a single
locus, the association peak may become more refined, and in some cases converge toward
individual functional elements. Similarly, a high density of duplication breakpoints at previously
reported SCZ risk loci on 16p13.2 (http://bit.ly/1NPgIuq) and 8q11.23 (http://bit.ly/1PwdYTt)
exhibits patterns of association that better delineate genes in these regions.
[The above URLs link to a PGC CNV browser display of the respective genomic regions. The browser can also be accessed directly at the following URL: http://pgc.tcag.ca/gb2/gbrowse/pgc_hg18/]
Novel risk loci are predominantly NAHR-mediated CNVs
Many CNV loci that have been strongly implicated in human disease are hotspots for
non-allelic homologous recombination (NAHR), a process which in most cases is mediated by
flanking segmental duplications [21]. Consistent with the importance of NAHR in generating CNV
risk alleles for schizophrenia, most of the loci in Table 1 are flanked by segmental duplications.
After excluding loci that have been implicated in previous studies, we investigated whether
NAHR mutational mechanisms were also enriched among novel associated CNVs. We defined a
CNV as “NAHR” when both the start and end breakpoints are located within segmental
duplications. Across all loci with FDR < 0.05 in the gene-based burden test, NAHR-mediated CNVs
were significantly enriched, 6.03-fold (P = 0.008; Extended data figure 5), when compared to a
null distribution determined by randomizing the genomic positions of associated genes
(Supplemental Material). These results suggest that novel SCZ CNVs tend to occur in regions
prone to high rates of recurrent mutation.
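The enrichment comparison against a randomized null can be sketched in a simplified form. Here the genome is reduced to a hypothetical list of candidate loci flagged by whether both breakpoints fall in segmental duplications, and the null is built by drawing random sets of loci. The study instead randomized the genomic positions of associated genes, which this toy version does not reproduce.

```python
import random


def segdup_enrichment(observed_flags, genome_flags, n_draws=5000, seed=1):
    """Fold enrichment and one-sided empirical p-value against random loci."""
    rng = random.Random(seed)
    k = len(observed_flags)
    observed = sum(observed_flags)
    # Null distribution: NAHR-flagged count among k loci drawn at random.
    null_counts = [sum(rng.sample(genome_flags, k)) for _ in range(n_draws)]
    mean_null = sum(null_counts) / n_draws
    fold = observed / mean_null if mean_null > 0 else float("inf")
    as_extreme = sum(1 for count in null_counts if count >= observed)
    return fold, (as_extreme + 1) / (n_draws + 1)


# Hypothetical inputs: 10% of 1,000 candidate loci are flanked by segmental
# duplications, versus 6 of 10 associated loci.
genome_flags = [1] * 100 + [0] * 900
observed_flags = [1] * 6 + [0] * 4
fold, p_value = segdup_enrichment(observed_flags, genome_flags)
```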
Discussion
The present study of the PGC SCZ CNV dataset includes the majority of all microarray
data that has been generated in genetic studies of SCZ to date. In this, the best body of
evidence to date with which to evaluate CNV associations, we find definitive evidence for eight
loci and we find significant evidence for a contribution from novel CNVs conferring both risk
and protection. The complete results, including CNV calls and statistical evidence at the gene or
probe level, can be viewed using the PGC CNV browser (URLs). Our data suggest that the novel
risk loci that can be detected with current genotyping platforms lie at the ultra-rare end of the
frequency spectrum and still larger samples will be needed to identify them at convincing levels
of statistical evidence.
Collectively, the eight SCZ risk loci that surpass genome-wide significance are carried by
a small fraction (1.4%) of SCZ cases in the PGC sample. We estimate 0.85% of the variance in
SCZ liability is explained by carrying a CNV risk allele within these loci (Supplementary Results).
As a comparison, 3.4% of the variance in SCZ liability is explained by the 108 genome-wide
significant loci identified in the companion PGC GWAS analysis. Combined, the CNV and SNP
loci that have been identified to date explain a small proportion (<5%) of heritability.
The large dataset here provides an opportunity to evaluate the strength of evidence for
a variety of loci where an association with schizophrenia has been reported previously. Of 33
published findings from the recent literature, we find evidence for 14 loci (P < 0.05, Extended
data table 5); thus, nearly half of the existing candidate loci are supported by our data.
However, we also find a lack of evidence for many. A lack of strong evidence in this dataset
(which includes samples that overlap with many of the previous studies) may in some cases
simply reflect that statistical power is limited for very rare variants, even in large samples.
However, it is likely that some of these original findings represent spurious associations.
Indeed, the loci that are not supported by our data consist largely of loci for which the original
statistical evidence was modest (Extended data table 5). Thus, our results help to refine the list
of promising candidate CNVs. Continued efforts to evaluate the growing number of candidate
variants have considerable value for directing future research efforts focused on specific loci.
The novel associations meeting suggestive criteria in this study highlight strong candidate
loci that have not been previously implicated in SCZ. Two such associations are located on the X
chromosome in a region of Xq28 that is highly prone to recurrent rearrangements [22-24]
(Extended data figure 6). Gains at the distal Xq28 locus are enriched in cases in this study;
similar duplications have been reported in association with intellectual disability, while
reciprocal deletions of this region are associated with embryonic lethality in males [25].
Duplications at the proximal Xq28 locus, including a single gene MAGEA11, are enriched in
controls in this study, and to our knowledge have not been documented in other disorders.
We observed multiple “protective” CNVs that showed a suggestive enrichment in
controls, including duplications of 22q11.2, MAGEA11, and ZMYM5 along with deletions and
duplications of ZNF92. No protective effects were significant after genome-wide correction.
Moreover, a rare CNV that confers reduced risk for SCZ may not confer a general protection
from neurodevelopmental disorders. For example, microduplications of 22q11.2 appear to
confer protection from SCZ [26]; however, such duplications have been shown to increase risk for
developmental delay and a variety of congenital anomalies in pediatric clinical populations [27]. It
is probable that some of the undiscovered rare alleles in SCZ are variants that confer protection,
but larger sample sizes are needed to determine this unequivocally. If true, our estimates of the
excess CNV burden in cases may not fully account for the variation in SCZ liability that is explained
by rare CNVs.
Our results provide strong evidence that deletions in SCZ are enriched within a highly
connected network of synaptic proteins, consistent with previous studies [2,6,10,28]. The large CNV
dataset here allows a more detailed view of the synaptic network and highlights subsets of
genes that account for the excess deletion burden in SCZ, including synaptic cell adhesion and
scaffolding proteins, glutamatergic ionotropic receptors and protein complexes such as the ARC
complex and DGC. Modest CNV evidence implicating Dystrophin (DMD) and its binding partners
is intriguing given that the involvement of certain components of the DGC has been
postulated [29,30] and disputed [31] previously. Larger studies of CNV are needed to define a role
for this and other synaptic sub-networks in SCZ.
This study represents a milestone. Large-scale collaborations in psychiatric genetics
have greatly advanced discovery through genome-wide association studies. Here we have
extended this framework to rare CNVs. Our knowledge of the contribution from lower
frequency variants gives us confidence that the application of this framework to large newly
acquired datasets has the potential to further the discovery of loci and identification of the
relevant genes and functional elements. The PGC CNV resource is now publicly available
through a custom browser at http://pgc.tcag.ca/gb2/gbrowse/pgc_hg18/.
Author Contributions
Management of the study, core analyses and content of the manuscript was the responsibility of the CNV Analysis Group chaired by J.S. and jointly supervised by S.W.S. and B.M.N. together with the Schizophrenia Working Group chaired by M.C.O’D. Core analyses were carried out by D.P.H., D.M., and C.R.M. The data processing pipeline was implemented by C.R.M., B.T., W.W., D.G., M.G., A.S. and W.B. A custom PGC CNV browser was developed by C.R.M., D.P.H., and B.T. Additional analyses and interpretations were contributed by W.W., D.A. and P.A.H. The individual studies or consortia contributing to the CNV meta-analysis were led by R.A., O.A.A., D.H.R.B., A.D.B., E. Bramon, J.D.B., A.C., D.A.C., S.C., A.D., E. Domenici, H.E., T.E., P.V.G., M.G., H.G., C.M.H., N.I., A.V.J., E.G.J., K.S.K., G.K., J. Knight, T. Lencz, D.F.L., Q.S.L., J. Liu, A.K.M., S.A.M., A. McQuillin, J.L.M., P.B.M., B.J.M., M.M.N., M.C.O’D., R.A.O., M.J.O., A. Palotie, C.N.P., T.L.P., M.R., B.P.R., D.R., P.C.S., P. Sklar, D.St.C., P.F.S., D.R.W., J.R.W., J.T.R.W. and T.W. The remaining authors contributed to the recruitment, genotyping, or data processing for the contributing components of the meta-analysis. J.S., B.M.N., C.R.M., D.P.H., and D.M. drafted the manuscript, which was shaped by the management group. All other authors saw, had the opportunity to comment on, and approved the final draft.
Competing Financial Interest
Several of the authors are employees of the following pharmaceutical companies: F. Hoffmann-La Roche (E.D., L.E.), Eli Lilly (D.A.C., Y.M., L.N.) and Janssen (A.S., Q.S.L.). None of these companies influenced the design of the study, the interpretation of the data, or the amount of data reported, or financially profit by publication of the results, which are pre-competitive. The other authors declare no competing interests.
Acknowledgements
Core funding for the Psychiatric Genomics Consortium is from the US National Institute of Mental Health (U01 MH094421). We thank T. Lehner and Anjene Addington (NIMH). The work of the contributing groups was supported by numerous grants from governmental and charitable bodies as well as philanthropic donation. Details are provided in the Supplementary Notes. Membership of the Wellcome Trust Case Control Consortium and Psychosis Endophenotype International Consortium is provided in the Supplementary Notes.
URLs
PGC CNV browser, http://pgc.tcag.ca/gb2/gbrowse/pgc_hg18.
References
1. Malhotra, D. & Sebat, J. CNVs: harbingers of a rare variant revolution in psychiatric genetics. Cell 148, 1223-41 (2012).
2. Walsh, T. et al. Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539-43 (2008).
3. The International Schizophrenia Consortium. Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455, 237-241 (2008).
4. Malhotra, D. et al. High frequencies of de novo CNVs in bipolar disorder and schizophrenia. Neuron 72, 951-63 (2011).
5. Xu, B. et al. Strong association of de novo copy number mutations with sporadic schizophrenia. Nat Genet 40, 880-5 (2008).
6. Kirov, G. et al. De novo CNV analysis implicates specific abnormalities of postsynaptic signalling complexes in the pathogenesis of schizophrenia. Mol Psychiatry 17, 142-53 (2012).
7. McCarthy, S.E. et al. Microduplications of 16p11.2 are associated with schizophrenia. Nat Genet 41, 1223-7 (2009).
8. Mulle, J.G. et al. Microdeletions of 3q29 confer high risk for schizophrenia. Am J Hum Genet 87, 229-36 (2010).
9. Rujescu, D. et al. Disruption of the neurexin 1 gene is associated with schizophrenia. Hum Mol Genet (2008).
10. Pocklington, A.J. et al. Novel Findings from CNVs Implicate Inhibitory and Excitatory Signaling Complexes in Schizophrenia. Neuron 86, 1203-14 (2015).
11. Horev, G. et al. Dosage-dependent phenotypes in models of 16p11.2 lesions found in autism. Proc Natl Acad Sci USA 108, 17076-81 (2011).
12. Golzio, C. et al. KCTD13 is a major driver of mirrored neuroanatomical phenotypes of the 16p11.2 copy number variant. Nature 485, 363-7 (2012).
13. Holmes, A.J. et al. Individual differences in amygdala-medial prefrontal anatomy link negative affect, impaired social functioning, and polygenic depression risk. J Neurosci 32, 18087-100 (2012).
14. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421-7 (2014).
15. Wang, K. et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res 17, 1665-74 (2007).
16. Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466, 368-72 (2010).
17. Korn, J.M. et al. Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet 40, 1253-1260 (2008).
18. Vacic, V. et al. Duplications of the neuropeptide receptor gene VIPR2 confer significant risk for schizophrenia. Nature 471, 499-503 (2011).
19. Raychaudhuri, S. et al. Accurately assessing the risk of schizophrenia conferred by rare copy-number variation affecting genes with brain function. PLoS Genet 6 (2010).
20. Kirov, G. et al. Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia. Hum Mol Genet 17, 458-65 (2008).
21. Lupski, J.R. Genomic disorders: structural features of the genome can lead to DNA rearrangements and human disease traits. Trends Genet 14, 417-22 (1998).
22. Calhoun, A.R. & Raymond, G.V. Distal Xq28 microdeletions: clarification of the spectrum of contiguous gene deletions involving ABCD1, BCAP31, and SLC6A8 with a new case and review of the literature. Am J Med Genet A 164A, 2613-7 (2014).
23. El-Hattab, A.W. et al. Clinical characterization of int22h1/int22h2-mediated Xq28 duplication/deletion: new cases and literature review. BMC Med Genet 16, 12 (2015).
24. Ravn, K. et al. Large genomic rearrangements in MECP2. Hum Mutat 25, 324 (2005).
25. El-Hattab, A.W. et al. Int22h-1/int22h-2-mediated Xq28 rearrangements: intellectual disability associated with duplications and in utero male lethality with deletions. J Med Genet 48, 840-50 (2011).
26. Rees, E. et al. Evidence that duplications of 22q11.2 protect against schizophrenia. Mol Psychiatry 19, 37-40 (2014).
27. Van Campenhout, S. et al. Microduplication 22q11.2: a description of the clinical, developmental and behavioral characteristics during childhood. Genet Couns 23, 135-48 (2012).
28. Fromer, M. et al. De novo mutations in schizophrenia implicate synaptic networks. Nature 506, 179-84 (2014).
29. Zatz, M. et al. Cosegregation of schizophrenia with Becker muscular dystrophy: susceptibility locus for schizophrenia at Xp21 or an effect of the dystrophin gene in the brain? J Med Genet 30, 131-4 (1993).
30. Straub, R.E. et al. Genetic variation in the 6p22.3 gene DTNBP1, the human ortholog of the mouse dysbindin gene, is associated with schizophrenia. Am J Hum Genet 71, 337-48 (2002).
31. Mutsuddi, M. et al. Analysis of high-resolution HapMap of DTNBP1 (Dysbindin) suggests no consistency between reported common variant associations and schizophrenia. Am J Hum Genet 79, 903-9 (2006).
32. Zuberi, K. et al. GeneMANIA prediction server 2013 update. Nucleic Acids Res 41, W115-22 (2013).
CHR | BP1 | BP2 | Locus (gene) | Putative CNV mechanism | CNV test | Direction | FWER | BH-FDR | Cases | Controls | Regional p-value | Odds ratio [95% CI]
22 | 17,400,000 | 19,750,000 | 22q11.21 | NAHR | loss | risk | yes | 3.54E-15 | 64 | 1 | 5.70E-18 | 67.7 [9.3-492.8]
16 | 29,560,000 | 30,110,000 | 16p11.2 (proximal) | NAHR | gain | risk | yes | 5.82E-10 | 70 | 7 | 2.52E-12 | 9.4 [4.2-20.9]
2 | 50,000,992 | 51,113,178 | 2p16.3 (NRXN1) | NHEJ | loss | risk | yes | 3.52E-07 | 35 | 3 | 4.92E-09 | 14.4 [4.2-46.9]
15 | 28,920,000 | 30,270,000 | 15q13.3 | NAHR | loss | risk | yes | 2.22E-05 | 28 | 2 | 2.13E-07 | 15.6 [3.7-66.5]
1 | 144,646,000 | 146,176,000 | 1q21.1 | NAHR | loss+gain | risk | yes | 0.00011 | 60 | 14 | 1.50E-06 | 3.8 [2.1-6.9]
3 | 197,230,000 | 198,840,000 | 3q29 | NAHR | loss | risk | yes | 0.00024 | 16 | 0 | 1.86E-06 | NA [0-Inf]
16 | 28,730,000 | 28,960,000 | 16p11.2 (distal) | NAHR | loss | risk | yes | 0.0029 | 11 | 1 | 5.52E-05 | 20.6 [2.6-162.2]
7 | 72,380,000 | 73,780,000 | 7q11.23 | NAHR | gain | risk | yes | 0.0048 | 16 | 1 | 1.68E-04 | 16.1 [3.1-125.7]
X | 153,800,000 | 154,225,000 | Xq28 (distal) | NAHR | gain | risk | no | 0.049 | 18 | 2 | 3.61E-04 | 8.9 [2.0-39.9]
22 | 17,400,000 | 19,750,000 | 22q11.21 | NAHR | gain | protective | no | 0.024 | 3 | 16 | 4.54E-04 | 0.15 [0.04-0.52]
7 | 64,476,203 | 64,503,433 | 7q11.21 (ZNF92) | NAHR | loss+gain | protective | no | 0.033 | 131 | 180 | 6.71E-04 | 0.66 [0.52-0.84]
13 | 19,309,593 | 19,335,773 | 13q12.11 (ZMYM5) | NAHR | gain | protective | no | 0.024 | 15 | 38 | 7.91E-04 | 0.36 [0.19-0.67]
X | 148,575,477 | 148,580,720 | Xq28 (MAGEA11) | NAHR | gain | protective | no | 0.044 | 12 | 36 | 1.06E-03 | 0.35 [0.18-0.68]
15 | 20,350,000 | 20,640,000 | 15q11.2 | NAHR | loss | risk | no | 0.044 | 98 | 50 | 1.34E-03 | 1.8 [1.2-2.6]
9 | 831,690 | 959,090 | 9p24.3 (DMRT1) | NHEJ | loss+gain | risk | no | 0.049 | 13 | 1 | 1.35E-03 | 12.4 [1.6-98.1]
8 | 100,094,670 | 100,958,984 | 8q22.2 (VPS13B) | NHEJ | loss | risk | no | 0.048 | 7 | 1 | 1.74E-03 | 14.5 [1.7-122.2]
7 | 158,145,959 | 158,664,998 | 7q36.3 (VIPR2/WDR60) | NAHR | loss+gain | risk | no | 0.046 | 20 | 6 | 5.79E-03 | 3.5 [1.3-9.0]
Table 1: Significant CNV loci from gene-based association test
All seventeen loci listed contain at least one gene with Benjamini-Hochberg false discovery rate (BH-FDR) < 0.05 in the gene-based
test, with eight loci containing at least one gene surpassing the family-wise error rate (FWER) < 0.05. Genomic positions listed are
using hg18 coordinates. For putative CNV mechanisms, non-allelic homologous recombination (NAHR) and non-homologous end
joining (NHEJ) are listed as the likely genomic feature driving CNV formation at each locus. Regional p-values and odds ratios listed
are from a regional test at each locus, where we combine CNV overlapping the implicated region and run the same test as used for
each gene (logistic regression with covariates and deviance test p-value).
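The deviance test named in the legend can be sketched for the simplest possible case. This assumes a logistic regression with a single binary predictor (CNV carrier or not) and no covariates, where the fitted probabilities are just the observed case fractions and the deviance reduces to a likelihood-ratio statistic with one degree of freedom. Because the study's model includes covariates, the numbers below are not expected to match the regional p-values in Table 1.

```python
import math


def xlogy(count, prob):
    """count * log(prob), with the convention that an empty cell contributes 0."""
    return count * math.log(prob) if count > 0 else 0.0


def deviance_test(carrier_cases, carrier_controls, n_cases, n_controls):
    """Unadjusted deviance (likelihood-ratio) test of carrier status vs. case status."""
    a, b = carrier_cases, carrier_controls
    c, d = n_cases - a, n_controls - b
    n = n_cases + n_controls
    # Null model: one case probability shared by carriers and non-carriers.
    ll_null = xlogy(n_cases, n_cases / n) + xlogy(n_controls, n_controls / n)
    # Full model: separate case probabilities for carriers and non-carriers.
    ll_full = (xlogy(a, a / (a + b)) + xlogy(b, b / (a + b))
               + xlogy(c, c / (c + d)) + xlogy(d, d / (c + d)))
    deviance = 2.0 * (ll_full - ll_null)
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(deviance, 0.0) / 2.0))
    return deviance, p_value


# Carrier counts for the 15q11.2 loss row of Table 1, with the full-sample
# totals of 21,094 cases and 20,227 controls.
deviance, p_value = deviance_test(98, 50, 21094, 20227)
```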
[Figure 1 image. Panel A: forest plot by genotyping platform (Affymetrix and Illumina arrays); x-axis: Odds Ratio (95% CI). Panel B: odds ratio (95% CI) against CNV count category (singleton, 2−5, 6−10, 11−20, 21−40, 41−80, 81+); series: all CNV, previously implicated CNV removed. Values listed in panel A:]
Platform | Cases | Controls | Genes | pval | OR
PGC | 21094 | 20227 | 2.2 | 6.6e−21 | 1.21
omni | 9795 | 8422 | 2.1 | 5.8e−11 | 1.27
A6.0 | 7027 | 7980 | 2.4 | 1.6e−05 | 1.14
I550 | 1572 | 1705 | 1.9 | 0.004 | 1.31
A5.0 | 1130 | 1001 | 1.7 | 3.6e−06 | 1.47
I600 | 818 | 623 | 1.3 | 0.009 | 1.47
I300 | 410 | 285 | 2 | 0.461 | 1.16
A500 | 314 | 201 | 1.7 | 0.039 | 1.75
Figure 1. CNV Burden
(A) Forest plot of CNV burden (measured here as genes affected by CNV), partitioned by
genotyping platform, with the full PGC sample at the bottom. CNV burden is calculated by
combining CNV gains and losses. Case and control counts are listed, and “genes” is the rate of
genes affected by CNV in controls. Burden tests use a logistic regression model predicting SCZ
case/control status by CNV burden along with covariates (see methods). The odds ratio is the
exponential of the logistic regression coefficient, and odds ratios above one predict increased
SCZ risk. (B) CNV burden partitioned by CNV frequency. For reference, a CNV with MAF 0.1% in
the PGC sample would have ~41 CNVs. Using the same model as above, each CNV was placed
into a single CNV frequency category based on a 50% reciprocal overlap with other CNVs. CNV
burden with inclusion of all CNVs is shown in green, whereas CNV burden excluding previously
implicated CNV loci is shown in blue.
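The link between a logistic regression coefficient and an odds ratio can be made concrete with a small sketch. It assumes a single binary predictor and no covariates, in which case the fitted coefficient equals the log of the crude 2x2 odds ratio and its Wald standard error is the square root of the summed reciprocal cell counts. The burden models in the figure use a continuous burden measure and covariates, so this is an illustration of the relationship only.

```python
import math


def odds_ratio_wald(carrier_cases, carrier_controls, n_cases, n_controls, z=1.96):
    """Crude odds ratio with a Wald 95% confidence interval."""
    a, b = carrier_cases, carrier_controls
    c, d = n_cases - a, n_controls - b
    # For one binary predictor, this is the logistic regression coefficient.
    beta = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Carrier counts for the proximal 16p11.2 gain row of Table 1, with the
# full-sample totals of 21,094 cases and 20,227 controls.
odds_ratio, ci_low, ci_high = odds_ratio_wald(70, 7, 21094, 20227)
```

The crude estimate lands close to the covariate-adjusted value reported for that locus, which is what one would expect when the covariates are only weakly related to carrier status.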
Figure 2: Gene-set Burden
Gene-set burden test results for rare losses (a, c) and gains (b, d); frames a-b display gene-sets
for neuronal function, synaptic components, neurological and neurodevelopmental phenotypes
in human; frames c-d display gene-sets for human homologs of mouse genes implicated in
abnormal phenotypes (organized by organ systems); both are sorted by –log10 of the logistic
regression deviance test p-value multiplied by the beta coefficient sign, obtained for rare losses
when including known loci. Gene-sets passing the 10% BH-FDR threshold are marked with “*”.
Gene-sets representing brain expression patterns were omitted from the figure because only a
few were significant (losses: 1, gains: 3).
[Figure 2 image. Panels: LOSS and GAIN for Neurof+Synaptic gene-sets; LOSS and GAIN for Mouse Phenotypes gene-sets. X-axis in every panel: Significance, -Log (Dev P-value) * sign (Coeff), from -2 to 10.
Neurof+Synaptic gene-sets (number of genes): Synaptic GO (622); Neurof. Str. (1424); Neuron Proj. GO (1230); ARC Kirov (28); FMR1 Targ. Darnell (840); Neurof. Incl. (2874); Nerv. Transm. GO (716); Nerv. Sys. Dev. GO (1874); FMR1 Targ. Ascano (927); PSD Bayes full (1407); Axon Guid. Pathw. (388); Synapt. Pathw. KEGG (407); Mind Phen. ADX (153); Nerv. Sys. Phen. Any (1590); NMDAR Kirov (62); CNS Dev. GO (774); Nerv. Sys. Phen. ADX (651); Neuron Body GO (309); Mind Phen. Any (439).
Mouse Phenotypes gene-sets (number of genes): Neurol. Behav. (2123); Neuro Union (3202); Nerv. Sys. (2375); Endocr. Exocr. Repr. (2026); Integ. Adip. Pigm. (1624); Hemat. Immune (2605); Digest. Hepat. (1493); Cardiov. Muscle (2059); Sensory (1293); Skel. Cran. Limbs (1588).]
Figure 3: Protein Interaction Network for Synaptic Genes
Synaptic and ARC-complex genes intersected by a rare loss in at least 4 case or control subjects
and with genic burden Benjamini-Hochberg FDR <= 25% (red discs) were used to query
GeneMANIA [32] and retrieve additional protein interaction neighbors, resulting in a network of
136 synaptic genes. Genes are depicted as disks; disk centers are colored based on rare loss
frequency (Freq.SZ and Freq.CT) being prevalent in cases or controls; disk borders are colored
to mark (i) gene implication in human dominant or X-linked neurological or
neurodevelopmental phenotype, (ii) de novo mutation (DeN) reported by Fromer et al. [28], split
between LOF (frameshift, stop-gain, core splice site) and missense or amino acid
insertion/deletion, (iii) implication in mouse neurobehavioral abnormality. Pre-synaptic adhesion
molecules (NRXN1, NRXN3), post-synaptic scaffolds (DLG1, DLG2, DLGAP1, SHANK1, SHANK2)
and glutamatergic ionotropic receptors (GRID1, GRID2, GRIN1, GRIA4) constitute a highly
connected subnetwork with more losses in cases than controls.
[Figure 3 image: network graph with nodes labeled by gene symbol.
Legend, disk centers (gene association test deviance p-value, loss): BH-FDR <= 25%, Freq.SZ > Freq.CT; BH-FDR > 25%, Freq.SZ >= 1.5 * Freq.CT; Other; Freq.CT >= 1.5 * Freq.SZ.
Legend, disk borders: Dominant / X-linked, or DeN LOF; DeN Missense or aa in/del; No evidence; Mouse Neurophenotype.]
Figure 4: Gene Based Manhattan Plot.
Manhattan plot displaying the –log10 deviance p-value for (A) CNV losses and (B) CNV gains in the
gene-based test. P-value cutoffs corresponding to FWER < 0.05 and BH-FDR < 0.05 are
highlighted in red and blue, respectively. Loci significant after multiple test correction are
labeled.
[Figure 4 image. Both panels: x-axis Chromosome (1-22, X); y-axis −log10(p).
Panel A, CNV losses: FWER cutoff = 1.33e−4; FDR cutoff = 0.0025; labeled loci: 1q21.1, 2p16.3, 3q29, 8q22.2, 10q11.12, 15q13.3, 15q11.2, 16p11.2 (distal), 22q11.2.
Panel B, CNV gains: FWER cutoff = 4.33e−5; FDR cutoff = 0.001; labeled loci: 1q21.1, 7q11.23, 13q12.11, 16p11.2, 22q11.2, Xq28 (2 sites).]
Figure 5: Manhattan plot of probe-level associations across the Neurexin-1 locus
Empirical p-values at each deletion breakpoint reveal a saw-tooth pattern of association. Predominant peaks correspond to exons
and transcriptional start sites of NRXN1 isoforms.
[Figure 5 image, browser tracks: DEL resid-Z pval; UCSC Genes (RefSeq, GenBank, tRNAs & Comparative Genomics); Deletions, cases (PLINK CNV track); Deletions, controls (PLINK CNV track).]
Locus | CNV type | Gene or region name | Initial SCZ association reference (see legend) | Tested in Rees et al. 2014 | Reported p-value | SCZ CNV carrier % | Control CNV carrier % | Reported Odds Ratio
22q11.2 | deletion | multigenic | 1 | yes | 4.40E-40 | 0.29 | 0 | Inf
16p11.2 | duplication | proximal duplication | 2 | yes | 2.90E-24 | 0.35 | 0.03 | 11.52
1q21.1 | deletion | multigenic | 3,4 | yes | 4.10E-13 | 0.17 | 0.021 | 8.35
2p16.3 | deletion | NRXN1 exons | 5,6 | yes | 1.30E-11 | 0.18 | 0.02 | 9.01
15q11.2 | deletion | multigenic | 3 | yes | 2.50E-10 | 0.59 | 0.28 | 2.15
3q29 | deletion | multigenic | 7,11 | yes | 1.50E-09 | 0.082 | 0.0014 | 57.65
15q13.2-13.3 | deletion | multigenic | 3,4 | yes | 5.60E-06 | 0.14 | 0.019 | 7.52
15q11.2-13.1 | duplication | AS/PWS | 8 | yes | 5.60E-06 | 0.083 | 0.0063 | 13.2
8q11.23 | duplication | RB1CC1 | 9 | no | 1.29E-05 | 0.106 | 0.014 | 8.58
16p13.11 | duplication | multigenic | 8 | yes | 5.70E-05 | 0.31 | 0.13 | 2.3
7q11.23 | duplication | Williams-Beuren | 10 | yes | 6.90E-05 | 0.066 | 0.0058 | 11.35
1q21.1 | duplication | multigenic | 11 | yes | 9.90E-05 | 0.13 | 0.037 | 3.45
16p13.2 | duplication | C16orf72/USP7 | 11 | no | 1.00E-04 | 0.254 | 0.0197 | 12.9
1p36.33 | duplication | multigenic | 12 | no | 5.00E-04 | 0.065 | 0.0075 | 8.66
22q11.2 | duplication | multigenic | 13 | no | 8.60E-04 | 0.014 | 0.085 | 0.17
17p12 | deletion | HNPP | 14 | yes | 1.20E-03 | 0.094 | 0.026 | 3.62
9q34.3 | duplication | intergenic | 15 | no | 1.40E-03 | 1.47 | 0.43 | 3.38
16p12.1 | deletion | multigenic | 12 | no | 1.60E-03 | 0.15 | 0.057 | 2.72
15q21.3 | duplication | CGNL1 | 12 | no | 1.90E-03 | 0.32 | 0.19 | 1.71
11q25 | deletion | GLB1L3/GLB1L2 | 11 | no | 3.00E-03 | 0.38 | 0.123 | 3
2q37.3 | duplication | AQP12A/KIF1A | 12 | no | 3.00E-03 | 0.34 | 0.24 | 1.43
17q12 | deletion | RCAD | 16 | yes | 0.0072 | 0.036 | 0.0054 | 6.64
9p24.2 | deletion | GLIS3 | 12 | no | 8.40E-03 | 0.033 | 0 | Inf
9p24.2 | deletion | SLC1A1 | 12 | no | 9.80E-03 | 0.047 | 0.0075 | 6.19
16p11.2 | deletion | distal deletion | 17 | yes | 0.017 | 0.063 | 0.018 | 3.39
7q36.3 | duplication | WDR60/VIPR2 | 11,18 | yes | 0.27 | 0.11 | 0.069 | 1.54
Extended Data Table 1: Previously reported CNV association
We assembled a list of 26 reported CNV associations to SCZ, where an odds ratio and p-value were available. At each CNV locus, we list the odds ratios and p-values from the largest sample collection available in the literature. Results from all CNV loci meta-analyzed
in Rees et al. (2014), when available, were used. Throughout this article we refer to this entire list as “previously reported”
loci. Reported p-values for nine loci shown in bold surpass the multiple testing threshold drawn from the current dataset using a
Cochran-Mantel-Haenszel test stratified by genotyping platform. Throughout this article we refer to these nine loci as “previously
implicated”.
Extended Data Tables 2-4 are separate .xlsx sheets available upon request.
EDTable2datasets.xlsx – datasets and sample sizes used in the current study
EDTable3NeuroGeneSsets.xlsx – Gene sets investigated in the current study
EDTable4stepdown_withLegend.xlsx – Unique contribution of significant gene sets in step-down regression models
Locus | CNV type | Gene or region | Reference (see ED table 1) | Z-score p-value | CMH p-value | CNV carriers (N=41321) | SCZ case count (N=21094) | Control count (N=20227) | CMH test Odds Ratio
22q11.21 | Deletion | Multigenic | 1 | 6.20E-13 | 4.40E-13 | 58 | 58 | 0 | NA
16p11.2 | Duplication | Proximal duplication | 2 | 2.60E-10 | 9.40E-12 | 67 | 63 | 4 | 13.8
15q13.2-13.3 | Deletion | Multigenic | 3,4 | 6.10E-06 | 2.50E-06 | 33 | 30 | 3 | 10.55
3q29 | Deletion | Multigenic | 7 | 6.20E-05 | 1.60E-04 | 16 | 16 | 0 | NA
2p16.3 | Deletion | NRXN1 | 5,6 | 9.40E-05 | 3.00E-04 | 27 | 23 | 4 | 5.87
16p11.2 | Deletion | Distal deletion | 17 | 1.00E-04 | 5.20E-03 | 12 | 11 | 1 | 12.68
22q11.21 | Duplication | Multigenic | 12 | 1.60E-04 | 6.30E-04 | 27 | 4 | 23 | 0.18
1q21.1 | Deletion | Multigenic | 3,4 | 2.90E-04 | 2.70E-05 | 39 | 33 | 6 | 5.42
16p13.2 | Duplication | C16orf72/USP7 | 11 | 3.80E-04 | 1.10E-04 | 27 | 24 | 3 | 9.02
7q11.23 | Duplication | Williams-Beuren | 10 | 4.90E-04 | 1.70E-03 | 13 | 13 | 0 | NA
15q11.2-13.1 | Duplication | AS/PWS | 8 | 6.60E-04 | 4.40E-04 | 15 | 15 | 0 | NA
8q11.23 | Duplication | FAM150/RB1CC1 | 9 | 9.20E-04 | 5.10E-04 | 14 | 14 | 0 | NA
15q11.2 | Deletion | Multigenic | 3 | 1.70E-03 | 1.30E-03 | 142 | 95 | 47 | 1.8
1q21.1 | Duplication | Multigenic | 11 | 2.00E-03 | 0.02 | 21 | 19 | 2 | 6.28
16q22.1 | Duplication | WWP2 | 19 | 0.003 | 0.08 | 5 | 5 | 0 | NA
7q36.3 | Duplication | WDR60/VIPR2 | 11,18 | 4.10E-03 | 2.40E-03 | 14 | 13 | 1 | 12.12
17q12 | Duplication | RCAD duplication | 19 | 0.009 | 0.02 | 20 | 16 | 4 | 3.81
9q33.1 | Deletion | NA | 19 | 0.02 | 0.09 | 11 | 9 | 2 | 4.02
22q11.23 | Duplication | Multigenic | 19 | 0.02 | 0.03 | 20 | 15 | 5 | 3.28
5q21.2 | Deletion | NA | 19 | 0.03 | 0.07 | 32 | 22 | 10 | 2.16
8p22 | Duplication | SGCZ | 19 | 0.03 | 0.08 | 5 | 5 | 0 | NA
9p24.2 | Deletion | SLC1A1 | 12 | 0.03 | 0.02 | 8 | 8 | 0 | NA
16p12.1 | Deletion | Multigenic | 12 | 0.03 | 0.006 | 33 | 26 | 7 | 3.22
15q21.3 | Duplication | CGNL1 | 12 | 0.04 | 1.30E-03 | 103 | 69 | 34 | 1.99
17q12 | Deletion | RCAD | 16 | 0.04 | 0.13 | 4 | 4 | 0 | NA
16p13.11 | Del/Dup | Multigenic | 8 | 0.08 | 0.03 | 139 | 84 | 55 | 1.49
7q11.21 | Duplication | NA | 19 | 0.09 | 0.35 | 64 | 26 | 38 | 0.76
12q23.1 | Duplication | ANKS1B/UHRF1BP1L | 19 | 0.1 | 0.73 | 28 | 16 | 12 | 1.23
1p36.33 | Duplication | Multigenic | 12 | 0.11 | 0.06 | 15 | 12 | 3 | 3.98
5q33.1 | Deletion | NA | 12 | 0.11 | 0.1 | 11 | 9 | 2 | 4.19
9q21.33 | Duplication | AGTPBP1 | 11 | 0.2 | 0.21 | 22 | 15 | 7 | 1.94
9q34.3 | Duplication | C9orf62 | 15 | 0.23 | 0.03 | 409 | 190 | 219 | 0.8
6q24.2 | Duplication | PHACTR2 | 12 | 0.26 | 0.15 | 10 | 8 | 2 | 4.03
3q26.1 | Deletion | NA | 11 | 0.27 | 0.43 | 5 | 4 | 1 | 3.5
4q35.2 | Deletion | TRIML1/TRIML2 | 12 | 0.35 | 0.21 | 26 | 17 | 9 | 1.82
18q21.31 | Duplication | NEDD4L | 11 | 0.39 | 0.7 | 2 | 2 | 0 | NA
11q25 | Deletion | GLB1L3/GLB1L2 | 11 | 0.42 | 0.22 | 58 | 34 | 24 | 1.44
9p24.2 | Deletion | GLIS3 | 12 | 0.43 | 0.99 | 10 | 5 | 5 | 0.99
18q23 | Duplication | GALR1 | 12 | 0.57 | 0.81 | 7 | 4 | 3 | 1.22
4q35.2 | Duplication | FAM149A/CYP4V2 | 12 | 0.69 | 0.77 | 12 | 5 | 7 | 0.71
2q37.2 | Duplication | AQP12A/KIF1A | 12 | 0.72 | 0.14 | 125 | 72 | 53 | 1.34
17p12 | Deletion | HNPP | 14 | 0.82 | 0.89 | 22 | 12 | 10 | 1.06
4q25 | Duplication | ELOVL6 | 12 | 0.9 | 0.99 | 13 | 7 | 6 | 1
10q11.21 | Duplication | Likely common CNV | 19 | NA | NA | NA | NA | NA | NA
Extended data Table 5: CNV probe-level results – Previously reported CNVs
Probe-level association results for all previously reported CNVs from genome-wide scans of SCZ. We report association results from
the SCZ residual phenotype and from a CMH test stratified by genotyping platform. CNV loci in bold make up previously implicated
loci, in which the most recent published p-value surpassed genome-wide correction.
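A Cochran-Mantel-Haenszel (CMH) analysis stratified by platform can be sketched as below. This is a minimal textbook version with a continuity correction. The per-platform sample sizes are taken from the forest plot in Figure 1, but the carrier counts are hypothetical, since per-platform carrier counts are not given in this table.

```python
import math


def cmh(strata):
    """Pooled Mantel-Haenszel odds ratio and CMH test.

    strata: list of (carrier_cases, carrier_controls, n_cases, n_controls).
    """
    num = den = 0.0
    sum_a = sum_expected = sum_var = 0.0
    for a, b, n_cases, n_controls in strata:
        c, d = n_cases - a, n_controls - b
        n = n_cases + n_controls
        num += a * d / n
        den += b * c / n
        sum_a += a
        # Mean and variance of the carrier-case count under no association.
        sum_expected += (a + b) * n_cases / n
        sum_var += (a + b) * (c + d) * n_cases * n_controls / (n * n * (n - 1))
    odds_ratio = num / den if den > 0 else float("inf")
    chi2 = (abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


# Three platform strata (omni, A6.0, I550) with hypothetical carrier counts.
strata = [(12, 2, 9795, 8422), (9, 1, 7027, 7980), (3, 1, 1572, 1705)]
odds_ratio, chi2, p_value = cmh(strata)
```

Note that the pooled odds ratio is undefined when no controls carry the CNV in any stratum, which is consistent with the NA entries in the odds ratio column for loci with a control count of zero.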
Extended data table references
1. Shprintzen, R.J., Goldberg, R., Golding-Kushner, K.J. & Marion, R.W. Late-onset psychosis in the velo-cardio-facial syndrome. Am J Med Genet 42, 141-2 (1992).
2. McCarthy, S.E. et al. Microduplications of 16p11.2 are associated with schizophrenia. Nat Genet 41, 1223-7 (2009).
3. Stefansson, H. et al. Large recurrent microdeletions associated with schizophrenia. Nature (2008).
4. The International Schizophrenia Consortium. Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455, 237-241 (2008).
5. Kirov, G. et al. Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia. Hum Mol Genet 17, 458-65 (2008).
6. Rujescu, D. et al. Disruption of the neurexin 1 gene is associated with schizophrenia. Hum Mol Genet (2008).
7. Mulle, J.G. et al. Microdeletions of 3q29 confer high risk for schizophrenia. Am J Hum Genet 87, 229-36 (2010).
8. Ingason, A. et al. Maternally derived microduplications at 15q11-q13: implication of imprinted genes in psychotic illness. Am J Psychiatry 168, 408-17 (2011).
9. Degenhardt, F. et al. Duplications in RB1CC1 are associated with schizophrenia; identification in large European sample sets. Transl Psychiatry 3, e326 (2013).
10. Mulle, J.G. et al. Reciprocal duplication of the Williams-Beuren syndrome deletion on chromosome 7q11.23 is associated with schizophrenia. Biol Psychiatry 75, 371-7 (2014).
11. Levinson, D.F. et al. Copy number variants in schizophrenia: confirmation of five previous findings and new evidence for 3q29 microdeletions and VIPR2 duplications. Am J Psychiatry 168, 302-16 (2011).
12. Rees, E. et al. Analysis of copy number variations at 15 schizophrenia-associated loci. Br J Psychiatry 204, 108-14 (2014).
13. Rees, E. et al. Evidence that duplications of 22q11.2 protect against schizophrenia. Mol Psychiatry 19, 37-40 (2014).
14. Kirov, G. et al. Support for the involvement of large CNVs in the pathogenesis of schizophrenia. Hum Mol Genet (2009).
15. Bergen, S.E. et al. Genome-wide association study in a Swedish population yields support for greater CNV and MHC involvement in schizophrenia compared to bipolar disorder. Mol Psychiatry 17, 880-6 (2012).
16. Moreno-De-Luca, D. et al. Deletion 17q12 is a recurrent copy number variant that confers high risk of autism and schizophrenia. Am J Hum Genet 87, 618-30 (2010).
17. Guha, S. et al. Implication of a rare deletion at distal 16p11.2 in schizophrenia. JAMA Psychiatry 70, 253-60 (2013).
18. Vacic, V. et al. Duplications of the neuropeptide receptor gene VIPR2 confer significant risk for schizophrenia. Nature 471, 499-503 (2011).
19. Szatkiewicz, J. et al. Copy number variation in schizophrenia in Sweden. Mol Psychiatry 19, 762-73 (2014).
Extended Data Figure 2: CNV burden for losses. A: Forest plot of CNV burden (genes affected) partitioned
by genotyping platform (Affymetrix and Illumina); x-axis: odds ratio (95% CI). B: CNV burden partitioned
by CNV frequency; x-axis: CNV count (singleton, 2-5, 6-10, 11-20, 21-40, 41-80, 81+); y-axis: odds ratio
(95% CI); series: all CNV, and previously implicated CNV removed.
Values listed in panel A:
Platform | Cases | Controls | Genes | pval | OR
PGC | 21094 | 20227 | 0.9 | 4.0e-16 | 1.4
omni | 9795 | 8422 | 0.9 | 2.1e-09 | 1.53
A6.0 | 7027 | 7980 | 0.8 | 1.9e-04 | 1.26
I550 | 1572 | 1705 | 0.9 | 0.073 | 1.38
A5.0 | 1130 | 1001 | 0.6 | 1.9e-04 | 1.89
I600 | 818 | 623 | 0.7 | 0.1 | 1.64
I300 | 410 | 285 | 0.8 | 0.237 | 1.46
A500 | 314 | 201 | 1.1 | 0.306 | 1.63
Extended Data Figure 3: CNV burden for gains. A: Forest plot of CNV burden (genes affected) partitioned
by genotyping platform (Affymetrix and Illumina); x-axis: odds ratio (95% CI). B: CNV burden partitioned
by CNV frequency; x-axis: CNV count (singleton, 2-5, 6-10, 11-20, 21-40, 41-80, 81+); y-axis: odds ratio
(95% CI); series: all CNV, and previously implicated CNV removed.
Values listed in panel A:
Platform | Cases | Controls | Genes | pval | OR
PGC | 21094 | 20227 | 2.1 | 2e-07 | 1.12
omni | 9795 | 8422 | 2.1 | 0.001 | 1.15
A6.0 | 7027 | 7980 | 2.2 | 0.014 | 1.08
I550 | 1572 | 1705 | 1.9 | 0.047 | 1.22
A5.0 | 1130 | 1001 | 1.8 | 0.013 | 1.24
I600 | 818 | 623 | 1.4 | 0.034 | 1.38
I300 | 410 | 285 | 2.4 | 0.772 | 0.93
A500 | 314 | 201 | 1.6 | 0.112 | 1.66
Extended Data Figure 4: CNV probe-level Manhattan plot: Manhattan plot of probe-level association
results from the SCZ residual phenotype. A: CNV losses. B: CNV gains. Genome-wide correction was
determined using the family-wise error rate (FWER) drawn from permutation.
Extended Data Figure 5: Permutation of NAHR-mediated CNVs: Permutation results from drawing
frequency-matched CNV loci and testing for the fraction of NAHR-mediated CNVs. To test for the
enrichment of NAHR-mediated loci in our suggestive results from the gene-based test, each permutation
selected an equivalent number of independent CNV loci and tested the fraction of NAHR-mediated CNVs.
Histogram for gains + losses; x-axis: fraction of NAHR-mediated CNVs (0 to 1); y-axis: frequency;
p = 0.008.
Extended Data Figure 6: Xq28 CNV hotspot: A: Protective CNV gain association peak around the MAGEA11
and TMEM185A genes, both within an intron of the HSFX1 gene. B: Risk CNV gain association peak at the
distal end of Xq28 overlapping ten genes.
Extended Data Figure 7: SCZ phenotype residual distribution: X-axis: distribution of phenotype residual
values after regressing case/control status on selected covariates, plotted against overall CNV Kb
burden (Y-axis) to visually inspect whether individuals with large residuals have an excess of CNV
burden, which can lead to more false positive associations. SCZ cases have positive residual values and
controls have negative residual values.
Extended Data Figure 8: Detection power for CNV losses: Power is the proportion of simulated causal CNV
loci detected (i.e. surpassing genome-wide FWER correction) using probe-level association. Each graph
plots power across various MAF (x-axis) and genotype relative risk (GRR: colored lines). Simulations use
the sample size and FWER cutoff from the current study.
Two panels titled "SNP-level detection power"; x-axis: MAF (1.0e-04 to 0.01 in one panel, 1.0e-04 to
0.001 in the other); y-axis: proportion surpassing genome-wide cutoff (0 to 1); GRR lines: 1.5, 2, 3, 5,
10, 20, 50, 100, Inf.
Methods
Overview
We assembled a CNV analysis group with members from the Broad Institute, Children’s Hospital of
Philadelphia, University of Chicago, University of California San Diego, University of Michigan,
University of North Carolina, Colorado University Boulder, and University of Toronto/SickKids Hospital.
Our aim was to leverage the extensive expertise of the group to develop a fully automated centralized
pipeline for consistent and systematic calling of CNVs for both Affymetrix and Illumina platforms. An
overview of the analysis pipeline is shown in Extended Data Figure 1. After an initial data formatting
step we constructed batches of samples for processing using four different methods, PennCNV, iPattern,
C-score (GADA and HMMSeg) and Birdsuite, for Affymetrix 6.0. For Affymetrix 5.0 data we used Birdsuite
and PennCNV, for Affymetrix 500 we used PennCNV and C-score, and for all Illumina arrays we used PennCNV
and iPattern. We then constructed a consensus CNV call dataset by merging data at the sample level and
further filtered calls to make a final dataset (Extended Data Table 2). Prior to any filtering, we
processed raw genotype calls for a total of 57,577 individuals, including 28,684 SCZ cases and 28,893
controls.
Study Sample
A complete list of datasets that were included in the current study can be found in Extended Data
Table 2. A more detailed description of the original studies can be found in a previous publication [1].
Copy Number Variant Analysis Pipeline Architecture and Sample Processing
All aspects of the CNV analysis pipeline were built on the Genetic Cluster Computer (GCC) in the
Netherlands. PGC members sent external drives of raw data to the Netherlands for upload to the server,
as well as the corresponding sample metadata files.
Input Acceptance and Preprocessing: For Affymetrix we used the *.CEL files (all converted to the same
format) as input, whereas for Illumina we required GenomeStudio or BeadStudio exported *.txt files with
the following values: Sample ID, SNP Name, Chr, Position, Allele1 - Forward, Allele2 - Forward, X, Y,
B Allele Freq and Log R Ratio. Samples were then partitioned into ‘batches’ to be run through each
pipeline. For Affymetrix samples we created analysis batches based on the plate ID (if available) or
genotyping date. Each batch had approximately 200 samples with an equal mix of male and female samples.
Affymetrix Power Tools (APT, apt-copynumber-workflow) was then used to calculate summary statistics
about the chips analyzed. Gender mismatches were identified and excluded, as were experiments with
MAPD > 0.4. For Illumina data, we first determined the genome build, converted to hg18 if necessary,
and created analysis batches based on the plate ID or genotyping date. Each batch had approximately 200
samples and an equal mix of male and female samples.
Composite Pipeline: The composite pipeline comprises the CNV callers PennCNV [2], iPattern [3],
Birdsuite [4] and C-Score [5] organized into component pipelines. We used all four callers for
Affymetrix 6.0 data, PennCNV and C-Score for Affymetrix 500, Birdsuite and PennCNV for Affymetrix 5.0,
and PennCNV and iPattern for Illumina arrays. Probe annotation files were preprocessed for each
platform. Once the array design files and probe annotation files were pre-processed, each component
pipeline was run in two steps: 1) processing the intensity data by the core pipeline process to produce
CNV calls, and 2) parsing the specific output format of the core pipeline and converting the calls to a
standard form designed to capture confidence scores, copy number states and other information computed
by each pipeline.
Merging of CNV data and Quality control filtering
Merging of CNV data: After standardization of outputs from each algorithm, CNV calls from each
algorithm were merged at the sample level to increase specificity [3]. For CNVs generated from the
Affymetrix 6.0 array, we took the intersection of the four outputs (Birdsuite, iPattern, C-Score,
PennCNV) at the sample level to create a consensus CNV. For the Affymetrix 500, Affymetrix 5.0, and
Illumina platforms, CNV merging was performed by taking the intersection of the calls made by the two
algorithms (PennCNV and C-Score for Affymetrix 500, Birdsuite and PennCNV for Affymetrix 5.0, and
iPattern and PennCNV for Illumina) at the sample level. CNV calls that were made by only one of the
algorithms were excluded. Calls discordant for type of CNV (gain or loss) were also excluded.
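The merging rule for two callers can be illustrated with a minimal sketch. It assumes each call is a hypothetical `(start, end, type)` tuple for one chromosome of one sample, and that the consensus segment is the overlapping interval; the text does not specify how consensus boundaries were set, so this is an illustration and not the pipeline's implementation.

```python
from typing import List, Tuple

# Hypothetical call layout: (start_bp, end_bp, cnv_type) for one sample and chromosome.
Call = Tuple[int, int, str]

def intersect_calls(caller_a: List[Call], caller_b: List[Call]) -> List[Call]:
    """Keep only segments called by both algorithms with the same CNV type."""
    consensus = []
    for a_start, a_end, a_type in caller_a:
        for b_start, b_end, b_type in caller_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            # Require a real overlap and concordant type (gain vs loss).
            if start < end and a_type == b_type:
                consensus.append((start, end, a_type))
    return sorted(consensus)

# Illustrative calls only (not data from the study).
penncnv = [(100_000, 250_000, "loss"), (400_000, 450_000, "gain"), (900_000, 950_000, "gain")]
ipattern = [(120_000, 260_000, "loss"), (400_000, 450_000, "loss")]
consensus = intersect_calls(penncnv, ipattern)
print(consensus)
```

In this toy input the second call is dropped because the two callers disagree on type, and the third because only one caller made it.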
Quality control filtering: Following merging we applied filtering criteria for removal of arrays with
excessive probe variance or GC bias and removal of samples with mismatches in gender or ethnicity or
with chromosomal aneuploidies. For Affymetrix data, we extracted the MAPD and waviness-sd from the APT
summary file. We also calculated the proportion of each chromosome (excluding chrY) tagged as copy
number variable and computed the number of CNV calls made for each sample. We then retained experiments
if each of these measures was within 3 SD of the median. For Illumina data, we extracted LRR SD, BAF SD
and GCWF (waviness) from PennCNV log files. As with the Affymetrix data, we calculated the proportion of
each chromosome (excluding chrY) tagged as copy number variable and computed the number of CNV calls
made for each sample. We retained samples if each of the above measures was within 3 SD of the median.
For both Illumina and Affymetrix datasets, large CNVs that appeared artificially split were combined if
one of the methods detected a CNV spanning the gap. However, samples where >10% of the chromosome was
copy number variable were excluded as possible aneuploidies. Further, we excluded CNVs that: 1) spanned
the centromere or overlapped the telomere (100 kb from the ends of the chromosome); 2) had >50% of their
length overlapping a segmental duplication; or 3) had >50% overlap with immunoglobulin or T cell
receptor regions. The final filtered CNV dataset was annotated with RefSeq genes (transcripts and
exons). After this stage of quality control (QC), we had a total of 52,511 individuals, with 27,034 SCZ
cases and 25,448 controls.
Filtering for rare CNVs: To make our final dataset of rare CNVs for all subsequent analyses, we
universally filtered out variants present at >=1% frequency (50% reciprocal overlap) in cases and
controls combined. CNVs that overlapped >50% with regions tagged as copy number polymorphic on any
other platform were also excluded. CNVs <20 kb or having fewer than 10 probes were also excluded.
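The 50% reciprocal overlap criterion and the size filters can be sketched as follows. The function names, the interval representation, and the simplification of one CNV per sample at the locus are assumptions made for illustration; only the thresholds (50%, 20 kb, 10 probes, 1%) come from the text.

```python
def reciprocal_overlap(a, b, min_frac=0.5):
    """True if the shared segment covers at least min_frac of BOTH intervals."""
    (a_start, a_end), (b_start, b_end) = a, b
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return False
    return shared >= min_frac * (a_end - a_start) and shared >= min_frac * (b_end - b_start)

def passes_size_filters(length_bp, n_probes):
    """Drop CNVs shorter than 20 kb or supported by fewer than 10 probes."""
    return length_bp >= 20_000 and n_probes >= 10

def cnv_frequency(target, all_cnvs, n_samples):
    """Fraction of samples carrying a CNV that reciprocally overlaps the target.

    Assumes at most one CNV per sample in all_cnvs (a simplification).
    """
    carriers = sum(reciprocal_overlap(target, other) for other in all_cnvs)
    return carriers / n_samples

# Illustrative intervals only.
target = (0, 100_000)
others = [(0, 100_000), (40_000, 140_000), (90_000, 300_000)]
freq = cnv_frequency(target, others, n_samples=1000)
is_rare = freq < 0.01
print(freq, is_rare)
```

Requiring the overlap to hold in both directions prevents a small CNV nested inside a much larger one from being counted as the same variant.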
Post-CNV Calling QC
Overview: A number of steps were undertaken after CNV calling and initial filtering QC to minimize the
impact of technical artifacts and potential confounds. In summary, we removed individuals not present in
the PGC2 GWAS analysis [1], removed datasets with non-matching case or control samples that could not be
reconciled using consensus platform probes, and removed any additional outliers with respect to overall
CNV burden, CNV calling metrics, or SCZ phenotype residuals. All steps are described in more detail
below.
Merging with GWAS cohort: By matching the unique sample identifiers, we retained only individuals that
also passed QC filtering from the companion PGC GWAS study in schizophrenia [1]. This step filtered out
samples with low-quality SNP genotyping, related individuals, and repeated samples across cohorts. An
additional benefit of the PGC analytical framework is the ability to account for population
stratification across cohorts using principal components derived from probe level analysis. After the
post-CNV calling quality control steps described below, we re-calculated principal components using the
Eigenstrat software package [6]. Sample information and subsequent CNV and GWAS filtered sample sets are
presented in Extended Data Table 2. In the process of matching to the GWAS-specific cohort, all
individuals of non-European ancestry were removed from analysis (~5.8% of the post-QC sample, comprising
three separate datasets). We also removed 42 samples that had discordant phenotype designations between
the GWAS analysis and CNV genotype submission.
Individual dataset removal: Some datasets submitted to the PGC consisted of only case or control
samples, affected trios, or recruited external samples as controls. This asymmetry in case-control
ascertainment and genotyping can present serious biases for CNV analysis, as the sensitivity to detect
CNVs will vary considerably across genotyping platforms, as well as within dataset and genotyping batch.
Unlike imputation protocols commonly used for SNP genotyping, there is no equivalent process to infer
unmeasured probe intensity from nearby markers. We took a number of steps to identify and remove
datasets that showed strong signs of case-control ascertainment or genotyping asymmetry:
1) Identify genotyping platforms where the case-control ratio was not between 40-60%.
2) Where possible, merge similar genotyping platforms using consensus probes prior to the CNV-calling
pipeline in order to improve the case-control ratio.
3) Examine overall CNV burden and association peaks for spurious results.
4) Remove datasets that remain problematic due to unusual CNV burden or multiple spurious CNV
associations.
The genotyping platforms identified and processed are listed in Extended Data Table 2. We were able to
combine the Illumina OmniExpress and Illumina OmniExpress plus Exome Chip platforms with success by
removing probe content specific to the Exome chip platform. We removed the caws Affymetrix 500 datasets
due to a number of strong CNV association peaks not seen in any other dataset. We also removed the fii6
dataset due to a 2-fold CNV burden in cases relative to controls. In order to improve case-control
balance, we had to remove the affected proband trio datasets (boco, lacw, and lemu) in the Illumina 610
platform, and the control-only uclo dataset in the Affymetrix 500 platform.
Individual sample removal: We re-analyzed CNV burden estimates in the reduced sample to flag any
lingering outliers missed in the initial QC. We identified outliers for CNV count and Kb burden in the
autosome (>30 CNVs or 8 Mb, respectively) and in the X chromosome (>10 CNVs or 5 Mb, respectively),
removing an additional 15 individuals. Genome-wide CNV intensity and quality measurements produced by
CNV calling algorithms (i.e. “CNV metrics”) were examined for additional outliers and potential
relationships with case-control status. Each CNV metric was re-examined across studies to assess
whether any additional outliers were present. Only three outliers were removed, as their mean B allele
(or minor allele) frequency deviated significantly from 0.5. Many CNV metrics are auto-correlated, as
they measure similar patterns of variation in the probe intensity. Thus, we focused on the main
intensity metrics: median absolute pairwise difference (MAPD) for projects genotyped on the Affymetrix
6.0 platform, and Log R Ratio standard deviation (LRRSD) on all other genotyping platforms. Among
Affymetrix 6.0 datasets, MAPD did not differ between cases and controls (t=1.14, p=0.25). However,
among non-Affymetrix 6.0 datasets, LRRSD showed significant differences between cases and controls
(t=-35.3, p<2e-16), with controls having a higher standardized mean LRRSD (0.227) than cases (-0.199).
To control for any spurious associations driven by CNV calling quality, we included LRRSD (MAPD for
Affymetrix 6.0 platforms) as a covariate in downstream analysis. CNV metrics were normalized within
their genotyping platform prior to inclusion in the combined dataset.
Regression of potential confounds on case-control ascertainment
The PGC cohorts are a combination of many datasets drawn from the US and Europe, and it is important to
ensure that any bias in sample ascertainment does not drive spurious association to SCZ. In order to
ensure the robustness of the analysis, we controlled for a number of covariates that could potentially
confound results. Burden and gene-set analyses included covariates in a logistic regression framework.
Due to the number of tests run at probe level association, we employed a step-wise logistic regression
approach to allow for the inclusion of covariates in our case-control association, which we term the
SCZ residual phenotype.
Covariates include sex, genotyping platform, CNV metrics, and ancestry principal components derived from
SNP genotypes on the same samples in a previous study [1]. We were unable to control for dataset or
genotyping batch, as a subset of the contributing datasets are fully confounded with case/control
status. The CNV metric was normalized within genotyping platform prior to inclusion in the logistic
model. Only principal components that showed a significant association to small CNV burden were used
(small CNV burden being defined as autosomal CNV burden with CNV < 100 kb in size). Among the top 20
principal components, only the 1st, 2nd, 3rd, 4th, and 8th principal components showed association with
small CNV burden (with p < 0.01 used as the significance cutoff). To calculate the SCZ residual
phenotype, we first fit a logistic regression model of covariates to affection status, and then
extracted the Pearson residual values for use in a quantitative association design for downstream
analyses. Residual phenotype values in cases are all above zero, and in controls all below zero, and are
graphed against overall Kb burden in Extended Data Figure 7. We removed three individuals with an SCZ
residual phenotype greater than three (or below negative three in controls). After the post-processing
round of QC, we retained a dataset with a total of 41,321 individuals comprising 21,094 SCZ cases and
20,227 controls.
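The sign property of the residual phenotype follows directly from the standard definition of the Pearson residual for a binary outcome, (y - p) / sqrt(p(1 - p)), where p is the fitted probability of being a case. The sketch below assumes that definition and uses made-up fitted probabilities; it does not refit the study's covariate model.

```python
import math

def pearson_residual(y: int, p_hat: float) -> float:
    """Pearson residual of a binary outcome y (1 = case, 0 = control)
    given the fitted case probability p_hat from a logistic model."""
    return (y - p_hat) / math.sqrt(p_hat * (1.0 - p_hat))

# Hypothetical fitted probabilities, for illustration only.
case_typical = pearson_residual(1, 0.60)      # positive, since y - p_hat > 0
control_typical = pearson_residual(0, 0.60)   # negative, since y - p_hat < 0
case_unexpected = pearson_residual(1, 0.05)   # case the covariates predict as a control

print(round(case_typical, 3), round(control_typical, 3), round(case_unexpected, 3))
```

A case whose covariates strongly predict control status receives a large positive residual, which is the kind of extreme value that the cutoff of three is meant to remove.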
Identifying previously implicated CNV loci in the literature
To delineate CNV burden effects coming from CNV loci that have previously been reported as putative SCZ
risk factors from those of CNVs in the remainder of the genome, we flagged CNV loci with p < 0.01 that
have either been reviewed [7,8] or otherwise reported [8-10] as potential SCZ risk factors in the
literature. Previously reported loci meeting inclusion criteria are listed in Extended Data Table 1.
While a number of CNV loci have been reported in multiple studies, we sought the most recent reports
that incorporated the largest sample sizes. To identify putatively associated CNV loci with SCZ from the
full list, we applied the genome-wide p-value cutoff of 8e-5, derived from the Cochran-Mantel-Haenszel
(CMH) test in the current probe-level analysis, as the p-value cutoff for inclusion as SCZ implicated
CNV loci. While the CMH test is not the primary probe-level test in the current PGC analysis, it
corresponds more closely to the tests used in published reports. In all, nine independent CNV loci from
published reports surpass genome-wide correction. All published CNV loci, even those excluded as SCZ
implicated regions, are examined in the probe-level association analysis.
CNV burden analysis
We analyzed the overall CNV burden in a variety of ways to discern which general properties of CNVs
contribute to SCZ risk. Overall individual CNV burden was measured in 3 distinct ways: 1) Kb burden of
CNVs, 2) number of genes affected by CNVs, and 3) number of CNVs. In particular, we only counted a gene
as affected when the CNV overlapped a coding exon. We also partitioned our analyses by CNV type, size,
and frequency. CNV type is defined as copy number losses (or deletions), copy number gains (or
duplications), and both copy number losses and gains. To assign a specific allele frequency to a CNV, we
used the --cnv-freq-method2 command in PLINK, whereby the frequency is determined as the total number of
CNVs overlapping the target CNV segment by at least 50%. This method differs from other methods that
assign CNV frequencies by genomic region, whereby a single CNV spanning multiple regions may be included
in multiple frequency categories.
For Figure 1 and Extended Data Figures 2 and 3, we partitioned CNV burden by genotyping platform, and
the abbreviations for each platform are expanded below:
A500: Affymetrix 500
I300: Illumina 300K
I600: Illumina 610K and Illumina 660W
A5.0: Affymetrix 5.0
A6.0: Affymetrix 6.0
omni: OmniExpress and OmniExpress plus Exome
Due to the small size of the Omni 2.5 array sample (28 cases and 10 controls), it was excluded from
presentation in the figure, but it is included in all burden analyses with the total PGC sample. Burden
tests use a logistic regression framework with the inclusion of covariates detailed above. Using a
logistic regression framework, we predicted SCZ status using CNV burden as an independent predictor
variable, thus allowing us to get an accurate estimate of the unique contribution of CNV burden in a
multiple regression framework. To gain insight into the proportion of CNV burden risk coming from loci
outside of the previously implicated SCZ regions, we ran all burden analyses after removing CNVs that
overlapped previously implicated CNV boundaries by more than 10%.
CNV probe level association
Association with CNV signals was tested genome-wide at each respective CNV. Probe level tests were
examined at the start, the end, and the single base position after the end of each called CNV. Three
categories of CNV were tested: CNV deletions, CNV duplications, and deletions and duplications together.
All analyses were run using PLINK software [11].
We ran probe level association using the SCZ residual phenotype as a quantitative variable, with
significance determined through permutation of phenotype residual labels. An additional z-scoring
correction, explained below, is used to control for any extreme values in the SCZ residual phenotype and
to efficiently estimate two-sided empirical p-values for highly significant loci. To guard against the
potential loss of power from the inclusion of covariates, we also ran a single degree of freedom
Cochran-Mantel-Haenszel (CMH) test stratified by genotyping platform, with a 2 (CNV carrier status) x 2
(phenotype status) x N (genotyping platform) contingency matrix. While the CMH test does not account for
more subtle biases that could drive false positive signals, it is robust to signals driven by a single
platform and allows each CNV carrier to be treated equally. Loci that surpassed genome-wide correction
in either test were followed up for further evaluation.
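For readers unfamiliar with the stratified test, the following is a minimal sketch of the textbook one degree of freedom CMH statistic over a set of 2 x 2 tables, one per genotyping platform. It omits the continuity correction (an assumption; the text does not say which variant was used) and is not the PLINK implementation. The counts are invented for illustration.

```python
import math

def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over strata.

    tables: one (a, b, c, d) tuple per stratum, where
      a = carrier cases, b = carrier controls,
      c = non-carrier cases, d = non-carrier controls.
    Returns (chi-square statistic with 1 df, p-value).
    """
    diff_sum, var_sum = 0.0, 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # stratum too small to contribute
        diff_sum += a - (a + b) * (a + c) / n          # observed minus expected
        var_sum += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if var_sum == 0:
        return 0.0, 1.0  # no carriers (or no variation) in any stratum
    stat = diff_sum ** 2 / var_sum
    p_value = math.erfc(math.sqrt(stat / 2.0))  # upper tail of chi-square, 1 df
    return stat, p_value

# Two hypothetical platforms, each with more carriers among cases.
stat, p_value = cmh_test([(10, 2, 990, 998), (6, 1, 494, 499)])
print(round(stat, 2), p_value)
```

Because observed and expected carrier counts are compared within each platform before summing, a platform with an unbalanced case-control ratio does not by itself create a signal.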
Z-score recalibration of empirical testing: Probe level association p-values from the SCZ residual
phenotype were initially obtained by performing one million permutations at each CNV position, whereby
each permutation shuffles the SCZ residual phenotype among all samples and retains the SCZ residual mean
for CNV carriers and non-carriers. For extremely rare CNVs, however, CNV carriers at the extreme ends of
the SCZ residual phenotype can produce highly significant p-values. While we understand that such rare
events are unable to surpass strict genome-wide correction, we wanted to retain all tests to help
delineate the potential fine-scale architecture within a single region of association. To properly
account for the increased variance when only a few individuals are tested, we applied an empirical
Z-score correction to the CNV carrier mean. In order to get an empirical estimate of the variance for
each test, we calculated the standard deviation of residual phenotype mean differences in CNV carriers
and non-carriers from 5,000 permutations. Z-scores are calculated as the observed case-control mean
difference divided by the empirical standard deviation, with corresponding p-values calculated from the
standard normal distribution. Concordance of the initial empirical and Z-score p-values is close to
unity for association tests with six or more CNVs, whereas Z-score p-values are more conservative among
tests with fewer than six CNVs. Furthermore, the Z-score method naturally provides an efficient way to
estimate highly significant empirical p-values that would otherwise require hundreds of millions of
permutations.
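A minimal sketch of the Z-score step is given below. It assumes the test statistic is the difference in mean residual phenotype between CNV carriers and non-carriers, and that the null standard deviation is estimated by shuffling the residuals; the function name, the toy data, and the fixed random seed are illustrative choices, not part of the study's code.

```python
import math
import random
import statistics

def zscore_pvalue(residuals, is_carrier, n_perm=5000, seed=0):
    """Two-sided p-value from an empirically scaled Z-score.

    residuals: residual phenotype value per sample.
    is_carrier: matching list of booleans marking CNV carriers.
    """
    rng = random.Random(seed)

    def mean_difference(values):
        carriers = [v for v, c in zip(values, is_carrier) if c]
        others = [v for v, c in zip(values, is_carrier) if not c]
        return statistics.fmean(carriers) - statistics.fmean(others)

    observed = mean_difference(residuals)
    shuffled = list(residuals)
    null_differences = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)                 # break the carrier-phenotype link
        null_differences.append(mean_difference(shuffled))

    z = observed / statistics.stdev(null_differences)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value

# Toy data: 100 "cases" (+1), 100 "controls" (-1), 8 carriers, all among cases.
residuals = [1.0] * 100 + [-1.0] * 100
is_carrier = [True] * 8 + [False] * 192
z, p_value = zscore_pvalue(residuals, is_carrier)
print(round(z, 2), p_value)
```

Only the spread of the permuted differences is needed, so a few thousand permutations suffice even when the resulting p-value is far smaller than 1/5,000.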
Genome-wide correction for multiple tests
Beyond identifying significant CNVs at the probe level, we also estimated the genome-wide testing space
for rare CNV analysis. With the large PGC cohort being called through a consistent pipeline, we saw an
opportunity to characterize the null expectation of segregating and recurrent de novo rare CNVs in
populations of European ancestry. Accepted thresholds for significance among published risk CNVs have
been limited in scope, as accurate population estimates of rare CNV frequency and distribution across
the genome require large representative samples.
Genome-wide significance thresholds were calculated using the 5% family-wise error rate from 5,000
permutations in both the SCZ residual phenotype and CMH test. Specifically, we selected the 95th
percentile of the minimum p-values obtained across permutations. Below are the genome-wide correction
p-value thresholds determined in this manner:
SCZ residual phenotype FWER correction:
CNV losses and gains: 6.73e-6
CNV losses: 1.5e-5
CNV gains: 1.35e-5
CMH test FWER correction:
CNV losses and gains: 3.65e-5
CNV losses: 8.25e-5
CNV gains: 7.8e-5
This method differs slightly from those used in Levinson et al. [9] to estimate the multiple test
correction for rare CNVs; however, their genome-wide correction of p = 1e-5 corresponds quite closely to
the estimates observed using the SCZ residual phenotype. The observed family-wise correction serves as a
good approximation of the independent rare CNV signals found among European ancestry populations for
array-based CNV capture, but as sample sizes increase, so too will the effective number of tests,
necessitating further evaluation of the multiple testing burden.
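The permutation-based threshold can be sketched as follows. The sketch assumes that each permutation contributes its single smallest genome-wide p-value, and that the threshold is the value which only 5% of those minima fall at or below (that is, the 95th percentile when minima are ranked from least to most significant). The function name and the toy input are hypothetical.

```python
import math

def fwer_threshold(min_pvalues, alpha=0.05):
    """Genome-wide p-value threshold controlling the family-wise error rate.

    min_pvalues: the smallest p-value observed in each permuted genome scan.
    Returns the p-value that a fraction alpha of permutation minima reach.
    """
    ordered = sorted(min_pvalues)
    # Index of the alpha-quantile; the small epsilon guards against float rounding.
    k = max(math.floor(alpha * len(ordered) + 1e-9) - 1, 0)
    return ordered[k]

# Toy minima from 100 hypothetical permutations: 0.001, 0.002, ..., 0.100.
toy_minima = [i / 1000 for i in range(1, 101)]
threshold = fwer_threshold(toy_minima)
print(threshold)
```

With 5,000 permutations, this picks the 250th smallest of the permutation minima.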
Gene-set burden enrichment analysis: gene-sets
Gene-sets with an a priori expectation of association to neuropsychiatric disorders were compiled based
on gene annotations (Gene Ontology and curated pathway databases, downloaded June 2013) and published
article materials (for details, see Extended Data Table 3). Gene-sets based on brain expression were
compiled by processing the BrainSpan RNA-seq gene expression data-set
(http://www.brainspan.org/static/download.html, downloaded Sept 2012). Four roughly equally sized
gene-sets (about 4600 genes each) were derived to represent four expression tiers (very high,
medium-to-high, medium-to-low, very low or absent); genes were selected if they passed a fixed
expression threshold in at least 5/508 experimental data points (corresponding to different regions of
donor brains, different donor ages corresponding to different developmental brain stages, and different
donor sexes). Gene-sets based on mouse phenotypes were assembled by downloading MPO (Mammalian Phenotype
Ontology) annotations from MGI (www.informatics.jax.org, downloaded August 2013), up-propagating
annotations following ontology relations, and mapping to human orthologs using NCBI Homologene
(www.ncbi.nlm.nih.gov/homologene); finally, top-level organ systems with fewer genes were aggregated
while striving to preserve biological homogeneity, so as to have roughly equal-sized sets (1,300-2,600
genes). For all gene-sets, gene identifiers in the primary source were mapped to Entrez-gene identifiers
using the R/Bioconductor package org.Hs.eg.db.
Gene-set burden enrichment analysis: pre-processing
Subjects were restricted to the ones with at least one rare CNV. For copy number gains and losses, we
separately calculated the following subject-level totals: variant number, variant length and number of
genes impacted; these covariates are then used to model global burden and correct gene-set burden to
ensure it is specific (i.e. not a mere reflection of genome-wide burden with some stochastic deviation
due to sampling). The subject-level total number of genes impacted was also calculated for each
gene-set, again separately for gains and losses. Subjects were flagged if they carried at least one CNV
matching a locus previously implicated in schizophrenia (see section “Identifying previously implicated
CNV loci in the literature”); this was then used to analyze gene-set burden for all subjects, or
excluding subjects with an already implicated CNV.
Gene-set burden enrichment analysis: statistical test
For each gene-set, we fit the following logistic regression model (as implemented by the R function glm
of the stats package), where subjects are statistical sampling units:
y ~ covariates + global + gene-set
Where:
• y is the dichotomous outcome variable (schizophrenia = 1, control = 0)
• covariates is the set of variables also used as covariates in the genome-wide burden and probe
association analysis (sex, genotyping platform, CNV metric, and CNV associated principal components)
• global is the measure of global burden; for the results in the main text, we used the total gene
number (abbreviated as U from universe gene-set count); we also calculated results for total length
(abbreviated as TL) and variant number plus variant mean length (abbreviated as CNML)
• gene-set is the gene-set gene count
The gene-set burden enrichment was assessed by performing a chi-square deviance test (as implemented by
the R function anova.glm of the stats package) comparing these two regression models:
y ~ covariates + global
y ~ covariates + global + gene-set
We reported the following statistics:
• coefficient beta estimate (abbreviated as Coeff)
• Student's t distribution-based coefficient significance p-value (as implemented by the R function
summary.glm of the stats package, abbreviated as Pvalue_glm)
• deviance test p-value (abbreviated as Pvalue_dev)
• gene-set size (i.e. number of genes in the gene-set, regardless of CNV data)
• BH-FDR (Benjamini-Hochberg False Discovery Rate)
• percentage of schizophrenia and control subjects with at least 1 gene, 2 genes, etc… impacted by a CNV
of the desired type (loss or gain) in the gene-set (abbreviated as SZ_g1n, SZ_g2n, … CT_g1n, …)
Please note that, by performing simple simulation analyses, we found that Pvalue_glm can be extremely
over-conservative in the presence of very few gene-set counts different from 0, while Pvalue_dev tends
to be slightly under-conservative. While the two p-values tend to agree well for gene-set analysis,
Pvalue_glm is systematically over-conservative for gene analysis, since smaller counts are typically
available for single genes.
Gene burden analysis: pre-processing
Subjects were restricted to the ones with at least one rare CNV. Only genes with at least a minimum
number of subjects impacted by CNVs were tested; this threshold was picked by comparing the BH-FDR to
the permutation-based FDR and ensuring limited FDR inflation (permuted FDR < 1.65 * BH-FDR at BH-FDR
threshold = 5%) while maximizing power. For gains the threshold was set to 12 counts, while for losses
it was set to 8 counts.
Gene burden analysis: statistical test
For each gene, we fit the following logistic regression model (as implemented by the R function glm of
the stats package), where subjects are statistical sampling units:
y ~ covariates + gene
Where:
• y is the dichotomous outcome variable (schizophrenia = 1, control = 0)
• covariates is the set of variables also used as covariates in the genome-wide burden and probe
association analysis (sex, genotyping platform, CNV metric, and CNV associated principal components)
• gene is the binary indicator for the subject having or not having a CNV of the desired type (loss or
gain) mapped to the gene
The gene burden was assessed by performing a chi-square deviance test (as implemented by the R function
anova.glm of the stats package) comparing these two regression models:
• y ~ covariates
• y ~ covariates + gene
Gene burden analysis: multiple test correction
Multiple test correction was performed for loci rather than for genes, to avoid the strong correlation
between tests introduced by multi-genic CNVs; for the same reason, it is more useful to count false
positives as loci rather than genes. We followed a greedy step-down procedure:
• start from the gene with the most significant deviance p-value, G1, and create locus L1
• remove from the gene list all genes that share at least 50% of their carrier subjects with G1, and add
them to locus L1
• do the same for the next most significant gene in the list (thus creating a new locus L2), and proceed
recursively until there is no gene left
• define the locus p-value as the smallest deviance p-value of its genes
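The step-down procedure above can be sketched as follows. The data layout (a dict of gene p-values and a dict of carrier-subject sets) and the choice to measure the shared fraction relative to each candidate gene's own carriers are assumptions made for illustration.

```python
def group_genes_into_loci(gene_pvalues, gene_carriers, min_shared=0.5):
    """Greedy grouping of genes into loci by shared carrier subjects.

    gene_pvalues: {gene: deviance p-value}
    gene_carriers: {gene: set of subject IDs carrying a CNV at that gene}
    Returns a list of loci, each with its member genes and locus p-value.
    """
    remaining = sorted(gene_pvalues, key=gene_pvalues.get)  # most significant first
    loci = []
    while remaining:
        lead = remaining.pop(0)
        lead_carriers = gene_carriers[lead]
        members, kept = [lead], []
        for gene in remaining:
            carriers = gene_carriers[gene]
            shared = len(carriers & lead_carriers) / len(carriers) if carriers else 0.0
            (members if shared >= min_shared else kept).append(gene)
        remaining = kept
        # The lead gene has the smallest p-value among its locus members.
        loci.append({"genes": members, "p": gene_pvalues[lead]})
    return loci

# Hypothetical genes and carriers, for illustration only.
pvalues = {"A": 1e-6, "B": 1e-4, "C": 1e-3}
carriers = {"A": {1, 2, 3, 4}, "B": {2, 3, 4, 5}, "C": {10, 11}}
loci = group_genes_into_loci(pvalues, carriers)
print(loci)
```

Here genes A and B collapse into one locus because three of B's four carriers also carry a CNV at A, while C forms its own locus.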
We computed permutation-based FDR by permuting subjects’ condition labels (schizophrenia, control), but
not covariates (as those are expected to correlate to CNV distribution), 1,000 times. The FDR was then
defined as the ratio between the average number of tests passing a given p-value threshold across the
1,000 permutations and the number of tests passing the same p-value threshold for real data. FDRs were
also generated counting only the subset of genes with positive and negative regression coefficients
(i.e. risk and presumed protective). The p-value threshold for permutation-based FDR calculation was
picked by choosing the maximum nominal p-value corresponding to a given BH-FDR threshold (e.g. 5%).
BH-FDR is expected to be slightly inflated because (i) the deviance test p-value is slightly
under-conservative in the presence of very few gene indicators different from 0, and (ii) we use the
smallest gene p-value to define the locus p-value.
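The FDR ratio defined above amounts to a short calculation, sketched here on invented p-values. Treating "passing" as p <= threshold and returning 0 when nothing passes in the real data are assumptions of this sketch.

```python
def permutation_fdr(real_pvalues, permuted_pvalue_sets, threshold):
    """Permutation-based FDR at a given p-value threshold.

    real_pvalues: locus p-values from the real case/control labels.
    permuted_pvalue_sets: one list of locus p-values per label permutation.
    """
    real_hits = sum(p <= threshold for p in real_pvalues)
    if real_hits == 0:
        return 0.0  # no discoveries, hence no false discoveries (convention)
    null_hits = [sum(p <= threshold for p in perm) for perm in permuted_pvalue_sets]
    mean_null_hits = sum(null_hits) / len(null_hits)
    return mean_null_hits / real_hits

# Hypothetical p-values: 5 loci, 4 permutations.
real = [0.001, 0.002, 0.03, 0.2, 0.5]
permuted = [
    [0.004, 0.3, 0.6, 0.7, 0.9],
    [0.2, 0.4, 0.5, 0.8, 0.9],
    [0.1, 0.3, 0.5, 0.6, 0.7],
    [0.006, 0.02, 0.4, 0.6, 0.8],
]
fdr = permutation_fdr(real, permuted, threshold=0.01)
print(fdr)
```

In this toy case an average of 0.5 permuted loci pass the threshold against 2 real loci, giving an estimated FDR of 25%.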
Methods References
1. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421-7 (2014).
2. Wang, K. et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res 17, 1665-74 (2007).
3. Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466, 368-72 (2010).
4. Korn, J.M. et al. Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet 40, 1253-1260 (2008).
5. McCarthy, S.E. et al. Microduplications of 16p11.2 are associated with schizophrenia. Nat Genet 41, 1223-7 (2009).
6. Price, A.L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38, 904-909 (2006).
7. Malhotra, D. & Sebat, J. CNVs: harbingers of a rare variant revolution in psychiatric genetics. Cell 148, 1223-41 (2012).
8. Rees, E. et al. Analysis of copy number variations at 15 schizophrenia-associated loci. Br J Psychiatry 204, 108-14 (2014).
9. Levinson, D.F. et al. Copy number variants in schizophrenia: confirmation of five previous findings and new evidence for 3q29 microdeletions and VIPR2 duplications. Am J Psychiatry 168, 302-16 (2011).
10. Bergen, S.E. et al. Genome-wide association study in a Swedish population yields support for greater CNV and MHC involvement in schizophrenia compared to bipolar disorder. Mol Psychiatry 17, 880-6 (2012).
11. Purcell, S. et al. PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Genet 81, 559-75 (2007).
PGC Schizophrenia CNV analysis – Supplementary Information
Supplementary Results:
• CNV burden between sexes
• Probe level power analysis
• Gene-based network analysis
• Follow up of significant CNV loci
• Proportion of variance in SCZ explained by top CNV loci
• NAHR enrichment in significant novel gene loci
Consortium Membership
Acknowledgements
Supplementary Results
CNV burden between sexes
Following recent evidence that ostensibly healthy females carry an increased burden of
rare CNVs (ref. 1), we examined whether this increased female burden existed in the current
PGC dataset. We used a logistic regression model predicting sex from CNV burden while
controlling for study covariates, as well as the Wilcoxon rank-sum test comparing male
to female CNV burden (ref. 1). Focusing on the significant findings in the previous paper, we
examined the burden in autosomal CNV count and genes affected among PGC controls
(9,856 males and 10,371 females). We do see an elevated CNV count in control females
(1.90 autosomal CNV rate) relative to males (1.87 autosomal CNV rate); however, this
difference is not significant in either the regression model (OR = 1.004, p = 0.66) or the
Wilcoxon rank-sum test (p = 0.1). We do, however, observe a marginally significant enrichment
when focusing on CNV loss count, where control females (0.99 autosomal CNV loss rate)
show a higher burden than control males (0.94 autosomal CNV loss rate; logistic
regression OR = 1.03, p = 0.05; Wilcoxon rank-sum test p = 3e-3). No single genotyping
platform seemed to drive the enrichment in females (data not shown), and we do not
observe any difference in CNV count when looking at CNV gains (logistic regression OR =
0.98, p = 0.18; Wilcoxon rank-sum test p = 0.56). Finally, no significant differences
between sexes were found using either test when examining the number of genes
affected, or when we include SCZ cases and controls (all p > 0.05).
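As a rough illustration of the rank-sum comparison described above, the following sketch implements a two-sided Wilcoxon rank-sum test using the normal approximation with a tie correction (per-sample CNV counts are heavily tied integers). It is not the code used in the study: the function name `rank_sum_test` and the toy counts are hypothetical, no continuity correction is applied, and the logistic regression with study covariates is not reproduced.

```python
import math
from collections import Counter

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns the U statistic for sample x and the two-sided p-value.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(x) + Counter(y)
    # Assign each distinct value the average (mid) rank of its tied block.
    rank_of = {}
    seen = 0
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = seen + (c + 1) / 2
        seen += c
    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical per-sample autosomal CNV counts (not study data).
females = [2, 3, 1, 4, 2, 5, 3]
males = [1, 2, 0, 3, 1, 2, 2]
u, p = rank_sum_test(females, males)
print(f"U = {u:.1f}, two-sided p = {p:.3f}")
```

With samples in the tens of thousands, as in the control set described above, the normal approximation is the usual choice; for very small samples an exact test would be preferable.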
Probe level power analysis
Because we restrict analysis to rare CNVs in the population (MAF < 0.01), many loci do not
have enough CNVs to surpass genome-wide correction for multiple testing, prompting
pathway and gene level analyses to achieve sufficient statistical power. To use a specific
example, the 3q29 deletion is fully penetrant in the current sample, with 16 SCZ carriers
and 0 controls (MAF = 3.8e-4) at the peak of association. Assuming no platform bias, this
leads to an uncorrected chi-square p-value of 8.9e-5, and a permuted p-value of 6.2e-5
testing association using SCZ phenotype residuals. Neither p-value, however, surpasses
its respective genome-wide significance cutoff for CNV deletions. While permutation
methods used to generate genome-wide cutoffs accurately reflect the testing space
among observed CNVs (very rare CNVs have little to no contribution to the family-wise
error rate), we wanted to estimate the proportion of CNVs detectable at the probe level.
Under our current analytical design and sample size, we calculated the power to detect
associated CNVs across various MAFs and effect sizes and determined the proportion of
association tests capable of surpassing genome-wide correction.
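The uncorrected chi-square p-value quoted above for the 3q29 deletion can be checked from the carrier counts alone. The sketch below assumes a standard 2x2 Pearson chi-square test without continuity correction and the sample totals given in the simulation description (21,094 cases and 20,227 controls); the function name is hypothetical, and the permutation-based p-value on residual phenotypes is not reproduced.

```python
import math

def chi_square_2x2(case_carriers, control_carriers, n_cases, n_controls):
    """Pearson chi-square test (1 df, no continuity correction) on carrier counts.

    Assumes at least one carrier and one non-carrier overall.
    """
    observed = [
        [case_carriers, n_cases - case_carriers],
        [control_carriers, n_controls - control_carriers],
    ]
    total = n_cases + n_controls
    row_totals = [n_cases, n_controls]
    carriers = case_carriers + control_carriers
    col_totals = [carriers, total - carriers]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p = chi_square_2x2(16, 0, 21094, 20227)
print(f"chi-square = {stat:.2f}, p = {p:.1e}")
```

Under these assumptions the result should land close to the reported uncorrected value, which illustrates why 16 carriers with no control carriers still falls short of a genome-wide threshold.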
We simulated CNVs within our dataset (21,094 cases and 20,227 controls) and regressed
them using the same association design with SCZ residual phenotypes. We simulated
various effect sizes by randomly sampling cases and controls at different probabilities as
CNV carriers, and rounded to the nearest CNV count to reflect the MAF of each CNV in
the sample. For each combination of effect size and MAF, we ran 1,000 simulations,
retrieving the t-test p-value of CNV carriers from the SCZ residual phenotype. Simulated
p-values behaved in much the same way as the Z-score correction on permuted
p-values used in the primary test (data not shown). In Extended Data Figure 8, we show
the proportion of simulations for CNV losses surpassing genome-wide correction at each
MAF and effect size parameter (gains perform similarly).
We define statistical power as the proportion of simulations surpassing genome-wide
significance. For a fully penetrant risk CNV, we require a MAF of ~6e-4 (or about 25 CNVs)
to achieve 80% detection power. For CNVs with a genotype relative risk (GRR) of 10, we
require a MAF of 1e-3 (or at least 41 CNVs) to achieve 80% detection power. Looking
across the landscape of CNVs tested, on the whole about 10% of deletion or duplication
CNV breakpoints reach a frequency greater than 1e-3 in the sample. On the other
extreme, a CNV with a MAF of 0.005 (or at least 206 CNVs) and a GRR of 2 will only be
detected 58% of the time.
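A simplified version of this power calculation can be sketched as follows. This is not the study's procedure: it replaces the t-test on SCZ residual phenotypes with a 2x2 chi-square test, treats a carrier's probability of being a case as GRR x cases / (GRR x cases + controls), and uses a placeholder significance threshold `alpha`, since the genome-wide cutoff is not stated in this section. All function and parameter names are hypothetical.

```python
import math
import random

N_CASES, N_CONTROLS = 21094, 20227  # sample sizes given in the text

def chi_square_p(case_carriers, control_carriers):
    """P-value of a 1-df Pearson chi-square test on carrier counts."""
    carriers = case_carriers + control_carriers
    total = N_CASES + N_CONTROLS
    stat = 0.0
    for obs, group_size in ((case_carriers, N_CASES), (control_carriers, N_CONTROLS)):
        exp_carriers = carriers * group_size / total
        exp_non_carriers = group_size - exp_carriers
        stat += (obs - exp_carriers) ** 2 / exp_carriers
        stat += ((group_size - obs) - exp_non_carriers) ** 2 / exp_non_carriers
    return math.erfc(math.sqrt(stat / 2))

def estimate_power(n_carriers, grr, alpha, n_sims=1000, seed=0):
    """Fraction of simulations whose p-value falls below alpha.

    grr=None models a fully penetrant CNV (every carrier is a case).
    """
    rng = random.Random(seed)
    if grr is None:
        p_case = 1.0
    else:
        # Assumed mapping from relative risk to P(carrier is a case).
        p_case = grr * N_CASES / (grr * N_CASES + N_CONTROLS)
    hits = 0
    for _ in range(n_sims):
        case_carriers = sum(rng.random() < p_case for _ in range(n_carriers))
        if chi_square_p(case_carriers, n_carriers - case_carriers) < alpha:
            hits += 1
    return hits / n_sims

# alpha below is a placeholder, not the study's genome-wide cutoff.
for n_carriers, grr in ((25, None), (41, 10), (206, 2)):
    print(n_carriers, grr, estimate_power(n_carriers, grr, alpha=1e-5))
```

Because both the test and the threshold differ from those used in the study, the printed values are not expected to match the 80% and 58% figures above; the sketch only shows how power is read off as the share of simulations passing a fixed cutoff.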
Gene-based network analysis
To identify a gene network enriched in schizophrenia risk genes, we queried
GeneMANIA (ref. 2) using the 17 genes with deletion gene-test Benjamini-Hochberg FDR <=
25% that are members of the “GO synaptic” or “ARC complex” sets. We thus created a
synaptic protein interaction network of 136 genes, with the most densely connected
network core corresponding to post-synaptic density organizers (DLGs, DLGAPs,
SHANKs) and ionotropic glutamate receptors (GRIAs, GRIDs, GRINs). NRXN1 is connected
to the network core via adhesion partners (NLGN1-3) and CASK. We tested this
schizophrenia gene network and found significant enrichment in genes with evidence of
de novo coding variants in sequencing studies of schizophrenia trios (ref. 3) (for frameshift,
stop-gain and splice-site: Fisher’s Exact Test p-value 0.0023; missense and amino acid
insertion/deletion: Fisher’s Exact Test p-value 0.0004); in addition, we found a greater
enrichment for this network compared to the larger set composed of all “GO synaptic”
and “ARC complex” genes. No significant enrichment was found for de novo variants
identified in controls.
Follow up of significant CNV loci
Both gene and probe level association follow a uniform testing framework across the
genome; however, risk loci may exhibit a more nuanced CNV architecture across the
entirety of the association peak. All associated loci with FDR < 0.05 in the gene-based test
were followed up for further testing, along with a small number of candidate loci
showing suggestive association in the probe-level association. We visually inspected
each association peak and determined the bp coordinates that encapsulate the
associated region, as well as which CNV segment inclusion criterion, be it covering exons or
overlapping a minimum percentage of the total region, most appropriately reflects the
association signal. To comprehensively examine the robustness and source of
association, we also ran additional tests controlling for individual dataset, splitting by
sex, and examining a dosage model, whereby copy number is coded as one copy
for a deletion, two copies for no CNV, and three copies for a duplication. We also examined
significant CNV loci in an unfiltered CNV call set, using CNVs called prior to the removal
of common CNVs (MAF > 1%) and CNVs overlapping segmental duplications.
We further evaluated the associated regions by determining the concordance of calls
within the call set with those determined by unsupervised clustering. Call set CNVs were
defined as CNVs with at least a 50% overlap with regions in Table 1. We restricted this
analysis to 26,959 samples across six cohorts (14,419 Affymetrix 6.0, 12,540 Illumina
platforms; 1.1:1 case:control ratio). Features for clustering included the median log R
ratio (mLRR) of the region and the median log R ratio of the chromosome on which the
locus resides, the latter controlling for large chromosomal abnormalities. We implemented
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) as found in the
Python scikit-learn library (http://scikit-learn.org) because of its high sensitivity for
detecting outliers in clusters. For each
novel region and within each cohort, genotypes were assigned to every sample based
on the DBSCAN-defined cluster. The cluster with the highest number of samples was
designated as reference and assumed to have a copy number of two. Other clusters
were flagged as gain or loss based on the average regional mLRR and its relation to the
reference regional mLRR. We removed clusters with average chromosomal mLRR
outside 3 SD from the reference. CNVs were considered concordant if they were flagged
non-reference by DBSCAN and present in the 41k call set, matching on CNV type. We
applied a locus-based call set concordance filter of >= 70%; one region, NPY4R, failed to
meet this requirement with a concordance of 0.1%. In addition, both proximal and distal
loci of ZNF600 were removed due to batch effects, which we defined as a significant
deviation from a Poisson distribution of call set calls per plate. Regions that passed both
concordance and batch effect filters are reported in Table 1.
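The genotype assignment that follows clustering can be sketched as below. The sketch starts from cluster labels such as those returned by `sklearn.cluster.DBSCAN(...).fit_predict` on the two mLRR features (with -1 marking noise points) and shows only the post-processing: picking the reference cluster, flagging gain or loss, and computing a concordance fraction. The function names, the treatment of noise points as no-calls, the use of the reference cluster's own spread for the 3 SD rule, and the concordance denominator (call set CNVs at the locus) are assumptions made for illustration, not details confirmed by the text.

```python
from collections import defaultdict
from statistics import mean, pstdev

def assign_genotypes(labels, regional_mlrr, chrom_mlrr, n_sd=3.0):
    """Map each sample to 'ref', 'gain', 'loss' or None (no call)."""
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        if label != -1:  # assumption: DBSCAN noise points are left uncalled
            members[label].append(idx)
    if not members:
        return [None] * len(labels)
    # The largest cluster is taken as the copy-number-two reference.
    ref = max(members, key=lambda k: len(members[k]))
    ref_regional = mean(regional_mlrr[i] for i in members[ref])
    ref_chrom = [chrom_mlrr[i] for i in members[ref]]
    ref_chrom_mean, ref_chrom_sd = mean(ref_chrom), pstdev(ref_chrom)
    calls = [None] * len(labels)
    for label, idxs in members.items():
        if label == ref:
            state = "ref"
        elif abs(mean(chrom_mlrr[i] for i in idxs) - ref_chrom_mean) > n_sd * ref_chrom_sd:
            state = None  # whole-chromosome shift: drop the cluster
        elif mean(regional_mlrr[i] for i in idxs) > ref_regional:
            state = "gain"
        else:
            state = "loss"
        for i in idxs:
            calls[i] = state
    return calls

def concordance(callset, cluster_calls):
    """Fraction of call set CNVs ('gain'/'loss') matched by the cluster call."""
    called = [(c, k) for c, k in zip(callset, cluster_calls) if c in ("gain", "loss")]
    if not called:
        return float("nan")
    return sum(c == k for c, k in called) / len(called)

# Toy data: six reference samples, two deletions, one noise point.
labels = [0, 0, 0, 0, 0, 0, 1, 1, -1]
regional = [0.01, -0.02, 0.00, 0.02, -0.01, 0.00, -0.55, -0.60, 0.30]
chrom = [0.00, 0.01, -0.01, 0.00, 0.01, -0.01, 0.00, 0.01, 0.00]
calls = assign_genotypes(labels, regional, chrom)
callset = [None] * 6 + ["loss", "loss", "gain"]
print(calls, concordance(callset, calls))
```

In this toy case the two deletion calls are confirmed by clustering while the gain call falls on a noise point, so the locus would fall just short of a 70% concordance filter.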
Proportion of variance in SCZ explained by top CNV loci
To measure the proportion of variance explained on the liability scale of SCZ, we
estimated the overall heritability of liability (or log RR genetic variance) explained by the
eight CNV loci surpassing genome-wide significance. All eight loci were collapsed into a
single signal. Two SCZ-affected individuals were found to carry two CNVs in these loci,
and their contribution was only counted once. In sum, we observed 298 SCZ patients
with a CNV in these regions (1.4% of the total SCZ-affected sample), and 29 controls
(0.1%; CMH stratified OR = 10.1). To estimate the variance in SCZ liability explained by
loci surpassing genome-wide correction, we calculated the heritability of liability using
the INDI-V online tool (cnsgenomics.com/software) described in ref. 4, using an overall
disease risk of 1% and a sibling recurrence risk of 8.85.
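The carrier frequencies quoted above follow directly from the counts, and a crude (unstratified) odds ratio gives a quick sanity check on the reported effect size. The sketch below assumes the case and control totals from the power analysis (21,094 and 20,227); it does not reproduce the Cochran-Mantel-Haenszel stratification, so its odds ratio need not equal the stratified value of 10.1, and it does not implement the INDI-V liability calculation. The function name is hypothetical.

```python
def carrier_summary(case_carriers, control_carriers, n_cases, n_controls):
    """Carrier frequencies in cases and controls and the crude odds ratio."""
    case_freq = case_carriers / n_cases
    control_freq = control_carriers / n_controls
    odds_cases = case_carriers / (n_cases - case_carriers)
    odds_controls = control_carriers / (n_controls - control_carriers)
    return case_freq, control_freq, odds_cases / odds_controls

case_freq, control_freq, crude_or = carrier_summary(298, 29, 21094, 20227)
print(f"cases: {case_freq:.1%}, controls: {control_freq:.1%}, crude OR: {crude_or:.2f}")
```

The crude estimate sits close to the stratified one, which suggests that stratification changes the combined signal only slightly.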
NAHR enrichment in significant novel gene loci
To test if novel significant loci (FDR < 0.05; Table 1) were enriched for NAHR events, we
performed a permutation test (n = 10,000) simulating the null distribution of
NAHR-mediated CNVs for a set of random loci. Each simulation randomly selected nine loci
taken from CNVs overlapping at least 50% with genes in the gene-set burden analysis.
These nine random loci were matched according to CNV call frequency to the nine novel
significant loci in Table 1. We then created windows around the start and end position of
every CNV overlapping a random locus. Start positions were expanded by -50 kb and
+5 kb, and end positions were expanded by -5 kb and +50 kb. We flagged CNVs as
NAHR-mediated when both the start and end expanded windows overlapped 1 kb segmental
duplications obtained from the hg18 build of the UCSC Table Browser
(https://genome.ucsc.edu/cgi-bin/hgTables). Every iteration reported the fraction of
NAHR-mediated CNVs, that is, the ratio of CNVs flagged as NAHR to the total number of
overlapping CNVs. We found an enrichment of NAHR-mediated CNVs in significant
novel loci when compared to the null distribution (86% NAHR-mediated, 6-fold
enrichment, p = 0.008).
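The NAHR flagging rule and the empirical p-value can be sketched as follows. The window offsets come from the text; everything else is an assumption for illustration: the function names are hypothetical, segmental duplications are given as plain (start, end) intervals on one chromosome, and the p-value uses the common (k + 1) / (n + 1) convention, which the text does not specify. The frequency-matched sampling of random loci is not shown.

```python
def overlaps(window, intervals):
    """True if the closed window overlaps any (start, end) interval."""
    lo, hi = window
    return any(start <= hi and end >= lo for start, end in intervals)

def is_nahr_mediated(cnv_start, cnv_end, segdups):
    """Both breakpoint windows must overlap a segmental duplication."""
    start_window = (cnv_start - 50_000, cnv_start + 5_000)
    end_window = (cnv_end - 5_000, cnv_end + 50_000)
    return overlaps(start_window, segdups) and overlaps(end_window, segdups)

def nahr_fraction(cnvs, segdups):
    """Share of CNVs flagged as NAHR-mediated."""
    return sum(is_nahr_mediated(s, e, segdups) for s, e in cnvs) / len(cnvs)

def empirical_p(observed, null_fractions):
    """One-sided permutation p-value for enrichment."""
    as_extreme = sum(f >= observed for f in null_fractions)
    return (as_extreme + 1) / (len(null_fractions) + 1)

# Toy example on a single chromosome (coordinates in bp, not real data).
segdups = [(90_000, 100_000), (505_000, 520_000)]
cnvs = [(120_000, 500_000), (300_000, 400_000)]
print(nahr_fraction(cnvs, segdups))
```

The asymmetric windows reach further outward from each breakpoint than inward, so flanking duplications are captured even when the called boundaries fall somewhat inside the true breakpoints.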
Consortium Membership
Wellcome Trust Case-Control Consortium 2
Management Committee: Peter Donnelly 180,217, Ines Barroso 218, Jenefer M
Blackwell 219,220, Elvira Bramon 196, Matthew A Brown 221, Juan P Casas 222,223,
Aiden Corvin 5, Panos Deloukas 218, Audrey Duncanson 224, Janusz Jankowski 225,
Hugh S Markus 226, Christopher G Mathew 227, Colin N A Palmer 228, Robert Plomin 9,
Anna Rautanen 180, Stephen J Sawcer 229, Richard C Trembath 227, Ananth C
Viswanathan 230,231, Nicholas W Wood 232.
Data and Analysis Group: Chris C A Spencer 180, Gavin Band 180, Céline Bellenguez 180,
Peter Donnelly 180,217, Colin Freeman 180, Eleni Giannoulatou 180, Garrett Hellenthal
180, Richard Pearson 180, Matti Pirinen 180, Amy Strange 180, Zhan Su 180, Damjan
Vukcevic 180.
DNA, Genotyping, Data QC, and Informatics: Cordelia Langford 218, Ines Barroso 218,
Hannah Blackburn 218, Suzannah J Bumpstead 218, Panos Deloukas 218, Serge Dronov
218, Sarah Edkins 218, Matthew Gillman 218, Emma Gray 218, Rhian Gwilliam 218,
Naomi Hammond 218, Sarah E Hunt 218, Alagurevathi Jayakumar 218, Jennifer Liddle
218, Owen T McCann 218, Simon C Potter 218, Radhi Ravindrarajah 218, Michelle
Ricketts 218, Avazeh Tashakkori-Ghanbaria 218, Matthew Waller 218, Paul Weston 218,
Pamela Whittaker 218, Sara Widaa 218.
Publications Committee: Christopher G Mathew 227, Jenefer M Blackwell 219,220,
Matthew A Brown 221, Aiden Corvin 5, Mark I McCarthy 233, Chris C A Spencer 180.
Psychosis Endophenotype International Consortium
Maria J Arranz 156,234, Steven Bakker 101, Stephan Bender 235,236, Elvira Bramon
156,237,238, David A Collier 8,9, Benedicto Crespo-Facorro 239,240, Jeremy Hall 134,
Conrad Iyegbe 156, Assen V Jablensky 241, René S Kahn 101, Luba Kalaydjieva 102,242,
Stephen Lawrie 134, Cathryn M Lewis 156, Kuang Lin 156, Don H Linszen 243, Ignacio
Mata 239,240, Andrew M McIntosh 134, Robin M Murray 142, Roel A Ophoff 80, Jim
Van Os 143,156, John Powell 156, Dan Rujescu 81,83, Muriel Walshe 156, Matthias
Weisbrod 236, Durk Wiersma 244.
217 Department of Statistics, University of Oxford, Oxford, UK.
218 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
219 Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Cambridge, UK.
220 Telethon Institute for Child Health Research, Centre for Child Health Research, University of Western Australia, Subiaco, Western Australia, Australia.
221 Diamantina Institute of Cancer, Immunology and Metabolic Medicine, Princess Alexandra Hospital, University of Queensland, Brisbane, Queensland, Australia.
222 Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK.
223 Department of Epidemiology and Public Health, University College London, London, UK.
224 Molecular and Physiological Sciences, The Wellcome Trust, London, UK.
225 Peninsula School of Medicine and Dentistry, Plymouth University, Plymouth, UK.
226 Clinical Neurosciences, St George's University of London, London, UK.
227 Department of Medical and Molecular Genetics, School of Medicine, King's College London, Guy's Hospital, London, UK.
228 Biomedical Research Centre, Ninewells Hospital and Medical School, Dundee, UK.
229 Department of Clinical Neurosciences, University of Cambridge, Addenbrooke's Hospital, Cambridge, UK.
230 Institute of Ophthalmology, University College London, London, UK.
231 National Institute for Health Research, Biomedical Research Centre at Moorfields Eye Hospital, National Health Service Foundation Trust, London, UK.
232 Department of Molecular Neuroscience, Institute of Neurology, London, UK.
233 Oxford Centre for Diabetes, Endocrinology and Metabolism, Churchill Hospital, Oxford, UK.
234 Fundació de Docència i Recerca Mútua de Terrassa, Universitat de Barcelona, Spain.
235 Child and Adolescent Psychiatry, University of Technology Dresden, Dresden, Germany.
236 Section for Experimental Psychopathology, General Psychiatry, Heidelberg, Germany.
237 Institute of Cognitive Neuroscience, University College London, London, UK.
238 Mental Health Sciences Unit, University College London, London, UK.
239 Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain.
240 University Hospital Marqués de Valdecilla, Instituto de Formación e Investigación Marqués de Valdecilla, University of Cantabria, Santander, Spain.
241 Centre for Clinical Research in Neuropsychiatry, The University of Western Australia, Perth, Western Australia, Australia.
242 Western Australian Institute for Medical Research, The University of Western Australia, Perth, Western Australia, Australia.
243 Department of Psychiatry, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands.
244 Department of Psychiatry, University Medical Center Groningen, University of Groningen, The Netherlands.
Acknowledgements
Data processing and statistical analyses were carried out on the Genetic Cluster
Computer (http://www.geneticcluster.org) hosted by SURFsara and financially
supported by the Netherlands Scientific Organization (NWO 480-05-003) along with a
supplement from the Dutch Brain Foundation and the VU University Amsterdam. The
GRAS data collection was supported by the Max Planck Society, the
Max-Planck-Förderstiftung, and the DFG Center for Nanoscale Microscopy & Molecular Physiology of
the Brain (CNMPB), Göttingen, Germany. The Boston CIDAR subject and data collection
was supported by the National Institute of Mental Health (1P50MH080272, RWM;
U01MH081928, LJS; 1R01MH092380, TLP) and the Massachusetts General Hospital
Executive Committee on Research (TLP). ISC–Portugal: CNP and MTP are or have been
supported by grants from the NIMH (MH085548, MH085542, MH071681, MH061884,
MH58693, and MH52618) and the NCRR (RR026075). CNP, MTP, and AHF are or have
been supported by grants from the Department of Veterans Affairs Merit Review
Program. The Danish Aarhus study was supported by grants from The Lundbeck
Foundation, The Danish Strategic Research Council, Aarhus University, and The Stanley
Research Foundation. Work in Cardiff was supported by MRC Centre (G0800509) and
MRC Programme (G0801418) Grants, the European Community's Seventh Framework
Programme (HEALTH-F2-2010-241909 (Project EU-GEI)), the European Union Seventh
Framework Programme (FP7/2007-2013) under grant agreement n° 279227, a
fellowship to JW from the MRC/Welsh Assembly Government and the Margaret Temple
Award from the British Medical Association. We thank Novartis for their input in
obtaining CLOZUK samples, and staff at The Doctor's Laboratory (Lisa Levett/Andrew
Levett) for help with sample acquisition and data linkage and in Cardiff (Kiran
Mantripragada/Lucinda Hopkins) for sample management. CLOZUK and some other
samples were genotyped at the Broad Institute (which has a separate acknowledgment)
or by the WTCCC and WTCCC2 (WT 083948/Z/07/Z). We acknowledge use of the British
1958 Birth Cohort DNA (MRC: G0000934) and the Wellcome Trust (068545/Z/0/ and
076113/C/04/Z), the UK Blood Services Common Controls (UKBS-CC collection), funded
by the WT (076113/C/04/Z) and by NIHR programme grant to NHSBT
(RP-PG-0310-1002). Virginia Commonwealth University: BPR and KSK thank all the faculty of the
Virginia Institute for Psychiatric and Behavioral Genetics for invaluable insights and
discussions over many years. BSM, SAB, BTW, BW, KSK and BPR were supported by
National Institute of Mental Health grant R01 MH083094 to BPR. Sample collection was
supported by previous funding of National Institute of Mental Health grant R01
MH041953 to KSK and BPR. Genotyping was supported by National Institute of Mental
Health grant R01 MH083094 to BPR, National Institute of Mental Health grant R01
MH068881 to BPR and Wellcome Trust Case Control Consortium 2 grant. The recruitment of families in
Bulgaria was funded by the Janssen Research Foundation, Beerse, Belgium. We are
grateful to the study volunteers for participating in the Janssen research studies and to
the clinicians and support staff for enabling patient recruitment and blood sample
collection. Informed consent was obtained from all participants or their parents or
guardians. We thank the staff in the Neuroscience Biomarkers Genomic Lab led by
Reyna Favis at Janssen for sample processing and the staff at Illumina for genotyping
Janssen DNA samples. We also thank Anthony Santos, Nicole Bottrel, Monique-Andree
Franc, and William Cafferty (of Janssen Research & Development) for operational support.
Funding from the Netherlands Organization for Health Research and Development
(ZonMw), within the Mental Health program (to GROUP consortium for collecting
patients and clinical data). High-Density Genome-Wide Association Study Of
Schizophrenia In Large Dutch Sample (R01 MH078075 NIH/National Institute Of Mental
Health PI: Roel A. Ophoff). The Danish Council for Strategic Research (Journ.nr.
09-067048); The Danish National Advanced Technology Foundation (Journ.nr. 001-2009-2);
The Lundbeck Foundation (Journ.nr. R24-A3243); EU 7th Framework Programme
(PsychGene; Grant agreement nr. 218251); EU 7th Framework Programme (PsychDPC;
Grant agreement nr. 286213). The Wellcome Trust supported this study as part of the
Wellcome Trust Case Control Consortium 2 project. E. Bramon holds a MRC New
Investigator Award and a MRC Centenary Award. The TOP Study was supported by the
Research Council of Norway (#213837, #217776, #223273), South-East Norway Health
Authority (#2013-123) and K.G. Jebsen Foundation. This work was supported by the
Donald and Barbara Zucker Foundation, the North Shore–Long Island Jewish Health
System Foundation, and grants from the Stanley Foundation (AKM), the National
Alliance for Research on Schizophrenia and Depression (AKM), and the NIH (MH065580
to TL; MH001760 to AKM). SynSys, EU FP7-242167, Sigrid Juselius Foundation, The
Academy of Finland, grant number: 251704, Sohlberg Foundation. The Swedish
Research Council [grant numbers 2006-4472, 2009-5269, 2009-3413] and the County
Councils of Västerbotten and Norrbotten, Sweden supported the collection of the
scz_umeb_eur and scz_umes_eur samples. The Betula Study, from which the Umea
controls were recruited, is supported by grants from the Swedish Research Council
[grant numbers 345-2003-3883, 315-2004-6977] and the Bank of Sweden Tercentenary
Foundation, the Swedish Council for Planning and Coordination of Research, the
Swedish Council for Research in the Humanities and Social Sciences and the Swedish
Council for Social Research. The GRAS (Göttingen Research Association for
Schizophrenia) data collection has been supported by the Max Planck Society, the Max
Planck Förderstiftung, and the DFG (CNMPB). We thank all GRAS patients for
participating in the study, and all the many colleagues who have contributed over the
past 10 years to the GRAS data collection. We acknowledge support from the North
Shore–LIJ Health System Foundation and NIH grants RC2 MH089964 and R01
MH084098. We acknowledge support from NIMH K01 MH085812 (PI Keller) and NIMH
R01 MH100141 (PI Keller). EGCUT work was supported by the Targeted Financing from
the Estonian Ministry of Science and Education [SF0180142s08]; the US National
Institute of Health [R01DK075787]; the Development Fund of the University of Tartu
(grant SP1GVARENG); the European Regional Development Fund to the Centre of
Excellence in Genomics (EXCEGEN; grant 3.2.0304.11-0312); and through FP7 grant
313010. Milan Macek was supported by CZ.2.16/3.1.00/24022OPPK, NT/13770–4 and
00064203 FN Motol. For the scz_tcr1_asn dataset funding from the National Medical
Research Council (Grant: NMRC/TCR/003/2008) and the Biomedical Research Council,
A*STAR is acknowledged. Genotyping of the Swedish Hubin sample was performed by
the SNP&SEQ Technology Platform in Uppsala, which is supported by Uppsala
University, Uppsala University Hospital, Science for Life Laboratory-Uppsala and the
Swedish Research Council (Contracts 80576801 and 70374401). The Swedish Hubin
sample was supported by Swedish Research Council (IA, EGJ) and the regional
agreement on medical training and clinical research between Stockholm County Council
and the Karolinska Institutet (EGJ). B.J.M., V.J.C., R.J.S., S.V.C., F.A.H., A.V.J., C.M.L.,
P.T.M., C.P., and U.S. were supported by the Australian Schizophrenia Research Bank,
which is supported by an Enabling Grant from the National Health and Medical Research
Council (Australia) [No. 386500], the Pratt Foundation, Ramsay Health Care, the Viertel
Charitable Foundation and the Schizophrenia Research Institute and the NSW
Department of Health. C.P. is supported by a Senior Principal Research Fellowship from
the National Health and Medical Research Council (Australia). We acknowledge the help
of: Johanna Badcock, Linda Bradbury, Jason Bridge, David Chandler, Janell
Collins-Langworthy, Trish Collinson, Milan Dragovic, Cheryl Filippich, David Hawkes, Danielle
Lowe, Kathryn McCabe, Tamara MacDonald, Barry Maher, Bharti Morar, Marc Seal,
Heather Smith, Melissa Tooney, Paul Tooney, and Melinda Ziino. The Perth
sample collection was funded by Australian National Health and Medical Research
Council project grants and the Australian Schizophrenia Research Bank. The
Bonn/Mannheim sample was genotyped within a study that was supported by the
German Federal Ministry of Education and Research (BMBF) through the Integrated
Genome Research Network (IG) MooDS (Systematic Investigation of the Molecular
Causes of Major Mood Disorders and Schizophrenia; grant 01GS08144 to M.M.N. and
S.C., grant 01GS08147 to M.R.), under the auspices of the National Genome Research
Network plus (NGFNplus), and through the Integrated Network IntegraMent (Integrated
Understanding of Causes and Mechanisms in Mental Disorders), under the auspices of
the e:Med Programme. (GSK control sample; Müller-Myhsok). This work has been
funded by the Bavarian Ministry of Commerce and by the Federal Ministry of Education
and Research in the framework of the National Genome Research Network,
Förderkennzeichen 01GS0481 and the Bavarian Ministry of Commerce. M.M.N. is a
member of the DFG-funded Excellence-Cluster ImmunoSensation. M.M.N. also received
support from the Alfried Krupp von Bohlen und Halbach-Stiftung. M.R. was also
supported by the 7th Framework Programme of the European Union (ADAMS project,
HEALTH-F4-2009-242257; CRESTAR project, HEALTH-2011-1.1-2) grant 279227. Roche:
Thanks are expressed to Olivia Spleiss for great support in genetic data generation,
Daniel Umbricht and Delphine Lagarde for their valuable support in clinical and genetic
data sharing, and Anirvan Ghosh for continuous encouragement. Authors also wish to
thank all investigators and patients who participated in the Roche clinical studies. Jo
Knight holds the Joanne Murphy Professor in Behavioural Science. We thank Maria
Tampakeras for her work on the samples. The Stanley Center for Psychiatric Research at
the Broad Institute acknowledges funding from the Stanley Medical Research Institute.
Swedish schizophrenia study (PICMM, PFS, PS, SM): We are deeply grateful for the
participation of all subjects contributing to this research and to the collection team that
worked to recruit them: E. Flordal-Thelander, A.-B. Holmgren, M. Hallin, M. Lundin, A.-K.
Sundberg, C. Pettersson, R. Satgunanthan-Dawoud, S. Hassellund, M. Rådstrom, B.
Ohlander, L. Nyrén and I. Kizling. Funding support for the Sweden Schizophrenia Study
(PIs Hultman, Sullivan, and Sklar) was provided by the NIMH (R01 MH077139 to P.F.S.
and R01 MH095034 to P.S.), the Stanley Center for Psychiatric Research, the Sylvan
Herman Foundation, the Friedman Brain Institute at the Mount Sinai School of
Medicine, the Karolinska Institutet, Karolinska University Hospital, the Swedish Research
Council, the Swedish County Council, the Söderström Königska Foundation. We
acknowledge use of DNA from The UK Blood Services collection of Common Controls
(UKBS collection), funded by the Wellcome Trust grant 076113/C/04/Z, by the Juvenile
Diabetes Research Foundation grant WT0618S8, and by the National Institute of Health
Research of England. The collection was established as part of the Wellcome Trust
Case-Control Consortium. We thank the study participants, and the research staff at the study
sites. This study was supported by NIMH grant R01 MH062276 (to DF Levinson, C
Laurent, M Owen and D Wildenauer), grant R01 MH068922 (to PV Gejman), grant
R01 MH068921 (to AE Pulver) and grant R01 MH068881 (to B Riley). The authors are
grateful to the many family members who participated in the studies that recruited
these samples, to the many clinicians who assisted in their recruitment. In addition to
the support acknowledged for the Multicenter Genetics Studies of Schizophrenia and
Molecular Genetics of Schizophrenia studies, Dr. DF Levinson received additional
support from the Walter E. Nichols, M.D., Professorship in the School of Medicine, the
Eleanor Nichols Endowment, the Walter F. & Rachael L. Nichols Endowment and the
William and Mary McIvor Endowment, Stanford University. This study was supported by
NIH R01 grants (MH67257 to N.G.B., MH59588 to B.J.M., MH59571 to P.V.G., MH59565
to R.F., MH59587 to F.A., MH60870 to W.F.B., MH59566 to D.W.B., MH59586 to J.M.S.,
MH61675 to D.F.L., MH60879 to C.R.C., and MH81800 to P.V.G.), NIH U01 grants
(MH46276 to C.R.C., MH46289 to C. Kaufmann, MH46318 to M.T. Tsuang, MH79469 to
P.V.G., and MH79470 to D.F.L.), the Genetic Association Information Network (GAIN),
and by The Paul Michael Donovan Charitable Foundation. Genotyping was carried out by
the Center for Genotyping and Analysis at the Broad Institute of Harvard and MIT (S.
Gabriel and D.B. Mirel), which is supported by grant U54 RR020278 from the National
Center for Research Resources. Genotyping of half of the EA sample and almost all the
AA sample was carried out with support from GAIN. The GAIN quality control team (G.R.
Abecasis and J. Paschall) made important contributions to the project. We thank S.
Purcell for assistance with PLINK. We (DRW, RS) thank the staff of the Lieber Institute
and the Clinical Brain Disorders Branch of the IRP, NIMH for their assistance in data
collection and management. We thank Ningping Feng and Bhaskar Kolachana for
Illumina genotyping and for managing DNA stocks. The work was supported by the
Lieber Institute and by direct NIMH IRP funding of the Weinberger Lab. Pfizer is very
grateful to the study volunteers for participating in our research studies. We thank our
numerous clinicians and support staff for enabling patient recruitment, blood sample
collection, and biospecimen administration. Informed consent was obtained from all
participants, their parents or guardians. Eli Lilly is grateful to the participants of clinical
trials and research studies who gave consent for participation in this study. We are also
grateful to Philip J Ebert and Jeffrey S Arnold for facilitating our participation in this
project. We acknowledge the Irish contribution to the International Schizophrenia
Consortium (ISC) study, the WTCCC2 SCZ study & WTCCC2 controls from the 1958BC and
UK NBS, the Science Foundation Ireland (08/IN.1/B1916). We thank the Toronto Centre
for Applied Genomics for technical and computational assistance and funding from the
University of Toronto McLaughlin Centre and Genome Canada. S.W.S. holds the
GlaxoSmithKline-CIHR Chair in Genome Sciences at the Hospital for Sick Children and
University of Toronto. We acknowledge use of the Trinity Biobank sample from the Irish
Blood Transfusion Service & the Trinity Centre for High Performance Computing.
Funding for this study was provided by the Wellcome Trust Case Control Consortium 2
project (085475/B/08/Z and 085475/Z/08/Z), the Wellcome Trust (072894/Z/03/Z,
090532/Z/09/Z and 075491/Z/04/B), NIMH grants (MH41953 and MH083094) and
British 1958 Birth Cohort DNA collection funded by the Medical Research Council (grant
G0000934) and the Wellcome Trust (grant 068545/Z/02) and of the UK National Blood
Service controls funded by the Wellcome Trust. We acknowledge Hong Kong Research
Grants Council project grants GRF 774707M, 777511M, 776412M and 776513M.
Supplementary data references
1. Desachy, G. et al. Increased female autosomal burden of rare copy number variants in human populations and in autism families. Mol Psychiatry 20, 170-5 (2015).
2. Zuberi, K. et al. GeneMANIA prediction server 2013 update. Nucleic Acids Res 41, W115-22 (2013).
3. Fromer, M. et al. De novo mutations in schizophrenia implicate synaptic networks. Nature 506, 179-84 (2014).
4. Witte, J.S., Visscher, P.M. & Wray, N.R. The contribution of genetic variants to disease depends on the ruler. Nat Rev Genet 15, 765-76 (2014).
5. Sullivan, P.F., Kendler, K.S. & Neale, M.C. Schizophrenia as a complex trait: evidence from a meta-analysis of twin studies. Arch Gen Psychiatry 60, 1187-92 (2003).