DisGeNET Cytoscape App · 2020. 5. 18. · Then, go again to Apps in the Cytoscape menu, and click...
Transcript of DisGeNET Cytoscape App · 2020. 5. 18. · Then, go again to Apps in the Cytoscape menu, and click...
DisGeNETCytoscapeApp
IBI Lab [2021]
2
USERGUIDE©2010-2021,IntegrativeBiomedicalInformaticsIntegrativeBiomedicalInformaticsLaboratoryResearchGrouponBiomedicalInformatics(GRIB)IMIM-HospitaldelMarMedicalResearchInstituteUniversitatPompeuFabra(UPF)PRBBDr.Aiguader88,E-08003Barcelona(SPAIN)
3
TheDisGeNETappisdistributedundertheGNUGPL3.0license.MoredetailsabouttheGNUGeneralPublicLicense3.0isavailablehere.
4
DisGeNETCytoscapeAppUserGuide(version7.0.x)
1. Installationguide.........................................................................................................................5a. DownloadandinstalltheDisGeNETCytoscapeApp...............................................5
2. BriefdescriptionofDisGeNET...............................................................................................8a. Originaldatasources.............................................................................................................8b. RepresentingDisGeNETdatausingnetworks...........................................................8c. Vocabularymapping..............................................................................................................8d. DisGeNETgene-diseaseassociationtypeontology.................................................9
3. Tutorial..........................................................................................................................................10a. Basicfunctions......................................................................................................................10i. Generategene-diseasenetworks.............................................................................10ii. Generatevariant-diseasenetworks........................................................................11iii. CreatenetworksbyDisGeNETassociationtype..........................................12iv. Createnetworksbydiseaseclass............................................................................12v. Createnetworksbygene,disease,orvariant.....................................................14vi. MultipleentitysearchintheDisGeNETApp.......................................................14
b. Advancedfunctions............................................................................................................19i. Colouringnodesbydiseaseclass.............................................................................19ii. DisGeNETexpand...........................................................................................................20iii. DisGeNETLinkouts...................................................................................................22iv. AnnotatingforeignnetworkswithDisGeNETdata..........................................23
c. DisGeNETautomation.......................................................................................................29i. UsingtheDisGeNETautomationinR.....................................................................31ii. UsingtheDisGeNETautomationinpython.........................................................32
4. TheSQLitedatabase................................................................................................................335. ThenodeandedgetablesintheCytoscapeTablePanel.........................................346. Annexes.........................................................................................................................................367. Citation..........................................................................................................................................418. References...................................................................................................................................429. Contact...........................................................................................................................................4310. Funding....................................................................................................................................4411. License......................................................................................................................................4512. Aboutthisdocument..........................................................................................................45
5
1. Installationguide
a. DownloadandinstalltheDisGeNETCytoscapeApp
Equipment:apersonalcomputerwithInternetaccessandanInternetbrowser.OperatingSystem:TheDisGeNETAppandCytoscapearesupportedonWindows(Windows7,Windows8,Windows10),MacandLinux.Java standard edition: Version 1.8 or higher is required (available athttp://www.java.com/).CytoscapeversionDisGeNETiscompatiblewiththeCytoscape3.xversions.WerecommendCytoscapeversions(3.6.x)orlater.Thestepsfordownloadingandinstalling the latest version of Cytoscape are described athttp://www.cytoscape.org/.Appversion:7.xTheDisGeNETappneedstobeinstalledfromtheCytoscapeAppStore.● GotoAppsintheCytoscapemenu● ClickonAppManager● typedisgenetonthesearchbox● clickontheresult,andthenclickinstall(Figure1)●
Figure1:InstallingtheDisGeNETappfromtheCytoscapeAppStore
6
Then, go again to Apps in the Cytoscapemenu, and click on DisGeNET-> StartDisGeNET(Figure2).
Figure2:StartingtheDisGeNETapp
ThefirsttimethattheDisGeNETappisstarted,itwillaskforthedirectoryofthedatabasefile.Ifthedatabasefiledoesnotexistinthedirectory,itwillproceedtoautomatically download it. Choose a directory where the database will bedownloadedandunpacked(disgenet_2020.db~1.3Gb).Forexample,Downloads(Figure3).Foradetaileddescriptionofthedatabase,seesection4.
Figure3:ConfigureDisGeNETappforthefirstrun
Amessageof“Databasefiledoesnotexist.Goingtodownloadit”willappear(Figure4).
Figure4:ConfigureDisGeNETdatabaseforthefirstrun
7
Thedownloadmighttakeseveralminutes.Please,bepatient.Whenit finishes,anew message will appear “Database downloaded correctly.” The app will startafterwards.Thisalsotakesacoupleofminutes.TheappisreadytobeusedwhentheDisGeNETControlPanelappears(Figure5).
Figure5:TheDisGeNETappcontrolpanel
8
2. BriefdescriptionofDisGeNETDisGeNETisadiscoveryplatformthatintegrateshumangeneandvariant-diseaseassociations fromvarious expert curateddatabases and the scientific literature,and includes Mendelian, rare, complex and environmental diseases, as well asabnormalphenotypesandtraits(1–3).
a. OriginaldatasourcesInDisGeNET,theGDAsaregroupedaccordingtotheirtypeandlevelofcuration:CURATED(containinggene-diseaseassociationsfromhumanexpertcurateddatasources), ANIMAL MODELS (containing gene-disease associations from animalmodelrepositories), INFERRED(containinggene-diseaseassociationsfromHPO,GWASDB and GWASCAT), and ALL (including CURATED, ANIMAL MODELS,INFERRED,anddataderivedfromtextminingthebiomedicalliterature).DisGeNETVDAs are grouped according to their type and level of curation: CURATED(containingvariant-diseaseassociationsfromhumanexpertcurateddatasources),and ALL (including CURATED data and data derived from text mining thebiomedical literature). For the up-to-date list and description of data sourcesavailable inDisGeNET,pleasevisit theDisGeNETDiscoveryPlatformWebsiteathttp://disgenet.org/dbinfo,section“OriginalDataSources”.
b. RepresentingDisGeNETdatausingnetworks
Gene-disease associations (GDAs), and variant-disease associations (VDAs) arecollectedfromseveralsources.Thesourcedatabasesusedifferentvocabularies.InordertomergeallGDAandVDAsandtopresenttheminonecomprehensivegene-disease,orvariant-diseasenetwork,we(i)mappedgeneidentifierstoNCBIEntrezGeneidentifiersifnecessary,(ii)mappeddiseasevocabularytermstotheUnifiedMedicalLanguageSystem® (UMLS®)ConceptUnique Identifiers (CUIs), and (iii)integratedassociationsthroughtheDisGeNETgene-diseaseassociationontology(seesection2.4).The data contained in DisGeNET is represented using bipartite graphs. TheDisGeNETbipartitegraphhastwotypesofvertices(genesorvariantsanddiseases)andtheedgesconnecttheverticesofdifferenttypes(e.g.agenewithadisease).Thesebipartitegraphsaremultigraphsinwhichtwoverticesmightbeconnectedbymore than one edge. Thesemultiple edges represent themultiple evidencesreportingtheGDAorVDA.
c. VocabularymappingFortheup-to-datedescriptionofthediseaseandgenevocabularymappingsused
9
in DisGeNET please visit the DisGeNET Discovery Platform Website at:http://disgenet.org/dbinfo,section“DataAttributes”.
d. DisGeNETgene-diseaseassociationtypeontology
We have developed the DisGeNET gene-disease association type ontology torepresent inauniformandstructuredwaythetypesofrelationsbetweengenesanddiseases found intheoriginaldatasources(Figure6).Forthedetailsof theontologyusedtodescribegene-diseaseassociationsinDisGeNETpleasevisittheDisGeNET Discovery Platform Website at: http://disgenet.org/dbinfo, sectionDisGeNETgene-diseaseassociationtypeontology.
Figure6:DisGeNETgene-diseaseassociationtypeontology
10
3. TutorialTheDisGeNETCytoscapeappisdesignedtovisualize,queryandanalyseanetworkrepresentationofthegene-diseaseandthevariant-diseaseassociationscontainedinDisGeNET.Theappusesthecurrentversion(v7.0)oftheDisGeNETthatcontains1,134,942 gene-disease associations (GDAs), between 21,671 genes and 30,170diseases, disorders, traits, and clinical or abnormal human phenotypes, and369,554 variant-disease associations (VDAs), between 194,515 variants and14,155diseases,traits,andphenotypes.TheDisGeNETapp(versions5.1orlater)includestheDisGeNETCytoscapestyleby default. Additionally, the style can be downloaded fromhttp://disgenet.org/app,section“Additionalfiles”.
a. BasicfunctionsTheDisGeNETAppControlPanel(Figure5)allowsadjustingtheparametersofthequeriesinordertocreatedifferenttypesofnetworks.ThepanelcontainstwotabstogeneratetheGeneDiseaseNetwork,andtheVariantDiseaseNetwork.TheGeneDiseaseNetwork tab isdisplayedbydefault. In this tab, differentGDAnetworkscanbegeneratedbyselectingdifferentdataSources,AssociationTypesand/orDiseaseClassesfromtheirrespectivedrop-downmenus.TheGDAnetworksmaybealsofilteredusingacut-offvalueoftheDisGeNETscore,and/orEvidenceIndex(EI)andEvidenceLevel(EL).Inaddition,GDAnetworkscanbebuiltaroundspecificdisease(s)orgene(s)of interestusing theSearch boxesprovided in thepanel.Using theVariant-DiseaseNetwork,differentVDAnetworkscanbegeneratedbyselectingdifferentdataSources,AssociationTypesand/orDiseaseClassesfromtheirrespectivedrop-downmenus.TheVDAnetworksmaybealsofilteredusingacut-off value of the DisGeNET score, and/or Evidence Index (EI). In addition, VDAnetworkscanbebuiltaroundspecificdisease(s),gene(s),orvariant(s)ofinterestusingtheSearchboxesprovidedinthepanel.
i. Generategene-diseasenetworksIn order to obtain aGDAnetwork containingdata fromone specific source, forexample, CURATED data, which includes information from all expert curateddatabasesinourdatabase(CGI,ClinGen,GenomicsEngland,UniProt,CTD_human,PsyGeNET,andOrphanet),selecttheSourceofinterest(CURATED),andpressthebuttonCreateNetwork.TheGDAnetworkcontains20,884nodesand151,277edges.ApplyaCytoscapelayoutalgorithmtogeneratetheviewofchoice,e.g.selectthelayoutOrganic.Formore information on the layout styles, seehttp://apps.cytoscape.org/apps/yfileslayoutalgorithms. Once the network isobtained,specificinformationonthenodesandtheirrelationshipscanbeexplored
11
usingtheCytoscapeTablePanel(atthebottom,right)(Figure7).
Figure7:TheCURATEDGDAnetwork
ii. Generatevariant-diseasenetworksToobtainaVDAnetworkcontainingdatafromonespecificsource,forexample,theGeneticAssociationDatabase,selecttheVariantDiseaseNetworktab,thesourceofinterest(UNIPROT),andpressthebuttonCreateNetwork.TheresultsareshowninFigure8.Thenetworkiscomposedof26,145nodesand182,304edges.
Figure8:TheUNIPROTVDAnetwork
12
iii. CreatenetworksbyDisGeNETassociationtypeThe DisGeNET App allows searching by different categories of gene-diseaseassociationtypes,asdescribedbytheDisGeNETassociationtypeontology(Figure6).TocreateaGDAnetworkfromCURATEDdatarestrictedtoassociationtype“CausalMutation”,selecttheSource,(forinstanceCURATED).ChooseCausalMutation,fromtheAssociationTypedropdownmenu,andpressCreateNetwork.TheGDAnetworkobtainedcontains5876nodesand6560edges(Figure9).Could you find the gene in the network carrying CausalMutations for the largestnumberofdiseases?Hint:ordergenesbythecolumnnrAssociatedDiseases
Figure9:TheCURATEDGDAnetworkforCausalMutations
iv. CreatenetworksbydiseaseclassNetworkscanalsobecreatedbyrestrictingtoaspecifictypeofMeSHdiseaseclass.ThediseaseclassificationisbasedontheDiseasesbranch(C)andthreecategories(F01, F02, and F03) of the Psychiatry and Psychology Branch (F) of the MeSHhierarchy.TogenerateanetworkofANIMALMODELSdata,containingonlyNutritionalandMetabolicDiseases,SelecttheSource(ANIMAL_MODELS),andchoosetheDiseaseClass(NutritionalandMetabolicDiseases)fromtheDiseaseClassdropdownmenu.Then,press thebuttonCreateNetwork (Figure10).ThisGDAnetworkhas1510nodesand3087edges.Which is thediseasewith the largestnumberofassociatedgenes in thisnetwork?(Hint:usethenetworkanalyzerutilityfromCytoscapeToolsMainMenu)
13
Figure10:TheGDAnetworkforNutritionalandMetabolicDiseasesinanimalmodelsofdisease
14
v. Createnetworksbygene,disease,orvariantThe Search option included in the DisGeNET control panel can also be used togenerate different types of networks for a single disease, gene, or variant. ThissearchmayalsobefilteredbySource,AssociationType,DiseaseClass,andScore.Forexample,thefigure15showstheresultsforthesearchbyasinglegene,themethyl-CpGbindingprotein2gene(MECP),filteringbydiseaseclassMentalDisordersinthecurateddatainDisGeNET(Figure11).
Figure11:GDAnetworkcontainingthemetabolicdiseasesassociatedwiththemethyl-CpGbindingprotein2
gene(MECP2)inDisGeNETCURATED.
vi. MultipleentitysearchintheDisGeNETAppTheSearchoptioncanalsobeusedto:● Generateanetworkaroundagroupofdiseasesorgenes,matchingakeyword● Generate a network for a list of genes, diseases or variants, and their
combinations
a. SearchbyadiseasematchingakeywordTobuildaGDAnetworkcontainingDisGeNETdatafromCTD(humandata)forallthetypesofAlzheimerDiseaseinthisdatabase,selecttheSource,CTD_humanandwrite in the disease Search box ‘*alzheimer*’ to create a network based on thiskeywordinthediseasename.Then,pressCreateNetwork(Figure12).HowmanydifferentsubtypesofAlzheimer’sDiseaseareincludedinDisGeNET(CTDhumandata)?HowmanygenesareassociatedwithAlzheimer'sDisease(C0002395)?
15
Figure12:GDAnetworkcontainingallsubtypesofAlzheimerDiseaseinCTD,humandata
Thisnetworkcanbefurtherfilteredbygene.Forexample,togeneratethenetworkofalltheAlzheimersubtypes,andtheamyloidbetaprecursorprotein(APP)gene,typeAPPintheGeneSearchBox,andpressCreateNetwork(Figure13).AnetworkwithallsubtypesofAlzheimerassociatedtotheAPPgeneiscreated.Each edge in the GDANetwork represents the supporting evidence for a gene-diseaseassociationuniquelydefinedbythesource,oneassociationtype,andonepublication. The colour of each edge distinguishes the association type.Use theEdgeTableintheTablePaneltoexploretheevidenceforeachassociation.WhatisthescoreoftheAPP-Alzheimer’sDiseaseassociation?
Figure13:TheGDAnetworkfortheamyloidbetaprecursorprotein(APP)andallsubtypesofAlzheimerDisease
16
b. SearchbyalistofgenesTobuildaGDAnetworkassociatedwithalistofgenes,enterintheGeneSearchBoxthelistofgenesseparatedby“;”andpressCreateNetwork.Figure14showstheresults of querying DisGeNET CURATED data for a list of potassium channels:KCNE1;KCNE2;KCNH2;KCNG1.Howmanygenesappearinthenetwork?
Figure14:TheCURATEDGDAnetworkforthegenesKCNE1;KCNE2;KCNH2
c. Searchbyalistofvariants
AsimilarprocedurecanbefollowedtoqueryDisGeNETforthediseasesassociatedwith a list of variants. Go to the Variant Disease Network tab, select source“CURATED”andtypethevariantsseparatedby“;”intheSearchVariantandthenpressCreateNetwork.Figure15showstheresultsofqueryingthefollowinglistofvariants:rs121907927;rs373661718;rs121907927;rs780356070;rs200015827;rs121907928;rs757259413;rs121907919;rs121907928
Figure15:TheCURATEDVDAnetworkforvariants
rs121907927;rs373661718;rs121907927;rs780356070;rs200015827;rs121907928;rs757259413;rs121907919;rs121907928
Noticethatinthisnetwork,thegenesassociatedwiththevariantscanbedisplayed
17
by clickingon thebox “Showassociatedgenes” in theDisGeNET control panel(Figure16).Applytheradiallayouttoobtainasimilarvisualization.
Figure16:TheCURATEDVDAnetworkforvariants
rs121907927;rs373661718;rs121907927;rs780356070;rs200015827;rs121907928;rs757259413;rs121907919;rs121907928andtheirassociatedgenes
d. Searchbyalistofdiseases
Go to theGeneDiseaseNetwork taband in the SearchDiseasebox,enter the listfollowinglistofdiseases:C0268337;C0268342;C0268336;C0268335;C0268338;C0013720.Filterbyscore>0.3,andthenpressCreateNetwork(Figure17).
Figure17:TheGDAfordiseasesC0268337;C0268342;C0268336;C0268335;C0268338;C0013720withscore>=0.4
TheinputtermsintheSearchDiseaseboxcanbecombined,usingUMLSCUIs,diseasenames,oraregularexpression.Forexample,thefollowinglistC0002395;Schizophrenia;*Parkinson*,willretrievetheGDAnetworkforAlzheimer’sDisease,Schizophrenia,andallsubtypesofParkinsondiseaseinDisGeNET.SeetheresultsofthesearchinDisGeNETCURATEDdatainFigure18.
18
Figure18:TheCURATEDGDAnetworkfordiseases:C0002395;Schizophrenia;*Parkinson*withscore>=0.5
NoticethattheappreturnsanetworkifatleastoneoftheentitiesinthequeryisincludedinDisGeNETdata.
19
b. Advancedfunctionsi. Colouringnodesbydiseaseclass
TheDisGeNETappallowscolouringthenodesofanetworkaccordingtotheMeSHdiseaseclassification.Inthecaseofdiseases,thecolouringisbasedonthediseaseclassannotationofthediseasenode.Diseasesbelongingtomorethanonediseaseclasswillappearwithmorethanonecolour.Inthecaseofgenesandvariants,theMeSHdiseaseclassisassignedfromthedisease(s)associatedtothem.Forexample,intheGeneDiseaseNetworktab,searchforINSR;INS;LEP;LEPRinthegenesearchbox,usingthesourceANIMALMODELS,AnyassociationtypeandAnyDiseaseClass,andscore>=0.Press“CreateNetwork”(Figure19).
Figure19:GDAnetworkforINSR;INS;LEP;LEPRinanimalmodels(score≥0).
Oncethenetworkiscreated,clickonColournodeswithdiseaseclass,andallthenodesofthediseasesinthenetworkwillbecolouredaccordingtotheirMeSHdiseaseclasses,whilethegeneswillbecolouredaccordingtotheclassesoftheirassociateddiseases.ThecolourforeachdiseaseclassisdisplayedintheDiseaseClassLegendshownattherightside.Theborderofthenodeskeepstheoriginalcolourindicatingifitrepresentsadisease,ageneoravariant.NoticethatifsomediseasesdonothaveMeSHdiseaseclass,theywillbecolouredingray(Figure20).
20
Figure20:GDAnetworkforINSR;INS;LEP;LEPR,inanimalmodels,colouredaccordingtoMeSHdiseaseclasses.
ii. DisGeNETexpandTheDisGeNETexpandfunctionallowstoretrieveallthedataavailableforaspecificnodeinanetwork.Theexpandfunctionisparticularlyusefulwhenaninitialsearchonasingledatabase(e.g.UniProt)isperformed,andoncethenetworkisobtained,youwanttoknowifthereareotherassociationsinthewholeDisGeNETdatabase(namelyALLdataset).ThefunctioncanbeusedtocreateanewDisGeNETnetworkusingtheselectednode(s)forthequeryortoexpandtheexistingnetworkwithnodesandedgesfoundinDisGeNETALL.Currently,thefunctionoffersthefollowingoptions:
- OnaGDAnetwork,appliedtoagene,itwillretrievethediseasesassociatedwiththegenefromALLdata.
- OnaGDAnetwork,appliedtoadisease,itwillretrievethegenesassociatedwiththediseasefromALLdata.
- OnaVDAnetwork,appliedtoavariant,itwillretrievethediseasesassociatedwiththevariantfromALLdata.
- OnaVDAnetwork,appliedtoadisease,itwillretrievethevariantsassociatedwiththediseasefromALLdata.
- OnaVDAnetwork,appliedtoageneassociatedwithavariant,itwillcreateanewGDAnetworkwiththediseasesassociatedwiththegenefromALLdata.
ForanexampleonhowtoapplytheDisGeNETexpandfunctiontotheGDAnetworkforgeneAlzheimer’sDiseaseandAPOEfromCURATEDdata,seeFigure25.Right-clickonthenodeofinterest,inthiscasetheAPOEnode,andgoto“Apps->DisGeNET->Expand”.Youhavethentoselecteither“Expandcurrentnet”or“Createnewnet”.The“SetParameters”boxwillappeartoselectthedatabase(Figure21).
21
Figure21:Theexpandfunctionmenu,onthenodeviewcontextmenuforthegeneAPOEandAlzheimer’s
Disease.
The“SetParameters”boxallowsfilteringtheexpandednetworkbydatasource,byscore,andbyattributessuchasthediseaseclass,andtheassociationtype.Selectthedatabase“CURATED”,forexample,andDiseaseclassequalto“NervousSystemDiseases”.TheresultsareshowninFigure22.
Figure22:TheSetParametersBoxassociatedtotheexpandfunction
22
Figure23:TheexpandednetworkforgeneAPOE(CURATEDdata),anddiseaseclass“NervousSystemDiseases”,
displayedusingtheorganiclayout.
iii. DisGeNETLinkouts
1. LinkOutsforthenodesInordertogetmoreinformationaboutaspecificgene,diseaseorvariantfromaDisGeNETnetwork,youcanusetheLinkOutfunctiontothereferencedatabases(NCBIforGenes,LinkedLifeDataforUMLSCUIsanddbSNPforSNPs).TheLinkOutfunctionisavailableinthenodecontextmenu,whichcanbeaccessedbyright-clickingaselectednode(Figure24Figure23).Theavailablelinkoutsarespecifictothenodetype.
Figure24:TheLinkOutfunctionmenu,onthenodeviewcontextmenu,forthegeneAPOE.
23
2. LinkOutsfortheedgesTheusercanalsoexplorethepublicationsupportingtheGDA,orVDA,byclickingontheedgeconnectingthepairofinterest,andaccessingtheLinkOuttoNCBIPubmed(Figure25)
Figure25:TheLinkOutfunctionmenu,ontheedgeviewcontextmenu,fortheassociationbetweenAlzheimer’s
DiseaseandthegeneAPOE.
iv. AnnotatingforeignnetworkswithDisGeNETdata
TheDisGeNETCytoscapeAppalsoallowsannotatingnetworksgeneratedbyotherapplicationsoruploadedbytheuser.Toaccessthefunction,rightclickonanemptyspaceofthenetworkview,toopenthenetworkcontextmenu.Goto"App->DisGeNET"andselectthedesiredfunction:
AnnotateforeignproteinnetworkswithDisGeNETdiseasesItispossibletoannotateaforeignnetworkcontaininggenesusingoneofthefollowinggeneidentifiers:
- GeneNCBIidentifier- Genesymbol
Toannotateanexternalnetwork,wewilluseasanexampleoneofthenetworksintheCytoscapeStarterPanel.TodisplaytheStarterPanel,gotothemainCytoscapemenu,andclick"View->showStarterPanel".Clickonthenetwork"TCGAColorectalCancer"toopenthesession.Chooseasetofnodesofthenetwork(intheexampleweselectedthefirstneighborsofthegenePLK1)andgenerateanewnetwork("File->New->Network->FromSelectednodes,alledges").Thisisdonetoreducethesizeoftheresultingnetwork.Afterthenewnetworkiscreated,rightclickonanemptyspaceofthenetworkview,toopenthenetworkcontextmenu.Goto"App->DisGeNET->AnnotategeneswithDisGeNET->Genes->Annotategeneswithdiseasesfromtheselectedsource"(Figure26).Thatwilldisplaythe"SetParameters"Box(Figure27).
24
Figure26:SubNetworkextractedfromtheexamplesessionAfinityPurification.cys
Noticethatinthiscase,"SetParameters"Boxincludesanewfield,toselectthenameofthecolumncontainingthegeneidentifier(NCBIidentifierorSymbol).Additionally,youmightselectthesourceoftheannotation,arangeofscore,orthediseaseclass.Intheexample,weusedCURATEDand“Neoplasms”.
Figure27:TheSetParametersBoxassociatedtotheAnnotateexternalnetworksfunction
AnewGDAnetworkwillbecreatedwiththediseasesassociatedwiththegenesinthenetworkaccordingtotheselectedsource(intheexample,datafromCURATEDsources).TheresultsareshowninFigure28.
25
Figure28:TheCURATEDGDAnetworkforthegenesfromtheexternalnetwork
AnnotateforeignvariantnetworkswithDisGeNETdiseasesSimilarly,anetworkcontainingvariantsidentifiedbythedbSNPidentifierscanbeannotated.Forexample,copythetableinAnnexes,containingthevariantswiththeirchromosomes,takenfromtheSupplementarytable4fromreference(5),apublicationdescribingaGWASthatidentified143riskvariantsfortype2diabetes.Saveitasatxtfileinadocument.Inordertouncoverwhichofthereportedvariantshavealreadybeenassociatedtoothertraits,forexampleintheGWASCatalog,followthesesteps:CreateanewnetworkfromthefilebyclickinginthemainmenuofCytoscape“File->Import->Network->file”.TheresultshouldbeanetworkliketheoneinFigure29.Then,rightclickonanemptyspaceofthenetworkview,toopenthenetworkcontextmenu.Goto"App->DisGeNET->Variants->Annotatevariantswithdiseasesfromtheselectedsource".The“Setparameters”Boxwillbeshown,toselectthenameofthecolumnwiththedbSNPidentifier,andthesourceofthedata.TheresultsoftheannotationusingGWAScatalogdataareshowninFigure16.
26
Figure29:TheannotatedVDAnetworkfromDisGeNETAllsources
27
Annotatingadrug-targetnetworkwithDisGeNETdiseasesTheDisGeNETCytoscapeAppalsoallowsannotatingnetworksgeneratedbyotherapplicationsoruploadedbytheuser.Wewillgenerateadrug-targetnetworkforthedrug5-fluorouracilfromtheSTITCHdatabaseandannotatethetargetswithdiseaseinformation.
1. GototheSTITCHdatabase(http://stitch.embl.de/)andsearchthetargetsof5-fluorouracilinhumans.
2. Retrievethenetworkofdrugtargetsselectingthesources:“Experiments”,“Databases”.Thisshouldresultinanetworkof5-FUand10proteins.
3. DownloadthenetworkfileinTSVformatandimportitinCytoscapeasadrug-targetnetwork.Thenetworkfileshouldlooklikethis,withoneinteractionperrowofthetable:
node1 node2 combined_score
CASP8 CASP3 0.998 DPYD 5-fluorouracil 0.995 BAX TP53 0.993 UPP1 5-fluorouracil 0.992 UPP2 5-fluorouracil 0.963 TYMS 5-fluorouracil 0.960 BAX 5-fluorouracil 0.900 CYP2A6 5-fluorouracil 0.900 TP53 5-fluorouracil 0.900 CASP8 5-fluorouracil 0.900 CASP3 5-fluorouracil 0.900 DHFR 5-fluorouracil 0.900 UPP1 CYP2A6 0.899 UPP2 DPYD 0.899 DPYD UPP1 0.899 DHFR TYMS 0.899 DPYD CYP2A6 0.899 UPP2 CYP2A6 0.899 DHFR TP53 0.575 CASP3 BAX 0.569 CASP8 TP53 0.569 CASP8 BAX 0.569
ImportthenetworkbyclickingonFile→Import→Network→File.Note:becarefultoselectthepropercolumnsasnodesandedgesinyournetwork.OncethedatahasbeenimportedinCytoscape,adrug-targetnetworkfor5-FUwillbeobtainedanddisplayedinCytoscape(Figure30):
28
Figure30:Importednetworkrepresentingthedrugtargetsof5-FU
Questions:
● Howmanytargetsareinthe5-FUnetwork?● Whatisthedegreeofthe5-FUnode?
Next,annotatethetargets(genes)withcurateddiseaseinformationfromDisGeNET.Toaccesstheannotatefunction,rightclickonanemptyspaceofthenetworkview,toopenthenetworkcontextmenu.Goto"App->DisGeNET->Genes->Annotategeneswithdiseasesfromtheselectedsource"(Figure31).
Figure31:Annotatingexternalnetworkswithdiseases
Then, inthe"SetParameters"Boxchoosethecolumn“name”.Asaresult,anewGDAnetworkforthetargetsof5-FUwillbegenerated(Figure32).
29
Figure32:GDAnetworkofthetargetsof5-FU
Generateasub-networkofGDAsforTYMS,onethe5-FUtargets.Questions:
● WhatisthedegreeofTYMSintheGDAnetwork?● Whatclassesofdiseasesarerepresented intheTYMSGDAnetwork?Tip:
inspect theMeSH disease classes in the Node Table or by colouring thenetworkbydiseaseclass
c. DisGeNETautomationTheCytoscapeAutomationisasetoftoolsthatallowsuserstocreateworkflowsexecutedentirelywithinCytoscapeorbyexternaltools(suchasRStudioorJupyter).TheDisGeNETAutomationAPIallowsqueryingtheCytoscapeDisGeNETappfromanexternalenvironmentsuchasR,andPython,usingRESTcalls.TheDisGeNETappincludesanautomationmodulewithasetofRESTendpoints.ThedocumentationoftheendpointsisavailableattheSwaggerpageofCytoscapethatcanbeaccessedbygoingtotheCytoscapemenuandclicking“Help->Automation->CyRestAPI”.TheAPIisaccessibledirectlythroughtheSwaggeruserinterfacewithinCytoscapeorbyusinganyREST-enabledclient.Figure33showsalltheavailableendpoints,includingthoseavailablebydefaultthroughCytoscape,andtheoneprovidedbytheCytoscapeappsintheuserinstallationofCytoscape.Weprovideexamplesofscriptsathttp://disgenet.org/app.
30
Figure33:TheswaggerinterfacepagefortheCyRestAPI.
ExpandingtheDisGeNET-AutomationmenuwillshowtheRESTendpointscorrespondingtoDisGeNET(¡Error!Noseencuentraelorigendelareferencia.)
Figure34:TheavailableDisGeNETRESTendpoints.
Important:Inordertoruntheautomationscripts,youfirstneedtolaunchCytoscape.
31
i. UsingtheDisGeNETautomationinRTousetheDisGeNETautomationinR,weprovideanRscriptthatcanbefoundathttp://www.disgenet.org/static/disgenet_ap1/files/current/disGeNETAutomation.RTogenerateaVDAnetwork,youcanusethefollowingRline:variantDisResult<-disgenetRestCall("variant-disease-net",variantDisParams)Previously,youneedtodefinetheparametersofyoursearch,forexample:
variantDisParams<-list(source="UNIPROT",assocType="GeneticVariation",diseaseClass="Neoplasms",diseaseSearch="",geneSearch="",variantSearch="",initialScoreValue="0.0",finalScoreValue="1.0",showGenes="true"
)ByexecutingthedisgenetRestCallfunction,theresultsofthequerywillbedisplayedinCytoscape.ThefunctiondisgenetRestCall(netType,params)createstheurlusingthefunctiondisgenetRestUrlandexecutestheRESTcalltothedesiredRESTpointandwiththenetworkparametersprovidedbytheuser,thefunctionreturnstheresultsinalistcontainingamessagewiththeresultoftheoperation,andalistcontaininginformationofthenetwork.
- netType:oneofthefollowing:gene-disease-netorvariant-disease-net- netParams:theonlyrequiredfieldisthesource.(seeexamplebelow)
ExampleofnetParamslistforthegene-diseasenetwork.
geneDisParams<-list(source="UNIPROT",assocType="GeneticVariation",diseaseClass="Neoplasms",diseaseSearch="",geneSearch="",initialScoreValue="0.0",finalScoreValue="1.0"
)ThefunctiondisgenetRestUrl(netType,host,port,version)createstheRESTurlusingthefollowingparameters:
- netType:oneofthefollowing:gene-disease-netorvariant-disease-net- host:thehost/ipwheretheRESTpointisplaced(bydefault,localhost).- port:theportinwhichtheRESTpointislistening(bydefault,1234).- version:theversionoftheRESTpoint,shouldmatchtheversionofyour
DisGeNETApp(bydefault,thelatestversion).ThisfunctionisreadytobeusedwiththedefaultCytoscapeautomationsetup
32
insidethedisgenetRestCallfunction.
ii. UsingtheDisGeNETautomationinpython
TousetheDisGeNETautomationinPython,anexampleofscriptcanbefoundathttp://www.disgenet.org/static/disgenet_ap1/files/current/disgenet-automation.pyThescriptcontainsfourfunctionsthatallowcallingtheautomationmodule,andsimplifytheaccessofthedata.ThefunctiondisgenetRestUrl(netType,host,port,version)createstheRESTurlusingthefollowingparameters:
- netType:oneofthefollowing:gene-disease-netorvariant-disease-net- host:thehost/ipwheretheRESTpointisplaced(bydefault,localhost).- port:theportinwhichtheRESTpointislistening(bydefault,1234).- version:theversionoftheRESTpoint,shouldmatchtheversionofyour
DisGeNETApp(bydefault,thelatestversion).ThisfunctionisreadytobeusedwiththedefaultCytoscapeautomationsetupinsidethedisgenetRestCallfunction.ThefunctiondisgenetRestCall(netType,params)createstheurlusingthefunctionaboveandexecutestheRESTcalltothedesiredRESTpointandwiththegivennetparameters,thefunctionreturnstheresultsinalistcontainingamessagewiththeresultoftheoperation,andalistcontaininginformationofthenetwork.
- netType:oneofthefollowing:gene-disease-netorvariant-disease-net- netParams:theonlyrequiredfieldisthesource.(seeexamplebelow)
ExampleofnetParamsforthegene-diseasenetwork.
geneDisParams={ "source":"UNIPROT", "assocType":"GeneticVariation", "diseaseClass":"Neoplasms", "diseaseSearch":"", "geneSearch":"", "initialScoreValue":"0.0", "finalScoreValue":"1.0"}
ForfunctionsprintHashandprintOperationResultseethefunctiondocumentationtogetmoreinformation.
33
4. TheSQLitedatabaseTheDisGeNETappqueriesalocalversionofDisGeNETdatathatisdownloadedasanSQLitedatabase.EachversionoftheAppcorrespondstoaspecificversionoftheSQLitedatabase.ThediagramofthedatacontainedintheSQLitedatabase(version7.0)correspondingtothecurrentversionoftheApp(7.x),canbeexploredinFigure35.
Figure35:TherelationalschemaoftheDisGeNETSQLitedatabase(version7.0)
34
5. ThenodeandedgetablesintheCytoscapeTablePanel
Intables1-3,weshowabriefdescriptionofeachfieldinthenodeTable,andedgeTable,inthe"TablePanel"ofCytoscapeforthedifferentnetworksgeneratedbytheDisGeNETapp.Formoreinformation,visithttp://www.disgenet.org/dbinfo.Table 1: Edge attributes in the gene-disease network Name Description
interaction Uniqueidentifierforthisassociation.
source Originaldatabaseinwhichthisgene-diseaseassociationisreported.
associationType AssociationtypeoftheGDAaccordingtotheDisGeNETassociationtypeontology(seesection2.4).
sentence Arepresentativesentencefromthepublicationdescribingtheassociationbetweenthegeneandthedisease(Ifarepresentativesentenceisnotfound,weprovidethetitleofthepaper).
pmid PubMedidentifierofthepublicationsupportingthereportedgene-diseaseassociation,ifavailable.
score TheDisGeNETGDAscorerangesfrom0to1,andtakesintoaccountthenumberandtypeof sources (level of curation, model organisms), and the number of publicationssupportingtheassociation.
EI TheEvidenceIndexfortheGDA,thatindicatestheexistenceofcontradictoryresultsinpublication.
EL TheEvidenceLevelmeasuresthestrengthofevidenceofagene-diseaserelationship
Table 2: Node attributes in the gene-disease network Name Description
name Nameofthenode,correspondingtoNCBIidentifierforgenes,andUMLSCUIsfordiseases
nodeType Thetypeofnode(geneordisease).
diseaseId UMLS®CUIofthedisease.
diseaseName Nameofthedisease.
diseaseClass ListofdiseaseclassidentifiersaccordingtotheMeSHhierarchy.
diseaseClassName ListofdiseaseclassesaccordingtoMeSHhierarchy.
geneId NCBIidentifierofthegene.
35
geneName OfficialSymbolofthegene.
GeneDSI TheDiseaseSpecificityIndexofthegene
GeneDPI TheGenePleiotropySpecificityIndexofthegene
GenepLI Theprobabilityofageneofbeingloss-of-functionintolerant
nrAssociatedDiseasesnrAssociatedGenes
Numberofassociateddiseasesorgenes(numberoffirstneighboursofthenode).
styleName Nameofgeneordisease,neededfortheDisGeNETvisualstyle.
styleSize Numberoffirstneighboursofthenode,neededfortheDisGeNETvisualstyle.
Note:IDandcanonicalNameareinternaluniqueidentifiersusedbythesystem. Table 3: Edge attributes in the variant-disease network Name Description
interaction Uniqueidentifierforthisassociation.
source Originaldatabaseinwhichthisgene-diseaseassociationisreported.
associationType AssociationtypeoftheVDAaccordingtotheDisGeNETassociationtypeontology(seesection2.4).
pmid PubMedidentifierofthepublicationsupportingthereportedvariant-diseaseassociation,ifavailable.
sentence Arepresentativesentencefromthepublicationdescribingtheassociationbetweenthegeneandthedisease(Ifarepresentativesentenceisnotfound,weprovidethetitleofthepaper).
score The DisGeNET VDA score ranges from 0 to 1, and takes into account the number ofsources,andthenumberofpublicationssupportingtheassociation.
EI The Evidence Index for the VDA indicates the existence of contradictory results inpublication.
36
6. AnnexesTablewiththeexamplevariantstoannotateexternalnetwork
rsid Chromosome
rs9429103 1
rs10890427 1
rs111790575 1
rs1126742 1
rs13375749 1
rs211723 1
rs211710 1
rs7552404 1
rs11161430 1
rs478093 1
rs61817724 1
rs6684114 1
rs1260326 2
rs6719753 2
rs3738848 2
rs10197755 2
rs796419162 2
rs13409366 2
rs10201159 2
rs13384756 2
rs13410232 2
rs13431529 2
rs10189885 2
rs6546869 2
rs6744398 2
rs7601356 2
rs1047891 2
rs715 2
rs887829 2
37
rs6742078 2
rs6804368 3
rs6800284 3
rs10010582 4
rs358236 4
rs1843481 4
rs1481012 4
rs141471965 4
rs4253255 4
rs4253328 4
rs37369 5
rs248386 5
rs27044 5
rs11386832 5
rs10075801 5
rs2405522 5
rs1801020 5
rs75077631 5
rs2545801 5
rs1165213 6
rs2817188 6
rs1165153 6
rs1165192 6
rs72939920 6
rs12208357 6
rs662138 6
rs316019 6
rs10242455 7
rs4921913 8
rs35570672 8
rs4873099 8
rs2759009 9
rs36034585 9
rs1171616 10
rs1171615 10
38
rs61886778 10
rs603424 10
rs10766469 11
rs174528 11
rs7943728 11
rs174533 11
rs174535 11
rs174536 11
rs61896141 11
rs102274 11
rs174545 11
rs174548 11
rs174549 11
rs174554 11
rs174555 11
rs174560 11
rs174561 11
rs174564 11
rs174566 11
rs5792235 11
rs99780 11
rs174580 11
rs151042642 11
rs113570042 11
rs11820589 11
rs964184 11
rs34265203 12
rs10774021 12
rs11613331 12
rs17329885 12
rs4149056 12
rs11045832 12
rs1871395 12
rs58310495 12
rs59205959 12
39
rs2939302 12
rs1799958 12
rs3916 12
rs7141433 14
rs17101394 14
rs8008068 14
rs11158671 14
rs8014023 14
rs16952714 15
rs80123226 15
rs261290 15
rs2043085 15
rs1532085 15
rs1077835 15
rs1077834 15
rs2070895 15
rs261334 15
rs4775633 15
rs28582913 16
rs7208714 17
rs4330 17
rs4335 17
rs4343 17
rs4362 17
rs1799763 17
rs922442 19
rs8012 19
rs7247977 19
rs62128825 19
rs2547239 19
rs2547237 19
rs212099 19
rs641738 19
rs8736 19
rs2540645 22
40
rs131813 22
rs131793 22
41
7. CitationIfyouareusingDisGeNETforyourownresearch,pleasecite:
❖ TheBrowser,andthecurrentversionofthedata:
Janet Piñero, Juan Manuel Ramírez-Anguita, Josep Saüch-Pitarch, FrancescoRonzano,EmilioCenteno,FerranSanz,LauraIFurlong.TheDisGeNETknowledgeplatformfordiseasegenomics:2019update(2019)https://doi.org/10.1093/nar/gkz1021
❖ DisGeNET-RDF:Queralt-Rosinach N, Piñero J, Bravo À, Sanz F, Furlong LI. DisGeNET-RDF:HarnessingtheInnovativePoweroftheSemanticWebtoExploretheGeneticBasis of Diseases. Bioinformatics. Bioinformatics (2016) doi:10.1093/bioinformatics/btw214
❖ TheCytoscapeApp:
PiñeroJ,SaüchJ,SanzF,FurlongLI.TheDisGeNETCytoscapeApp:exploringandvisualizingdiseasegenomicsdata.UnderreviewBauer-MehrenA,RautschkaM,SanzF,FurlongLI.DisGeNET:aCytoscapepluginto visualize, integrate, search and analyze gene-disease networks.Bioinformatics.(2010)doi:10.1093/bioinformatics/btq538Bauer-MehrenA,BundschusM,RautschkaM,MayerMA,SanzF,FurlongLI:Gene-diseasenetworkanalysisrevealsfunctionalmodulesinMendelian,complexandenvironmentaldiseases.PLoSONE(2011)doi:10.1371/journal.pone.0020284.
❖ Tocitespecificdata:
Gene-diseaseassociationdataretrievedfromDisGeNETv7.0(http://www.disgenet.org/),IntegrativeBiomedicalInformaticsGroup,GRIB/IMIM/UPF.[Month,yearofdataretrieval].
42
8. References
1. JanetPiñero,JuanManuelRamírez-Anguita,JosepSaüch-Pitarch,
FrancescoRonzano,EmilioCenteno,FerranSanz,LauraIFurlong.TheDisGeNETknowledgeplatformfordiseasegenomics:2019update(2019)https://doi.org/10.1093/nar/gkz1021
2. JanetPiñero,ÀlexBravo,NúriaQueralt-Rosinach,AlbaGutiérrez-Sacristán,JordiDeu-Pons,EmilioCenteno,JavierGarcía-García,FerranSanz,andLauraI.Furlong.(2016)DisGeNET:acomprehensiveplatformintegratinginformationonhumandisease-associatedgenesandvariants.Nucl.AcidsRes.doi:10.1093/nar/gkw943
3. Piñero,J.,Queralt-Rosinach,N.,Bravo,A.,Deu-Pons,J.,Bauer-Mehren,A.,Baron,M.,Sanz,F.andFurlong,L.I.(2015)DisGeNET:adiscoveryplatformforthedynamicalexplorationofhumandiseasesandtheirgenes.Database,2015,bav028–bav028
4. Bauer-Mehren,A.,Rautschka,M.,Sanz,F.andFurlong,L.I.(2010)DisGeNET:aCytoscapeplugintovisualize,integrate,searchandanalyzegene-diseasenetworks.Bioinformatics,26,2924–6
5. Bauer-Mehren,A.,Bundschus,M.,Rautschka,M.,Mayer,M.A.,Sanz,F.andFurlong,L.I.(2011)Gene-DiseaseNetworkAnalysisRevealsFunctionalModulesinMendelian,ComplexandEnvironmentalDiseases.PLoSOne,6,13
6. XueAWuYZhuZZhangFKemperKet.al.(2018)Genome-wideassociationanalysesidentify143riskvariantsandputativeregulatorymechanismsfortype2diabetes.NatureCommunications,vol:9(1)pp:2941
43
9. Contact©2010-2021,IntegrativeBiomedicalInformaticsIntegrativeBiomedicalInformaticsGroupResearchUnitonBiomedicalInformatics-GRIBIMIM-HospitaldelMarMedicalResearchInstituteUniversitatPompeuFabra(UPF)Dr.Aiguader8808003Barcelona,Spainphone:+34933160521fax:+34933160550web:http://grib.imim.es/research/integrative-biomedical-informatics/IfyouhavequestionsorcommentsaboutDisGeNETdata,thedatabase,thewebsite,theplugin,thebrowser,theRDFrepresentationorthedownloads,pleasecontactusat:support(at)disgenet(dot)org
44
10. FundingIMI-JUresourcesofwhicharecomposedoffinancialcontributionfromthe
EU-FP7[FP7/2007–2013]andEFPIAcompaniesinkindcontribution[116030toTransQST, 777365 to eTRANSAFE], and the EU H2020 Programme 2014–2020[676559 to Elixir-Excelerate.]; Project001-P-001647 - Valorisation of EGA forIndustryandSocietyfundedbytheEuropeanRegionalDevelopmentFund(ERDF)andGeneralitatdeCatalunya;AgènciadeGestiód’AjutsUniversitarisideRecercaGeneralitat de Catalunya [2017SGR00519]. The Research Programme onBiomedicalInformatics(GRIB)isamemberoftheSpanishNationalBioinformaticsInstitute(INB),fundedbyISCIIIandFEDER(PRB2-ISCIII[PT13/0001/0023,ofthePE I+D+i2013–2016]).TheDCEXS isa ‘UnidaddeExcelenciaMaríadeMaeztu’,fundedbytheMINECO[MDM-2014-0370].
45
11. LicenseTheDisGeNETappisdistributedundertheGNUGPL3.0license.MoredetailsabouttheGNUGeneralPublicLicense3.0isavailablehere.TheDisGeNETdatabaseismadeavailableundertheAttribution-NonCommercial-ShareAlike4.0InternationalLicensewhosetextcanbefoundhere:https://creativecommons.org/licenses/by-nc-sa/4.0/.IfDisGeNETisincorporatedintootherworks,weaskthatDisGeNETisproperlycited(seethecitationguidelines),andthattheversionnumberofDisGeNETisclearlydisplayed.http://disgenet.org/legal
12. AboutthisdocumentThiswork is licensedunder theCreativeCommonsAttribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visithttp://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to CreativeCommons,444CastroStreet,Suite900,MountainView,California,94041,USA.