PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI...
![Page 2: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/2.jpg)
Outline
● Summary of 2012/2013 activities and achievements
● MIRIAM and identifiers.org
● MITAB 2.7 and MIQL 2.7
● Clustering
● New PSICQUIC reference implementation
● PSICQUIC view update
● Data Distribution Best Practices
![Page 3: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/3.jpg)
Summary of 2012/2013 activities and achievements
![Page 4: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/4.jpg)
PSICQUIC Hackathon
● 28th May – 1st June 2012
● 10 developers from 7 different partners
  ● BioJS, Cytoscape, DIP, InnateDB, IntAct, MatrixDB, MINT, MPIDB
  ● http://code.google.com/p/psicquic/wiki/PSICQUICHackathon2012
● 2 working groups
  ● SOLR team:
    ● reference implementation
    ● indexing MITAB 2.5, 2.6 and 2.7 using SOLR
    ● MIQL 2.7
    ● XML indexing and PSICQUIC webservices improvements
  ● Client team:
    ● PSICQUIC view visualization: table, network and search
    ● Cytoscape plugin
![Page 5: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/5.jpg)
2012/2013 releases
● MITAB 2.7: http://code.google.com/p/psimi/wiki/PsimiTab27Format
● MIQL 2.7: http://code.google.com/p/psicquic/wiki/MiqlReference27
● PSICQUIC reference implementation: http://code.google.com/p/psicquic/wiki/PsicquicSpec_1_3_Rest
  ● LUCENE 1.2.3
  ● SOLR 1.3.9
● PSI-MI java libraries: http://code.google.com/p/psimi/downloads/list
  ● psi25-xml parser 1.8.3
  ● psimitab parser 1.8.3
  ● psi25-xml to RDF/Biopax converter 1.8.3
  ● Calimocho 2.5.0
  ● Calimocho to XGMML converter 2.5.0.3
● PSICQUIC-view: http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml
![Page 6: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/6.jpg)
PSICQUIC growth
+ 25 million binary interactions since 2012
+ 2 services since 2012 => total of 28 services, with one more in progress (FlyBase)
![Page 7: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/7.jpg)
Work in progress...
● PSICQUIC/MITAB 2.7 publication submitted and in review
● PSICQUIC view and download all button
● BioJS : new javascript components for molecular interaction visualization
● Clustering improvements (new web interface, …)
● JAMI (Java framework for molecular interactions)
  ● XML/MITAB validator prototype
  ● Enricher
![Page 8: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/8.jpg)
MIRIAM and Identifiers.org
![Page 9: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/9.jpg)
Introduction: http://identifiers.org/about
![Page 10: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/10.jpg)
MIRIAM/identifiers.org benefits
● PSICQUIC links to data entries (pubmed, uniprot, ensembl...)
➢ Automatic remapping when services down → more reliable links
● Up to date resource with database accession regular expressions
➢ Do not duplicate work in psi-mi ontology
![Page 11: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/11.jpg)
More reliable PSICQUIC links (1)
• Several locations/resources for accessing uniprot P00533
3 existing resources for accessing P00533
Identifiers.org/uniprot/P00533
![Page 12: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/12.jpg)
More reliable PSICQUIC links (2)
• Use the most reliable location/resource for uniprot P00533
Identifiers.org/uniprot/P00533?profile=most_reliable
![Page 13: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/13.jpg)
More reliable PSICQUIC links (3)
• Use the uniprotkb location/resource for uniprot P00533
Identifiers.org/uniprot/P00533?resource=MIR:00100134
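The three link forms on these slides can be assembled programmatically. The following is a minimal sketch, assuming the `profile` and `resource` query parameters behave as shown on the slides (they may not be supported by the resolver today); the helper name `identifiers_url` is hypothetical.

```python
from urllib.parse import urlencode

# Resolver root as written on the slides.
BASE = "http://identifiers.org"

def identifiers_url(namespace, accession, profile=None, resource=None):
    """Build an identifiers.org link, optionally pinning a profile or a resource."""
    url = f"{BASE}/{namespace}/{accession}"
    params = {}
    if profile is not None:
        params["profile"] = profile      # e.g. "most_reliable" (from the slides)
    if resource is not None:
        params["resource"] = resource    # e.g. "MIR:00100134" (from the slides)
    if params:
        # Keep ':' unescaped so MIRIAM resource ids stay readable.
        url += "?" + urlencode(params, safe=":")
    return url

print(identifiers_url("uniprot", "P00533"))
print(identifiers_url("uniprot", "P00533", profile="most_reliable"))
print(identifiers_url("uniprot", "P00533", resource="MIR:00100134"))
```

A PSICQUIC client would only need the namespace and accession from a MITAB cross reference to produce such links.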
![Page 14: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/14.jpg)
Up to date database links and regular expressions (1)
![Page 15: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/15.jpg)
Up to date database links and regular expressions (2)
![Page 16: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/16.jpg)
What next?
● The CV MI database terms should have xrefs to MIRIAM namespace
● The regular expressions in the database MI terms could be obsoleted to rely on MIRIAM
Advantages:
- Hierarchy information
- No data/formats update
- Relies on MIRIAM for the regular expressions and links
Drawbacks:
- More work for the MI CV maintainers
- MIRIAM namespaces not visible in MITAB/XML
- Need to update PSI-XML validator
Maybe XML 3.0?
![Page 17: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/17.jpg)
MITAB 2.7 and MIQL 2.7
![Page 18: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/18.jpg)
MITAB 2.7: introduction
● Format description at http://code.google.com/p/psicquic/wiki/MITAB27Format
● Extension of MITAB 2.6 and 2.5
● Total of 42 columns
Can contain minimum information recommended by MIMIx
![Page 19: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/19.jpg)
MITAB 2.7: Complex expansion
● Distinguish true binary interactions from binary interactions expanded from n-ary interactions
● Know the method used to expand
  ● Spoke
  ● Matrix
  ● Bipartite
● psi-mi:"MI:1060" (spoke expansion)
● psi-mi:"MI:1061" (matrix expansion)
● psi-mi:"MI:1062" (bipartite expansion)
Recognized for backward compatibility
![Page 20: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/20.jpg)
MITAB 2.7: re-build n-ary from spoke expansion?
[Diagram: 2 n-ary interactions, spoke-expanded into 5 binary interactions]
● Interaction id 1 (participants A, B, C, D; bait A)
  – A (bait) - B (prey): interaction id 1, spoke
  – A (bait) - C (prey): interaction id 1, spoke
  – A (bait) - D (prey): interaction id 1, spoke
● Interaction id 2 (participants E, F, G; bait E)
  – E (bait) - F (prey): interaction id 2, spoke
  – E (bait) - G (prey): interaction id 2, spoke
Need:
● interactor id
● expansion method
● interaction id
Not enough:
● Publication
● Detection method
● Host organism
● Interaction type
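The rebuild step for the spoke case can be sketched as a grouping on the interaction id. This is an illustration only: the reduced row layout (interactor A, interactor B, interaction id, expansion method) and the function name `rebuild_nary` are assumptions, not the MITAB 2.7 column layout.

```python
from collections import defaultdict

# Hypothetical reduced rows: (interactor A, interactor B, interaction id, expansion method).
rows = [
    ("A", "B", "1", "spoke"),
    ("A", "C", "1", "spoke"),
    ("A", "D", "1", "spoke"),
    ("E", "F", "2", "spoke"),
    ("E", "G", "2", "spoke"),
]

def rebuild_nary(rows):
    """Group expanded binary rows by interaction id to recover n-ary participants."""
    groups = defaultdict(set)
    for id_a, id_b, interaction_id, expansion in rows:
        if expansion == "-":
            # '-' marks a true binary interaction: nothing to rebuild.
            continue
        groups[interaction_id].update((id_a, id_b))
    return {iid: sorted(members) for iid, members in groups.items()}

print(rebuild_nary(rows))
```

Grouping on publication, detection method, host organism or interaction type instead would merge unrelated interactions that happen to share those values, which is why the slide lists them as not enough.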
![Page 21: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/21.jpg)
MITAB 2.7: re-build n-ary from bipartite expansion?
[Diagram: 2 n-ary interactions, bipartite-expanded into 7 binary interactions]
● Interaction id 1 (participants A, B, C, D)
  – interaction I1 - interactor A: bipartite
  – interaction I1 - interactor B: bipartite
  – interaction I1 - interactor C: bipartite
  – interaction I1 - interactor D: bipartite
● Interaction id 2 (participants E, F, G)
  – interaction I2 - interactor E: bipartite
  – interaction I2 - interactor F: bipartite
  – interaction I2 - interactor G: bipartite
Need:
● interactor id
● expansion method
● interaction id
![Page 22: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/22.jpg)
MITAB 2.7: re-build n-ary from matrix expansion?
[Diagram: 2 n-ary interactions, matrix-expanded into 9 binary interactions]
● Interaction id 1 (participants A, B, C, D)
  – A - B: interaction id 1, matrix
  – C - A: interaction id 1, matrix
  – D - A: interaction id 1, matrix
  – C - B: interaction id 1, matrix
  – D - B: interaction id 1, matrix
  – C - D: interaction id 1, matrix
● Interaction id 2 (participants E, F, G)
  – G - E: interaction id 2, matrix
  – E - F: interaction id 2, matrix
  – G - F: interaction id 2, matrix
Need:
● interactor id
● expansion method
● interaction id
Not enough:
● Publication
● Detection method
● Host organism
● Interaction type
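The binary-interaction counts on the three preceding slides (5, 7 and 9) follow directly from the expansion rules. A minimal sketch, assuming the first participant of each complex is the bait (A and E, as in the spoke slide); the function names are hypothetical.

```python
from itertools import combinations

def spoke(bait, preys):
    """Spoke expansion: pair the bait with every prey."""
    return [(bait, prey) for prey in preys]

def matrix(participants):
    """Matrix expansion: pair every participant with every other participant."""
    return list(combinations(participants, 2))

def bipartite(interaction_id, participants):
    """Bipartite expansion: pair the interaction node with every participant."""
    return [(interaction_id, p) for p in participants]

# The two n-ary interactions used on the slides.
complexes = {"I1": ["A", "B", "C", "D"], "I2": ["E", "F", "G"]}

spoke_pairs = [pair for m in complexes.values() for pair in spoke(m[0], m[1:])]
matrix_pairs = [pair for m in complexes.values() for pair in matrix(m)]
bipartite_pairs = [pair for iid, m in complexes.items() for pair in bipartite(iid, m)]

print(len(spoke_pairs), len(bipartite_pairs), len(matrix_pairs))
```

For a complex of n participants, spoke yields n-1 pairs, bipartite n pairs and matrix n(n-1)/2 pairs.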
![Page 23: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/23.jpg)
MITAB 2.7: MIMIx columns
● Participant's biological roles (col 17 and 18)
➢ Ex: psi-mi:"MI:0684" (ancillary)
● Participant's experimental roles (col 19 and 20)
➢ Ex: psi-mi:"MI:0496" (bait)
● Participant identification methods (col 41 and 42)
➢ Ex: psi-mi:"MI:0113" (western blot)
● Host organism for the experiment (col 29)
➢ Ex: taxid:-1 (in vitro)
![Page 24: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/24.jpg)
MITAB 2.7: new types of interactions accepted
● Negative interactions (col 36)
● Self interactions:
– homodimers, homotrimers, …
– auto-catalysis, …

| Self interaction | Unique id A (col 1) | Unique id B (col 2) | … | Stoichiometry A (col 39) | Stoichiometry B (col 40) |
|---|---|---|---|---|---|
| Inter-molecular (P with P) | P | P | ... | x | 0 |
| Intra-molecular (P alone) | P | - | ... | 1 | - |
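The two row shapes above can be told apart from the identifier columns alone. A minimal sketch, assuming a row is intra-molecular when one identifier is `-` and inter-molecular when both identifiers are equal; the function name is hypothetical.

```python
def self_interaction_kind(id_a, id_b):
    """Classify a MITAB row from its two unique identifier columns (col 1 and 2)."""
    if id_a == "-" or id_b == "-":
        # Only one molecule is described: it interacts with itself.
        return "intra-molecular"
    if id_a == id_b:
        # Two copies of the same molecule (homodimer, homotrimer, ...).
        return "inter-molecular"
    return None  # not a self interaction

print(self_interaction_kind("P", "P"))
print(self_interaction_kind("P", "-"))
```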
![Page 25: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/25.jpg)
MITAB 2.7: interactor types
● Columns 21 and 22
➢ Ex: psi-mi:"MI:0327" (peptide)
● Solve some ambiguity with interactor identifiers
● More precise than registry tags
![Page 26: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/26.jpg)
MITAB 2.7: interactor and interaction xrefs
● Interactor xrefs (col 23 and 24)
● Interaction xrefs (col 25)
➢ Ex: go:"GO:0005057"(receptor signaling protein activity)
➢ Ex: intact:EBI-626658(see-also)
• To give more information about interactor or interaction
• Not an identifier
• Helps lighten the first six columns
• Not used for clustering
• Use cross reference type
![Page 27: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/27.jpg)
MITAB 2.7: interactor and interaction annotations
● Interactor annotations (col 26 and 27)
● Interaction annotations (col 28)
➢ Ex: dataset:Cancer - Interactions investigated in the context of cancer
➢ Ex: imex-curation
PSICQUIC Registry tags
![Page 28: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/28.jpg)
MITAB 2.7: participant's features
● Users want to ask: "show me all evidence where molecule X has binding domains"
Binding sites AND other features (e.g. tags, PTMs, …)
yes:
● binding site:51-124(IPR003651)
● binding site:45..53-119..129
● binding site:n-51,99-123
● gst tag:c-c
● his tag:?-?
no:
● 51-124(IPR003651)
● 45..53-119..129
● n-51,99-123
● c-c
● ?-?
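The difference between the accepted and rejected forms is the leading feature type. The following sketch parses a feature column under the assumptions that multiple features are separated by `|` and that each one has the shape `type:ranges(text)`; the regular expression and function name are illustrative, not taken from the PSI-MI parsers.

```python
import re

# type ':' ranges, optionally followed by free text in parentheses.
FEATURE = re.compile(r"^(?P<type>[^:]+):(?P<ranges>[^()]+)(?:\((?P<text>.*)\))?$")

def parse_features(column):
    """Split a feature column into (type, [ranges], text) tuples."""
    if column == "-":
        return []
    parsed = []
    for item in column.split("|"):
        match = FEATURE.match(item)
        if match is None:
            # A bare range such as '51-124(IPR003651)' has no feature type.
            raise ValueError(f"feature without a type: {item!r}")
        parsed.append((match["type"], match["ranges"].split(","), match["text"]))
    return parsed

print(parse_features("binding site:51-124(IPR003651)|gst tag:c-c"))
print(parse_features("binding site:n-51,99-123"))
```

With the type present, a query such as "all evidence where molecule X has a binding site" reduces to a filter on the first element of each tuple.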
![Page 29: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/29.jpg)
MITAB 2.7: more...
● Interaction parameters (col 30)
➢ Ex: kd:9.0x10^-7 (molar)
● Creation date (col 31)
➢ Ex: 2011/03/15
● Last update date (col 32)
➢ Ex: 2011/04/05
● Interactor checksum (col 33 and 34)
➢ Ex: rogid:bjwQTTv7ws6z/T+fM8bNGnEsEXk6239
● Interaction checksum (col 35)
➢ Ex: rigid:G6RtLd3+FtR/ZtRciwH2vj9R0Tc
![Page 30: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/30.jpg)
MITAB 2.7: limitations and issues
● 42 columns!
● Feature, checksum, confidence and parameter types can only be names
  • Issue when converting to XML where Xref is mandatory
  • Cannot recognize MI from MOD terms
  • Names can be ambiguous
● Cannot represent linked features and inferred interactions
● Cannot export feature xrefs and annotations
● Not all the columns have the same syntax
● Same syntax does not mean same content
● Cell types, tissues and compartments cannot be specified in host organism column.
![Page 31: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/31.jpg)
MITAB: what next?
● Only column names
● A syntax per column
● Customize....
– Number of columns
– Order of columns
![Page 32: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/32.jpg)
MIQL 2.7: introduction
● Fields description at http://code.google.com/p/psicquic/wiki/MiqlReference27
● Extension of MIQL 2.5
● Total of 35 fields
![Page 33: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/33.jpg)
MIQL 2.7: new fields
![Page 34: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/34.jpg)
MIQL 2.7: examples
➢ I want to filter out expanded binary interactions
  complex:"-"
➢ I want to include negative interactions
  negative:(true OR false)
➢ I want all interactions having parameters
  param:true
➢ I want all interactions having stoichiometry
  stc:true
➢ I want all interactions having binding sites
  ftypeA:"binding site" AND ftypeB:"binding site"
➢ I want all intra-molecular interactions
  idA:\- OR idB:\-
➢ I want all interactions internally-curated
  annot:"internally-curated"
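Queries like these are sent to a PSICQUIC service as part of a REST URL. A minimal sketch of the URL construction only, assuming the `<base>/query/<MIQL>?format=...` layout of the PSICQUIC REST specification; the base URL below is a placeholder and the helper name is hypothetical.

```python
from urllib.parse import quote

# Placeholder service root: substitute the REST root of a real PSICQUIC service.
BASE = "https://example.org/psicquic/webservices/current/search"

def psicquic_query_url(base, miql, fmt="tab27"):
    """Percent-encode a MIQL query and append it to a PSICQUIC REST root."""
    return f"{base}/query/{quote(miql, safe='')}?format={fmt}"

for miql in ('complex:"-"', "negative:(true OR false)", r"idA:\- OR idB:\-"):
    print(psicquic_query_url(BASE, miql))
```

Encoding the whole query with no safe characters avoids problems with the quotes, colons, spaces and backslashes that MIQL relies on.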
![Page 35: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/35.jpg)
What should we do?
● Export and index MITAB 2.7
➢ Complex expansion
➢ MIMIx information
➢ Registry tags and tagging interaction
● Use PSICQUIC registry tags that are important at the interaction level
● Move
➢ Gene names and other names to alias columns (col 5 and 6)
➢ Extra unique identifiers to alternative identifiers (col 3 and 4)
➢ Rogid, Inchi key and rigid to checksum columns (col 33, 34 and 35)
➢ GO and non identifiers to xref columns (col 23, 24 and 25)
![Page 36: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/36.jpg)
PSICQUIC clustering
![Page 37: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/37.jpg)
Clustering binary interactions
• Clustering = regrouping multiple interaction evidences of a unique pair of interactors in a single MITAB line.
• It boils down to grouping molecule pairs, hence the importance of describing your molecules properly
• Necessary for a user doing data analysis and interaction networking
• http://code.google.com/p/micluster/
Before clustering:
A-B : Y2H
A-B : CIP
A-C : Y2H
A-B : pull down
A-D : pull down
After clustering:
A-B : Y2H | CIP | pull down
A-C : Y2H
A-D : pull down
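The before/after example can be reproduced with a few lines. This is a sketch of the idea only, not the micluster implementation; it assumes each interactor is already described by a single agreed identifier.

```python
# Evidence rows from the slide: (interactor A, interactor B, detection method).
evidences = [
    ("A", "B", "Y2H"),
    ("A", "B", "CIP"),
    ("A", "C", "Y2H"),
    ("A", "B", "pull down"),
    ("A", "D", "pull down"),
]

def cluster(evidences):
    """Regroup the evidences of each unordered interactor pair on a single line."""
    clusters = {}
    for id_a, id_b, method in evidences:
        # Sort the pair so that A-B and B-A fall into the same cluster.
        key = tuple(sorted((id_a, id_b)))
        methods = clusters.setdefault(key, [])
        if method not in methods:
            methods.append(method)
    return clusters

for (id_a, id_b), methods in cluster(evidences).items():
    print(f"{id_a}-{id_b} : {' | '.join(methods)}")
```

The quality of the result depends entirely on the pair key, which is why the slide stresses describing molecules properly.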
![Page 38: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/38.jpg)
How to deal with ambiguous identifiers?
• Depends on the list of identifiers provided by each PSICQUIC service
A1-B : A1 → uniprotkb:Q5R7D3|uniprotkb:P08107
+ A2-B : A2 → uniprotkb:Q5R7D3
= 1 interaction but should it be 2?
A2-B : A2 → uniprotkb:P08107
- Use one identifier per species
- Ambiguous identifiers (uniprot gene and organism demerge) can be moved to xrefs
![Page 39: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/39.jpg)
Should we cluster MITAB 2.7?
● Lose experiment/interaction hierarchy: some information is specific to the experiment!
– Experimental roles
– Interaction parameters
– Features and tags
– Host organism
● Some fields are confusing when clustered
– Complex expansion
– Interactor types
– Negative
– Stoichiometry
● Some fields make sense when associated with the source
– Created date
– Update date
![Page 40: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/40.jpg)
Clustering improvements
● Relying on aliases for identifying molecule? => names are not identifiers
● Proposing other clustering options? (sequence+organism, checksum)
● Respect Data Distribution Best practices avoids inconsistent results => better data integration and analysis for the user
![Page 41: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/41.jpg)
Clustering alternatives
● Clustering unique binary pairs during indexing?
Ex: a new field 'binary': identifier1-identifier2
● Getting the unique binary pairs is instantaneous
● Can have statistics related to a binary pair
● Identifiers always sorted so always same order
● Possibility to keep relationships of original MITAB
● Needs to agree on common identifiers
● Needs regular protein updates
● Not flexible if several identifiers
![Page 42: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/42.jpg)
New PSICQUIC reference implementation
![Page 43: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/43.jpg)
LUCENE reference implementation 1.2.3
● Input: MITAB 2.5
● Indexing: Lucene indexing (3.0), Calimocho 2.5.0, psimitab parser 1.8.3
● Service: PSICQUIC 1.2
● Query language: MIQL 2.5 (14 fields)
● Output formats: tab25 (default), tab25-bin, xgmml, Biopax, RDF
● Fix some memory issues (pagination, threads, …)
● Use psimitab parser and XML converter 1.8.3 with bug fixes
● Improved XGMML export performance (no limit of 5000 interactions)
![Page 44: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/44.jpg)
SOLR reference implementation 1.3.9
● Input: MITAB 2.5, MITAB 2.6, MITAB 2.7
● Indexing: SOLR indexing (3.6.0), Calimocho (2.5.0), Spring batch
● Service: PSICQUIC 1.3
● Query language: MIQL 2.7 (35 fields)
● Output formats: tab25 (default), tab26, tab27, xgmml, Biopax, RDF
● Use psimitab parser and XML converter 1.8.3 with bug fixes (can convert MITAB 2.7 to PSI-XML 2.5)
● Improved XGMML export performance (no limit of 5000 interactions)
● Common SOLR schema
![Page 45: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/45.jpg)
What is SOLR?
● Web application and web server
● Based on LUCENE => compatible with MIQL
● SolrJ: java API to index/search
● HTTP requests to SOLR
● Caching results
● Provides admin interface
– Browse indexed data
– Access schema and configuration
– Server, cache and index statistics
![Page 46: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/46.jpg)
SOLR admin interface
Help/documentation Query
Schema, config, statistics
![Page 47: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/47.jpg)
SOLR results interface
Query parameters
Number of results
Document and 'stored' fields
![Page 48: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/48.jpg)
What is faceting?
● Breaks up search results into multiple categories
● Show counts for each category (facet field)
● Allows user to restrict/filter search based on those facets
Provides statistics about the content of the results for a given query
![Page 49: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/49.jpg)
Example of faceting
Facet results
facet=true
facet.field=species_s
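What the facet request returns can be illustrated without a SOLR server. The sketch below imitates faceting in plain Python on made-up result documents; only the field name `species_s` comes from the slides.

```python
from collections import Counter

# Made-up search results; only the facet field matters here.
results = [
    {"id": "row1", "species_s": ["taxid:9606", "taxid:10090"]},
    {"id": "row2", "species_s": ["taxid:9606"]},
    {"id": "row3", "species_s": ["taxid:4932"]},
]

def facet_counts(docs, field):
    """Count how many documents carry each value of the facet field."""
    counts = Counter()
    for doc in docs:
        # set() so a value repeated inside one document is counted once.
        counts.update(set(doc.get(field, [])))
    return counts

def restrict(docs, field, value):
    """Filter the results down to one facet value, as a user would by clicking it."""
    return [doc for doc in docs if value in doc.get(field, [])]

print(facet_counts(results, "species_s").most_common())
print([doc["id"] for doc in restrict(results, "species_s", "taxid:9606")])
```

SOLR computes the same kind of counts on the index side, so the client receives them together with the search results.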
![Page 50: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/50.jpg)
Search: how is data indexed? (1)
● MIQL 2.7 fields indexed but not stored
● Bug fix: split by ':' and duplicated terms!
➢ Ex: MI:0356 => MI, 0356
➢ Ex: taxid:9606(human)|taxid:9606(homo sapiens) => taxid, 9606, human, taxid, 9606, homo, sapiens
● Default fields (free text search)
➢ identifier, pubauth, pubid, interaction_id, detmethod, type, species
![Page 51: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/51.jpg)
Search: how is data indexed? (2)
● Database, value and text for general xrefs
➢ Ex: uniprotkb:P12345 => uniprotkb, P12345 and uniprotkb:P12345
➢ Ex: taxid:9606(human) => taxid, 9606, human and taxid:9606
➢ Ex: uniprotkb:brca2(gene name) => uniprotkb, brca2, "gene name" and uniprotkb:brca2
● Features, annotations
➢ Ex: figure legend:Fig 3. => "figure legend", "Fig 3."
➢ Ex: binding site:12-12(text) => "binding site"
● Negative (always excluded by default!)
➢ Ex: '-' or false => false
● Parameters and stoichiometry
➢ Ex: '1' or 'kd:9.0x10^-7 (molar)' => true
➢ Ex: '-' => false
● Publication first author
➢ Ex: 'author (date)' => "author", "date"
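The general xref rule above (database, value, text, plus the combined `database:value` term) can be sketched with one regular expression. This is an illustration of the rule as stated on the slide, not the analyzer configured in the SOLR schema; the names are hypothetical.

```python
import re

# database ':' value, optionally followed by text in parentheses.
XREF = re.compile(r"^(?P<db>[^:]+):(?P<value>[^()]+)(?:\((?P<text>.*)\))?$")

def index_terms(xref):
    """Return the searchable terms derived from one MITAB cross reference."""
    match = XREF.match(xref)
    if match is None:
        raise ValueError(f"not a database:value xref: {xref!r}")
    terms = [match["db"], match["value"]]
    if match["text"]:
        terms.append(match["text"])
    # The combined form lets users search for the exact cross reference.
    terms.append(f"{match['db']}:{match['value']}")
    return terms

for xref in ("uniprotkb:P12345", "taxid:9606(human)", "uniprotkb:brca2(gene name)"):
    print(xref, "=>", index_terms(xref))
```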
![Page 52: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/52.jpg)
Search: how is data indexed? (3)
● Ignore parentheses
● Case insensitive
● Discard common English words (a, with, …)
● Discard empty space before and after a word
● White space tokenizer => search for exact words
➢ Ex: BRCA2 will not match BRCA2b
➢ Ex: P12345 will not match P12345-1 => use P12345*
➢ Ex: experimental will match both 'experimental method' and 'experimental feature'
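The three examples can be checked with a small stand-in for the whitespace tokenizer. In this sketch `fnmatchcase` plays the role of the Lucene wildcard match, which is an approximation; stop-word removal and parenthesis handling are left out.

```python
from fnmatch import fnmatchcase

def matches(query, text):
    """Case-insensitive match of a single query term against whitespace tokens."""
    tokens = text.lower().split()
    pattern = query.lower()
    # A term must equal a whole token unless it carries a wildcard such as '*'.
    return any(fnmatchcase(token, pattern) for token in tokens)

print(matches("BRCA2", "BRCA2b"))
print(matches("P12345", "P12345-1"))
print(matches("P12345*", "P12345-1"))
print(matches("experimental", "experimental method"))
```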
![Page 53: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/53.jpg)
What is stored and returned?
● MIQL fields + non searchable fields ending with '_o'
➢ Ex: taxidA_o, pbioroleA_o, checksumA_o
● Excludes copy fields
➢ id, alias, identifier, ptype, pbiorole, ftype, species, pmethod
● Stores the original MITAB column
● Missing fields are automatically replaced by '-'
![Page 54: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/54.jpg)
PSICQUIC facet fields
● MIQL fields ending with '_s'
➢ Ex: species_s, pbiorole_s
● Stores the original MITAB cross reference
➢ Ex: taxid:9606(human) => taxid:9606
● Exact match
● Excludes text
![Page 55: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/55.jpg)
Current indexing issues and possible improvements
● More default fields?
● Alias names: fuzzy search allowed?
● Annotation description: fuzzy search should be allowed
● Sort fields cannot be multivalued!
➢ Unique identifier?
➢ MITAB not clustered => controlled vocabulary terms
➢ Current issue with publication (pubmed, imex)
➢ Cannot sort by annotations and xrefs!
![Page 56: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/56.jpg)
SOLR and PSICQUIC installation
![Page 57: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/57.jpg)
PSICQUIC webservice extensions
● Add a sort parameter
● Allowing faceting
➢ Define method name (not getByQuery for backward compatibility)
➢ Use SOLR XML to return facets or facets embedded in the response?
![Page 58: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/58.jpg)
Current PSICQUIC specifications issues
● SOAP and REST discrepancies
➢ Do we maintain both?
➢ Should we update SOAP with new REST methods?
● Update and improve documentation, bug tracker, FAQ
![Page 59: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/59.jpg)
PSICQUIC view update
![Page 60: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/60.jpg)
Data Distribution Best Practices
![Page 61: PSI meeting 2013 - psidev.infopsidev.info/sites/default/files/2018-03/psi_april_2013.pdf · PSI meeting 2013 IntAct team intact-help@ebi.ac.uk. Outline Summary of 2012/2013 activities](https://reader033.fdocuments.us/reader033/viewer/2022060219/5f06e73b7e708231d41a4f0a/html5/thumbnails/61.jpg)