Data integration with STRING
Lars Juhl Jensen
association networks
guilt by association
molecular networks
proteins
string-db.org
small molecules
stitch-db.org
non-coding RNAs
data integration
computational predictions
gene neighborhood
Korbel et al., Nature Biotechnology, 2004
experimental data
gene expression
curated knowledge
pathways
Letunic & Bork, Trends in Biochemical Sciences, 2008
many databases
different formats
different identifiers
variable quality
not comparable
hard work
(Ph.D. students)
common identifiers
quality scores
von Mering et al., Nucleic Acids Research, 2005
score calibration
von Mering et al., Nucleic Acids Research, 2005
homology-based transfer
Franceschini et al., Nucleic Acids Research, 2013
missing most of the data
text mining
>10 km
too much to read
computer
as smart as a dog
teach it specific tricks
named entity recognition
comprehensive lexicon
CDC2
cyclin dependent kinase 1
flexible matching
upper- and lower-case
CDC2
Cdc2
spaces and hyphens
cyclin dependent kinase 1
cyclin-dependent kinase 1
name expansions
prefixes and postfixes
CDC2
hCDC2
“black list”
SDS
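The matching rules on the slides above (case-insensitive, spaces and hyphens interchangeable, species prefixes, a black list) can be sketched as a small dictionary tagger. This is only an illustration and not the tagger used for STRING: the three-entry lexicon, the single prefix "h", and the function names are assumptions made for the example.

```python
import re

# Illustrative mini-lexicon (assumption): surface name -> identifier.
LEXICON = {
    "CDC2": "CDK1",
    "cyclin dependent kinase 1": "CDK1",
    "SDS": "SDS",  # a gene symbol that collides with the common detergent abbreviation
}
BLACK_LIST = {"sds"}   # names too ambiguous to tag
PREFIXES = ("h",)      # assumed species prefix, as in hCDC2

def normalize(name):
    """Lower-case and drop spaces and hyphens so spelling variants collide."""
    return re.sub(r"[\s-]+", "", name.lower())

# Normalized lookup table, with black-listed names left out.
INDEX = {normalize(n): i for n, i in LEXICON.items() if normalize(n) not in BLACK_LIST}

def lookup(span):
    """Return the identifier for a candidate span, allowing a known prefix."""
    key = normalize(span)
    if key in INDEX:
        return INDEX[key]
    for prefix in PREFIXES:
        if key.startswith(prefix) and key[len(prefix):] in INDEX:
            return INDEX[key[len(prefix):]]
    return None

TOKEN = re.compile(r"[A-Za-z0-9]+")
# A candidate span: tokens joined by single spaces or hyphens only.
SPAN = re.compile(r"[A-Za-z0-9]+(?:[ -][A-Za-z0-9]+)*")
MAX_TOKENS = 4

def tag(text):
    """Return (matched text, identifier) pairs, preferring the longest match."""
    tokens = list(TOKEN.finditer(text))
    hits = []
    i = 0
    while i < len(tokens):
        found = None
        for j in range(min(len(tokens), i + MAX_TOKENS), i, -1):
            span = text[tokens[i].start():tokens[j - 1].end()]
            if SPAN.fullmatch(span) and lookup(span):
                found = (span, lookup(span))
                i = j  # continue after the matched span
                break
        if found:
            hits.append(found)
        else:
            i += 1
    return hits

print(tag("Cdc2 and hCDC2 bind cyclin-dependent kinase 1 in SDS buffer."))
```

Normalizing both the lexicon and the text in the same way keeps the lexicon small: one entry covers every case and hyphenation variant, and the black list removes a name once rather than per variant.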
co-mentioning
counting
within documents
within paragraphs
within sentences
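Counting at the three levels named above can be sketched as follows. The sentence splitter is deliberately naive, the names are matched as plain whole words, and how the three counts are weighted into a score is not stated on the slides, so the sketch stops at raw counts. All function names are assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations

LEVELS = ("document", "paragraph", "sentence")

def split_units(document):
    """Yield (level, text) for the document, its paragraphs and its sentences."""
    yield "document", document
    for paragraph in re.split(r"\n\s*\n", document):
        yield "paragraph", paragraph
        # Naive splitter (assumption): a sentence ends at ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            yield "sentence", sentence

def count_comentions(documents, names):
    """Count, per level, how many text units mention both names of each pair."""
    counts = {level: Counter() for level in LEVELS}
    for document in documents:
        for level, text in split_units(document):
            found = sorted(
                n for n in names if re.search(rf"\b{re.escape(n)}\b", text)
            )
            for pair in combinations(found, 2):
                counts[level][pair] += 1
    return counts

docs = ["CDK1 binds CCNB1. WEE1 is a kinase.\n\nWEE1 inhibits CDK1."]
counts = count_comentions(docs, ["CDK1", "CCNB1", "WEE1"])
for level in LEVELS:
    print(level, dict(counts[level]))
```

In this toy document CCNB1 and WEE1 share a paragraph but never a sentence, which is the kind of distinction the separate levels are meant to capture.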
external data
payload mechanism
extra data on nodes
colored halos
text in node popup
URL in node popup
new nodes
ncRNAs
new edges
evidence type
evidence score
text in edge popup
URL in edge popup
legend
branding with logo
you host the data
user accesses STRING
STRING gets data from you
your server must be public
restrict access to STRING
JSON configuration file
TSV data files
node data
edge data
extension node data
extension edge data
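As a rough picture of what hosting such files involves, the sketch below writes one node table, one edge table, and a configuration file. The columns follow the items listed on the slides (halo color, popup text, popup URL, evidence type, evidence score), but every column name, JSON key, file name, and value here is a hypothetical placeholder, not the actual payload schema that STRING expects.

```python
import csv
import io
import json

def to_tsv(header, rows):
    """Serialize a header and rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical node table: extra data attached to existing nodes.
node_tsv = to_tsv(
    ["identifier", "halo_color", "popup_text", "popup_url"],
    [["CDK1", "#ff0000", "placeholder annotation", "https://example.org/CDK1"]],
)

# Hypothetical edge table: new edges with an evidence type and score.
edge_tsv = to_tsv(
    ["node_a", "node_b", "evidence_type", "evidence_score", "popup_text", "popup_url"],
    [["CDK1", "CCNB1", "placeholder_evidence", 0.9,
      "placeholder edge note", "https://example.org/CDK1-CCNB1"]],
)

# Hypothetical configuration: legend, logo, and where the data files live.
config_json = json.dumps(
    {
        "legend": "Placeholder legend text",
        "logo_url": "https://example.org/logo.png",
        "node_data": "nodes.tsv",
        "edge_data": "edges.tsv",
    },
    indent=2,
)

print(node_tsv)
print(edge_tsv)
print(config_json)
```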
web services as alternative
big datasets
get only required data
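Fetching only the required data through a web service might look like the sketch below. The talk does not give the URL layout; the one used here (format and method in the path, `identifiers`, `species` and `required_score` as query parameters) is an assumption based on the public STRING API documentation at the time of writing and should be checked against the current documentation. The helper names are made up for the example, and no request is actually sent.

```python
import csv
import io
from urllib.parse import urlencode

API_ROOT = "https://string-db.org/api"  # assumed base URL

def network_url(identifiers, species, required_score=None):
    """Build a request URL for the network of the given proteins, as TSV."""
    params = {
        # Multiple identifiers are joined by a carriage return (%0D once encoded).
        "identifiers": "\r".join(identifiers),
        "species": species,
    }
    if required_score is not None:
        params["required_score"] = required_score
    return f"{API_ROOT}/tsv/network?{urlencode(params)}"

def parse_tsv(text):
    """Parse a TSV response into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

url = network_url(["CDK1", "CCNB1"], species=9606, required_score=700)
print(url)
# The response body could then be fetched, for example with
# urllib.request.urlopen(url), and passed to parse_tsv().
```

Requesting a handful of proteins with a score cutoff keeps the transfer small compared with downloading and filtering the full data files.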
questions?