The Royal Society London, May 19-21st, 2010
Mouse models for human disease
Phenotype database interoperability and integration
Damian Smedley, EBI
Why do we need data integration and interoperability?
Centralised vs distributed solutions
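[Diagram: resources feeding a portal under three architectures]
Resource categories: Genomics; IKMC projects; Phenotype/Expression; Strains
Resources shown: MGI, Ensembl, KOMP, EUCOMM, NorCOMM, TIGM, Eurexpress/GXD etc, Europhenome, JaxMice, IMSR, EMMA
Architectures compared
– Centralised warehouse v1: central database
– Centralised warehouse v2: nightly data syncs
– Distributed solution: web services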
Centralised solutions
Advantages
– Better query performance for large datasets
– Easier to analyse raw data in one location
Disadvantages
– Regular data deposition is non-trivial
– Designing a single schema to store different types of data is not simple
– Persuading people to “give up” their data/databases/websites
– Will still need to make interoperable with other data sources
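The centralised trade-offs above can be illustrated with a minimal sketch, assuming a single generic table keyed by gene identifier and a nightly job that replaces each source's rows with its latest export. The table name, column names, source names and function names are hypothetical and are not taken from any of the resources named in this talk.

```python
import sqlite3


def build_warehouse(conn: sqlite3.Connection) -> None:
    """Create one generic table that holds every data type (hypothetical schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS annotation (
               mgi_id    TEXT NOT NULL,  -- shared gene identifier
               source    TEXT NOT NULL,  -- contributing database
               data_type TEXT NOT NULL,  -- e.g. 'expression', 'phenotype'
               value     TEXT NOT NULL
           )"""
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_mgi ON annotation (mgi_id)")


def nightly_sync(conn: sqlite3.Connection, source: str, records: list) -> None:
    """Replace all rows from one source with its latest export, in one transaction."""
    with conn:
        conn.execute("DELETE FROM annotation WHERE source = ?", (source,))
        conn.executemany(
            "INSERT INTO annotation VALUES (?, ?, ?, ?)",
            [(r["mgi_id"], source, r["data_type"], r["value"]) for r in records],
        )


def query_gene(conn: sqlite3.Connection, mgi_id: str) -> list:
    """Return everything the warehouse holds for one gene, from all sources."""
    return conn.execute(
        "SELECT source, data_type, value FROM annotation "
        "WHERE mgi_id = ? ORDER BY source, data_type, value",
        (mgi_id,),
    ).fetchall()
```

A single local query answers across all sources, which is the performance advantage. The generic `value` column shows the schema difficulty: every data type has to be squeezed into one shape, and the data is only as fresh as the last sync.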
Distributed solutions
Advantages
– Domain expertise at production site exploited
– Different types of data easily integrated as long as they share something in common such as a gene identifier
– No need for nightly data flow to keep data up to date
– No need for redundant data in each database
– Easier to persuade people to collaborate in a distributed scenario
Disadvantages
– Technical knowledge required to deploy the web services
– Potential query performance problems for large datasets (may need to provide summary level data)
– Potential problems performing analysis over all datasets
– Problems with services going down
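The distributed approach above can be sketched as a lookup that asks each remote service about the same gene identifier and merges the answers, while tolerating a service that is down. This is a minimal illustration only: each service is modelled as a plain Python callable rather than a real web service, and the names `federated_lookup` and `Service` are hypothetical.

```python
from typing import Callable, Dict, List

# A service takes a shared gene identifier and returns its own records.
# In practice this would wrap a web service call; here it is any callable.
Service = Callable[[str], List[dict]]


def federated_lookup(gene_id: str, services: Dict[str, Service]) -> dict:
    """Query every service with the same gene identifier and merge the results.

    A failing service is recorded as unavailable instead of aborting the
    whole lookup, since "services going down" is an expected condition.
    """
    results: Dict[str, List[dict]] = {}
    unavailable: List[str] = []
    for name, fetch in services.items():
        try:
            results[name] = fetch(gene_id)
        except Exception:
            unavailable.append(name)
    return {
        "gene_id": gene_id,
        "results": results,
        "unavailable": sorted(unavailable),
    }
```

No data is copied or synchronised: each answer comes from the production site at query time. The cost is that total response time depends on the slowest service, which is why summary level data may be needed for large datasets.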
1000 Genomes - centralisation
International Cancer Genome Consortium
– Canada: Pancreas
– Australia: Pancreas
– China: Stomach
– Japan: Liver (virus related)
– France: Liver (alcohol-related); Breast (HER2+ve)
– UK: Breast (several subtypes)
– Spain: CLL
– India: Oral Cavity
Joint Ensembl and EurExpress query
IKMC portal: knockoutmouse.org
[Diagram: resources linked from the IKMC portal: GXD, Eurexpress, NorCOMM, EUCOMM, KOMP, TIGM, EMMA, KOMP rep, CMMR, IMSR, Ensembl, CREATE, Europhenome]
IKMC interoperability strategy
[Diagram: resources connected to BioMart query interface(s), each link labelled "MGI ID"]
– IKMC: Sanger, UK
– ES cells + lines: EMMA (UK), KOMP (USA), CMMR (Canada)
– Phenotype (EuroPhenome etc): Harwell, UK
– MGI: JAX, USA
– EURExpress: Edinburgh, UK
– Ensembl: Sanger, UK
– GXD: JAX, USA
– CREATE: EBI, UK
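The strategy above relies on every resource exposing a BioMart query interface and sharing the MGI ID as the join key. As a rough sketch, a query document in the XML format accepted by BioMart services could be assembled as follows. The helper name `build_biomart_query` is hypothetical, and the dataset, filter and attribute names in the usage note are placeholders, not names confirmed for any IKMC mart.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_biomart_query(dataset: str,
                        filters: Dict[str, str],
                        attributes: List[str]) -> str:
    """Build a BioMart-style XML query for one dataset.

    filters    -- filter name -> value (e.g. a gene identifier to restrict on)
    attributes -- column names to return, in order
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",          # tab-separated output
        "header": "0",               # no header row
        "uniqueRows": "1",           # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body
```

For example, `build_biomart_query("example_dataset", {"mgi_id": "MGI:0000001"}, ["mgi_id", "marker_symbol"])` returns a query string that restricts on one placeholder identifier and asks for two placeholder columns. Running the same filter against each resource's mart is what lets results be joined on the shared identifier.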
www.knockoutmouse.org/martsearch
Europhenome: raw and summary data
Possible strategy for phenotype data
[Diagram: resources connected to BioMart query interface(s), each link labelled "MGI ID", with a central database added for high throughput phenotyping]
– IKMC: Sanger, UK
– ES cells + lines: EMMA (UK), KOMP (USA), CMMR (Canada)
– MGI: JAX, USA
– EURExpress: Edinburgh, UK
– Ensembl: Sanger, UK
– GXD: JAX, USA
– CREATE: EBI, UK
– High throughput phenotyping: central database fed by the high throughput phenotyping centres
Central database roles
– Presentation of raw results
– Analysis to assign phenotypes to genes
Linking from IKMC portal
Phenotyping
Phenotype searches
Mouse models for human disease
Acknowledgements
The whole CASIMIR consortium and in particular:
• MouseFinder tool: Paul Schofield, Michael Gruenberger, Chao-Kung Chen, George Gkoutos, Ann-Marie Mallon, John Hancock
• MartSearch: Vivek Iyer, Darren Oakley, Bill Skarnes
• BioMart: Arek Kasprzyk, Syed Haider, Edoardo Marcora