BioSHaRE: Making data useful without direct sharing: Cafe Variome and Omics browsing - Anthony...
Making data useful without direct sharing: Cafe Variome and Omics browsing
Issues that restrict sharing data
• CANNOT: Data owners may have neither the time nor the funding to submit data manually, and/or the submission process and requirements are too complicated
• WILL NOT: Data owners receive little or no recognition or reward for releasing data, hence have little incentive to try
• MUST NOT: Data owners may have good reasons for not sharing data (ethical, legal, competitive edge)
DATA SHARING IS IMPORTANT BUT DIFFICULT!
SO DO SOMETHING ELSE (AS WELL)
Share the ‘existence’ rather than the ‘substance’ of data
This technology (or similar) sits atop or alongside existing local DBs to provide discoverability and connectivity, without replacing or altering the local solutions
Use Cases/Collaborating Networks
• Designed to be flexible for a number of use cases
• Various groups using the tool in different ways
– Rare disease (variant is the “entity”)
– Patient centric (patient is the “entity”)
– Aggregate frequency (i.e. mutation seen with a frequency of X in a population)
Café Variome Features
• Café Variome is not a database but a searchable ‘menu’
• The platform enables data custodians to specify which users can search for, display counts of, or display details of, which subsets of records and record fields, using various search parameters
• Results can be returned to users:
– as core data
– as links to data at source
– by computationally facilitating data access requests
• Federated Café Variome network
• Nodes populated with local data
• Data discovery/sharing options under control of each source
• Data remain at each source
• Search interface enables real-time data/subject discovery
• Each discovered record is reported in one of 3 ways, depending on the user's permissions and the data source settings
– Open Access: data provided
– Restricted Access: interface facilitates an email to request data, followed by data supply if/when approved
– Linked Access: no data provided, only a link to the data source
Source DB
• Each source can control which data fields are searchable and which fields are (potentially) then returned
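The three reporting modes and the per-source field controls can be illustrated with a minimal sketch. This is not Café Variome code: the `Source` class, the `discover` function, the field names, and the record values are all hypothetical, chosen only to show how one source might report a match count while revealing no more than its settings allow.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the three reporting modes named on the slide.
OPEN, RESTRICTED, LINKED = "open", "restricted", "linked"


@dataclass
class Source:
    name: str
    access: str                       # one of OPEN, RESTRICTED, LINKED
    searchable: set                   # fields users may query on
    returnable: set                   # fields that may be shown when data is provided
    url: str = ""                     # reported for linked access
    contact: str = ""                 # reported for restricted access
    records: list = field(default_factory=list)


def discover(source, field_name, value):
    """Report matches from one source without exposing more than it allows."""
    if field_name not in source.searchable:
        # The source has not opened this field to searching at all.
        return {"source": source.name, "count": 0, "note": "field not searchable"}

    hits = [r for r in source.records if r.get(field_name) == value]
    result = {"source": source.name, "count": len(hits)}
    if not hits:
        return result

    if source.access == OPEN:
        # Data provided, but only the fields the source marked as returnable.
        result["data"] = [
            {k: v for k, v in r.items() if k in source.returnable} for r in hits
        ]
    elif source.access == RESTRICTED:
        # No data; the user is pointed to an access request instead.
        result["request_access_from"] = source.contact
    elif source.access == LINKED:
        # No data; only a link to the data at source.
        result["link"] = source.url
    return result


# Made-up example record; the values are placeholders, not real data.
example_records = [{"gene": "GENE1", "variant": "var-1", "patient_id": "P1"}]
```

In this sketch the existence of a match (the count) is always reported for searchable fields, while the substance of the record is released only under open access, which mirrors the "existence rather than substance" idea stated earlier in the deck.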
Simple Query Interface
Complex Query Interface
Query Builder in action
Controlled Display of Matched Record Counts & Data (if permitted)
Phenotype Semantics
• Allow the phenotypic consequences of genetic entities to be described using public ontologies
– Many terms from many ontologies can be associated with one entity
• Allow the phenotypic consequences of genetic entities to be described using a local vocabulary or list
• Enable hierarchical viewing and querying of the phenotype ontology data
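Hierarchical querying can be sketched as expanding a query term to all of its descendant terms before matching. The toy hierarchy, the entity names, and the function names below are invented for illustration; they are not taken from any real ontology release or from the Café Variome code base.

```python
# Toy parent -> children hierarchy (hypothetical terms, for illustration only).
CHILDREN = {
    "abnormality of the eye": ["cataract", "glaucoma"],
    "cataract": ["congenital cataract"],
}

# Many terms can be attached to one entity (hypothetical annotations).
ANNOTATIONS = {
    "variant_1": ["congenital cataract"],
    "variant_2": ["glaucoma", "short stature"],
    "variant_3": ["short stature"],
}


def descendants(term, children):
    """Return every term below `term` in the hierarchy (excluding `term`)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def query_by_phenotype(term, annotations, children):
    """Find entities annotated with `term` or with any more specific term."""
    wanted = {term} | descendants(term, children)
    return sorted(
        entity for entity, terms in annotations.items() if wanted & set(terms)
    )
```

Querying the broad term "abnormality of the eye" here returns both `variant_1` and `variant_2`, even though neither is annotated with that exact term.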
Node Search Options
• Searches are performed through one nominated head node
• Searches can be performed from any node in the network
External searches
Head node
Internal searches
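The two node search options can be contrasted with a small in-process sketch. It assumes each node answers only with a match count and that "peers" are the nodes a given node is allowed to forward a query to; the `Node` class and both `connect_*` helpers are hypothetical, and a real network would exchange these queries over the web rather than through direct method calls.

```python
class Node:
    """One installation holding its own local records."""

    def __init__(self, name, records):
        self.name = name
        self.records = records
        self.peers = []  # nodes this node may forward queries to

    def local_count(self, field_name, value):
        # Data stay at the source: only a count leaves the node.
        return sum(1 for r in self.records if r.get(field_name) == value)

    def network_search(self, field_name, value):
        """Search this node and every peer it is allowed to query."""
        results = {self.name: self.local_count(field_name, value)}
        for peer in self.peers:
            results[peer.name] = peer.local_count(field_name, value)
        return results


def connect_head_node(head, others):
    """Option 1: only the nominated head node can fan a query out."""
    head.peers = list(others)
    for node in others:
        node.peers = []


def connect_full_mesh(nodes):
    """Option 2: a search can start from any node in the network."""
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]
```

With `connect_head_node`, a search started at a non-head node sees only that node's own records; with `connect_full_mesh`, every node returns counts for the whole network.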
Multiple Admin Options
• Installation wizard
• Appearance preferences, content management system and statistics reporting
• Core system settings, defining displayed and searchable fields, bulk import template configuration
• Record and source management
• User, group and record access control management
OmicsConnect:
• Enables collaborators, and (optionally) additional researchers, to view/explore 'omics' datasets
• Provides a mechanism for visual data discovery
• Provides a unified browser view of ‘omics’ data
• Separates data sharing into open, controlled and discoverable
• Copes with different data sources and formats
• Easy to set up and use
GWAS Central (www.gwascentral.org)
- comprehensive genetic association database
- aggregate data & extensive metadata
- links to data sources for primary data
E.g. visual meta-analysis: compares and contrasts 8 different studies
The Browser
OmicsConnect browser
Local Data
Remote Data
• Data Sources (DAS, GFF3, BED, wiggle, BigWig, BigBed)
• Files: FASTA, GFF3, GTF
• Access Local Databases: MySQL, SQLite
• Access Online Resources: GWAS Central, Ensembl, UCSC
• Display Data
• Simple Interface
• Use new technologies
• Low Demands on Resources
• Platform Independent
• No dependencies
OmicsConnect browser (Dalliance)
Track authentication
DAS track enabled when passphrase entered
DAS track not enabled as no passphrase entered
eDAS ‘gate keeper’
• Allows researchers to serve their own omics data in a controlled way
– Authentication (public/private)
– User accounts
• Can return the available features for a specific file/genome segment
• Intuitive interface for upload and management of data, including validation
• Stylesheets: instructions on how to format the data for viewing
• Additional feature implementations to the DAS protocol:
– ‘Types’: returns what data types exist in the DAS track
– ‘Summary’: returns a summary of the data features per segment
– ‘Search’: returns features based on a keyword given by the user
• Can be installed independently from OmicsConnect
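What the segment query and the three additional commands return can be sketched over an in-memory track. This only illustrates the behaviour described on the slide; it does not reproduce the DAS request or response format or any eDAS code, and the track contents, field names, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical track: each feature sits on a segment with a type and a label.
TRACK = [
    {"segment": "chr1", "start": 100, "end": 200, "type": "gene", "label": "alpha gene"},
    {"segment": "chr1", "start": 500, "end": 500, "type": "variant", "label": "alpha marker"},
    {"segment": "chr2", "start": 50, "end": 80, "type": "gene", "label": "beta gene"},
]


def features(track, segment, start, end):
    """Features overlapping the requested genome segment and range."""
    return [
        f for f in track
        if f["segment"] == segment and f["start"] <= end and f["end"] >= start
    ]


def types(track):
    """'Types': which data types exist in the track."""
    return sorted({f["type"] for f in track})


def summary(track):
    """'Summary': number of features per segment."""
    return dict(Counter(f["segment"] for f in track))


def search(track, keyword):
    """'Search': features whose label contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [f for f in track if needle in f["label"].lower()]
```

A 'Summary' style response lets a browser decide which segments are worth requesting in full, which is one plausible reason to add it alongside the plain feature query.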
OmicsConnect & eDAS
[Diagram labels: Other Genome Browsers; Enhanced Distributed Annotation System (eDAS); Raw Files; Local Databases; Online Resources; Remote Access; Local Access; OmicsConnect Browser]
Acknowledgements
The research leading to these results has received funding from the EC under the 7th Framework Programme (FP7/2007-2013), grant agreements 261433 (BioSHaRE-EU) and 200754 (GEN2PHEN), and from the IMI project grants 115372 (EMIF) and 115736 (EPAD)
Tim Beck, Robert Hastings, Charalambos Chrysostomou, Robert Free, Adam Webb, Owen Lancaster, Dhiwagaran Thangavelu, Colin Veal
Morris Swertz et al.
Alliance
Consortium