BioSHaRE: Making data useful without direct sharing: Cafe Variome and Omics browsing - Anthony...


Page 1: BioSHaRE: Making data useful without direct sharing: Cafe Variome and Omics browsing - Anthony Brookes - University of Leicester

Making data useful without direct sharing: Cafe Variome and Omics browsing

Page 2

Issues that restrict sharing data

• CANNOT: Data owners may have neither the time nor the funding to submit data manually, and/or the submission process and requirements are too complicated

• WILL NOT: Data owners receive little or no recognition or reward for releasing data, hence little incentive to try

• MUST NOT: Data owners may have good reasons for not sharing data (ethical, legal, competitive edge)

Page 3

DATA SHARING IS IMPORTANT BUT DIFFICULT!

Page 4

SO DO SOMETHING ELSE (AS WELL)

Page 5

Share the ‘existence’ rather than the ‘substance’ of data

This technology (or similar) sits atop or alongside existing local DBs to provide discoverability and connectivity, without replacing or altering the local solutions
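
A minimal sketch of sharing the ‘existence’ rather than the ‘substance’ of data, assuming a local SQLite table. The schema, the `discover` function and all record values are hypothetical and are not taken from Cafe Variome; the sketch only illustrates that a query can be answered with a yes/no and a count while the records themselves stay in the local database.

```python
import sqlite3


def build_local_db():
    """Create a stand-in for a data owner's local database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE variants (gene TEXT, variant TEXT, patient_id TEXT)")
    conn.executemany(
        "INSERT INTO variants VALUES (?, ?, ?)",
        [
            ("GENE-A", "var-1", "P001"),  # made-up illustration rows
            ("GENE-A", "var-2", "P002"),
            ("GENE-B", "var-3", "P003"),
        ],
    )
    return conn


def discover(conn, gene):
    """Report only whether matching records exist, and how many.

    No record content (variant, patient) leaves the local database.
    """
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM variants WHERE gene = ?", (gene,)
    ).fetchone()
    return {"gene": gene, "exists": count > 0, "count": count}


if __name__ == "__main__":
    db = build_local_db()
    print(discover(db, "GENE-A"))
    print(discover(db, "GENE-Z"))
```

The local table is never copied or altered; the discovery layer only reads from it.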

Page 6

Use Cases/Collaborating Networks

• Designed to be flexible for a number of use cases

• Various groups using the tool in different ways

– Rare disease (variant is the “entity”)

– Patient centric (patient is the “entity”)

– Aggregate frequency (i.e. a mutation seen with a frequency of X in a population)

Page 7

Café Variome Features

• Café Variome is not a database but a searchable ‘menu’

• The platform enables data custodians to specify which users can search for, display counts of, or display details of, which subsets of records and record fields, using various search parameters

• Results can be returned to users:

– as core data

– as links to data at source

– by computationally facilitating data access requests

Page 8

• Federated Café Variome network

• Nodes populated with local data

• Data discovery/sharing options under control of each source

• Data remain at each source

• Search interface enables real-time data/subject discovery

• Each discovered record is reported in one of 3 ways, dependent on the user's permissions and the data source settings:

– Open Access: data provided

– Restricted Access: interface facilitates an email to request data, followed by data supply if/when approved

– Linked Access: no data provided, only a link to the data source

Source DB

• Each source can control which data fields are searchable and which fields are (potentially) then returned
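
The three reporting modes and the per-source field control described above can be sketched as follows. This is an illustration under stated assumptions, not Café Variome's implementation: the `Source` class, the `report` function, the field names, and the example.org addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    access: str              # "open", "restricted", or "linked"
    returned_fields: tuple   # fields this source allows to be returned
    url: str                 # hypothetical link to the data at source
    contact: str             # hypothetical custodian address for requests


def report(record, source, user_approved=False):
    """Report one discovered record according to the source's access mode."""
    # Only fields the source has marked as returnable are ever exposed.
    allowed = {k: v for k, v in record.items() if k in source.returned_fields}
    if source.access == "open":
        return {"mode": "open", "data": allowed}
    if source.access == "restricted":
        if user_approved:
            return {"mode": "restricted", "data": allowed}
        # No data yet: the user is pointed at the request route instead.
        return {"mode": "restricted", "request_access_from": source.contact}
    if source.access == "linked":
        return {"mode": "linked", "link": source.url}
    raise ValueError(f"unknown access mode: {source.access!r}")


if __name__ == "__main__":
    rec = {"gene": "GENE-A", "patient_id": "P001"}
    src = Source("demo", "restricted", ("gene",),
                 "https://example.org/demo", "custodian@example.org")
    print(report(rec, src))
    print(report(rec, src, user_approved=True))
```

Keeping the field filter in one place means the same rule applies whichever mode a source chooses.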

Page 9

Simple Query Interface

Page 10

Complex Query Interface

Page 11

Query Builder in action

Page 12

Controlled Display of Matched Record Counts & Data (if permitted)

Page 13

Phenotype Semantics

• Allow the phenotypic consequences of genetic entities to be described using public ontologies

– Many terms from many ontologies can be associated with one entity

• Allow the phenotypic consequences of genetic entities to be described using a local vocabulary or list

• Enable hierarchical viewing and querying of the phenotype ontology data
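
Hierarchical querying of phenotype annotations can be sketched as expanding a query term to everything beneath it and then matching entities annotated with any of those terms. The toy hierarchy, the entity names and the function names below are assumptions made for illustration; they are not terms or identifiers from any real ontology or from Café Variome.

```python
# Toy parent -> children hierarchy (illustrative, not a real ontology).
CHILDREN = {
    "abnormality of the eye": ["cataract", "glaucoma"],
    "cataract": ["congenital cataract"],
}

# Many terms can be associated with one entity.
ANNOTATIONS = {
    "entity-1": {"congenital cataract"},
    "entity-2": {"glaucoma", "hearing loss"},
    "entity-3": {"hearing loss"},
}


def descendants(term, children=CHILDREN):
    """Return the term itself plus every term below it in the hierarchy."""
    found = {term}
    stack = [term]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def query(term, annotations=ANNOTATIONS):
    """Return entities annotated with the term or any of its descendants."""
    wanted = descendants(term)
    return sorted(e for e, terms in annotations.items() if terms & wanted)


if __name__ == "__main__":
    print(query("abnormality of the eye"))
    print(query("cataract"))
```

A query on a broad term therefore also finds entities that were annotated only with a more specific term.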

Page 14

Node Search Options

• Searches are performed through one nominated head node

• Searches can be performed from any node in the network

External searches

Head node

Internal searches
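
The two search options can be sketched with an in-memory model in which every node holds its own records and a search fans out to its peers, returning only counts. The `Node` class and its methods are hypothetical; a real network would make remote calls between installations rather than method calls, and this sketch says nothing about how Café Variome does that.

```python
class Node:
    """One installation in a federated network, holding only its local data."""

    def __init__(self, name, records):
        self.name = name
        self.records = records
        self.peers = []

    def local_count(self, gene):
        """Count matching records without exposing them."""
        return sum(1 for r in self.records if r["gene"] == gene)

    def search(self, gene):
        """Search this node and all of its peers; data remain at each source."""
        results = {self.name: self.local_count(gene)}
        for peer in self.peers:
            results[peer.name] = peer.local_count(gene)
        return results


def connect(nodes):
    """Make every node aware of every other node."""
    for node in nodes:
        node.peers = [p for p in nodes if p is not node]


if __name__ == "__main__":
    head = Node("head", [{"gene": "GENE-A"}])
    site_b = Node("site-b", [{"gene": "GENE-A"}, {"gene": "GENE-B"}])
    site_c = Node("site-c", [])
    connect([head, site_b, site_c])
    print(head.search("GENE-A"))    # option 1: via the nominated head node
    print(site_b.search("GENE-A"))  # option 2: from any node in the network
```

In this model both options return the same per-node counts; they differ only in which node receives the user's query.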

Page 15

Multiple Admin Options

• Installation wizard

• Appearance preferences, content management system and statistics reporting

• Core system settings, defining displayed and searchable fields, bulk import template configuration

• Record and source management

• User, group and record access control management

Page 16

OmicsConnect:

• Enables collaborators, and (optionally) additional researchers, to view/explore 'omics' datasets

• Provides a mechanism for visual data discovery

• Provides a unified browser view of ‘omics’ data

• Separates data sharing into open, controlled and discoverable

• Copes with different data sources and formats

• Easy to set up and use

Page 17

GWAS Central (www.gwascentral.org)

- comprehensive genetic association database

- aggregate data & extensive metadata

- links to data sources for primary data

Page 18

E.g. visual meta-analysis: compares and contrasts 8 different studies

Page 19

The Browser

Page 20

OmicsConnect browser

Local Data

Remote Data

Page 21

OmicsConnect browser (Dalliance)

• Data Sources (DAS, GFF3, BED, wiggle, BigWig, BigBed)

• Files: FASTA, GFF3, GTF

• Access Local Databases: MySQL, SQLite

• Access Online Resources: GWAS Central, Ensembl, UCSC

• Display Data

• Simple Interface

• Use new technologies

• Low Demands on Resources

• Platform Independent

• No dependencies
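
Coping with several of the formats listed above largely means normalising them to one internal representation. The sketch below does this for single BED and GFF3 lines; the `Feature` type and both parser functions are hypothetical helpers written for this illustration and are not OmicsConnect or Dalliance code. The one format detail that matters here is that BED coordinates are 0-based and half-open, while GFF3 coordinates are 1-based and inclusive.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    seqid: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


def parse_bed_line(line):
    """Parse one BED line (chrom, start, end, optional name)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 3:
        raise ValueError("a BED line needs at least 3 tab-separated columns")
    name = cols[3] if len(cols) > 3 else ""
    # BED start is 0-based, so add 1; BED end is exclusive, so it is
    # already equal to the 1-based inclusive end.
    return Feature(cols[0], int(cols[1]) + 1, int(cols[2]), name)


def parse_gff3_line(line):
    """Parse one GFF3 feature line (9 tab-separated columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("a GFF3 feature line needs 9 tab-separated columns")
    # Column 9 holds semicolon-separated tag=value attributes.
    attrs = dict(p.split("=", 1) for p in cols[8].split(";") if "=" in p)
    name = attrs.get("Name", attrs.get("ID", ""))
    return Feature(cols[0], int(cols[3]), int(cols[4]), name)


if __name__ == "__main__":
    print(parse_bed_line("chr1\t99\t200\tgeneA"))
    print(parse_gff3_line("chr1\t.\tgene\t100\t200\t.\t+\t.\tID=g1;Name=geneA"))
```

Both example lines describe the same interval, so they normalise to the same `Feature`.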

Page 22

Track authentication

DAS track enabled when passphrase entered

DAS track not enabled as no passphrase entered
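
Passphrase-gated tracks can be sketched as a check that enables a private track only when the supplied passphrase matches a stored hash. The `Track` class and its method names are assumptions made for illustration; the slides do not say how OmicsConnect stores or verifies passphrases.

```python
import hashlib
import hmac
import os


def hash_passphrase(passphrase, salt):
    """Derive a hash so the passphrase itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, 100_000)


class Track:
    """A data track that is public, or private behind a passphrase."""

    def __init__(self, name, passphrase=None):
        self.name = name
        self._salt = os.urandom(16)
        self._hash = (
            hash_passphrase(passphrase, self._salt) if passphrase is not None else None
        )

    def is_enabled(self, supplied=None):
        """Return True if the track may be displayed for this request."""
        if self._hash is None:
            return True   # public track: no passphrase needed
        if not supplied:
            return False  # private track, nothing entered
        candidate = hash_passphrase(supplied, self._salt)
        # Constant-time comparison avoids leaking how much of the hash matched.
        return hmac.compare_digest(self._hash, candidate)


if __name__ == "__main__":
    private = Track("cohort-data", passphrase="correct horse")
    print(private.is_enabled("correct horse"))
    print(private.is_enabled())
```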

Page 23

eDAS ‘gate keeper’

• Allows researchers to controllably serve their own omics data

– Authentication (public/private)

– User accounts

• Can return the available features for a specific file/genome segment

• Intuitive interface for upload and management of data, including validation

• Stylesheets: instructions on how to format the data for viewing

• Additional feature implementations to the DAS protocol:

– ‘Types’: returns what data types exist in the DAS track

– ‘Summary’: returns a summary of the data features per segment

– ‘Search’: returns features based on a keyword given by the user

• Can be installed independently from OmicsConnect
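
The three additional operations (‘Types’, ‘Summary’, ‘Search’) can be sketched over an in-memory list of features. This shows the behaviour described on the slide only; it is not the eDAS request syntax or response format, and the feature fields, the function signatures and the choice of a per-type count as the ‘summary’ are assumptions.

```python
from collections import Counter

# Made-up features standing in for the contents of one track.
FEATURES = [
    {"segment": "1", "start": 100, "end": 200, "type": "gene", "label": "alpha"},
    {"segment": "1", "start": 150, "end": 150, "type": "variant", "label": "demo-var-1"},
    {"segment": "2", "start": 10, "end": 90, "type": "gene", "label": "beta"},
]


def types(features):
    """'Types': which data types exist in the track."""
    return sorted({f["type"] for f in features})


def summary(features, segment):
    """'Summary': here, a count of features per type within one segment."""
    return dict(Counter(f["type"] for f in features if f["segment"] == segment))


def search(features, keyword):
    """'Search': features whose label contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [f for f in features if needle in f["label"].lower()]


if __name__ == "__main__":
    print(types(FEATURES))
    print(summary(FEATURES, "1"))
    print([f["label"] for f in search(FEATURES, "ALPHA")])
```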

Page 24

OmicsConnect & eDAS

[Diagram labels: Other Genome Browsers; Enhanced Distributed Annotation System (eDAS); Raw Files; Local Databases; Online Resources; Remote Access; Local Access; OmicsConnect Browser; Online Resources]

Page 25

Page 26

Acknowledgements

The research leading to these results has received funding from the EC under the 7th Framework Programme (FP7/2007-2013) grant agreements 261433 (BioSHaRE-EU) and 200754 (GEN2PHEN), and the IMI project grants 115372 (EMIF) and 115736 (EPAD).

Tim Beck, Robert Hastings, Charalambos Chrysostomou, Robert Free, Adam Webb, Owen Lancaster, Dhiwagaran Thangavelu, Colin Veal

Morris Swertz et al.

Alliance

Consortium