cBioPortal Webinar Slides (3/3)

cBioPortal: opportunities for use in a commercial setting
PISTOIA ALLIANCE WEBINAR: BUILDING A VIBRANT CBIOPORTAL COMMUNITY
JULY 22, 2015
Kees van Bochove, CEO The Hyve



Agenda

1.  Introduction of The Hyve

2.  Open Source in Translational Medicine

3.  Experiences with cBioPortal in TraIT

4.  The Hyve’s Project Approach and Services

1. INTRODUCTION

The Hyve

-  Professional support for open source software for bioinformatics and translational research, such as tranSMART, cBioPortal, i2b2, Galaxy, ADAM and OHDSI

Mission: Enable pre-competitive collaboration in life science R&D by leveraging open source software

Core values: Share, Reuse, Specialize

Office locations: Utrecht, Netherlands; Cambridge, MA, United States

Services: Software development, Data science services, Consultancy, Hosting / SLAs

Fast-growing: Started in 2012, 30 people by now

Interdisciplinary team: software engineers, data scientists, project managers & staff; expertise in bioinformatics, medical informatics, software engineering, biostatistics etc.


2. OPEN SOURCE IN TRANSLATIONAL MEDICINE

http://lanyrd.com/2015/innovation-spotlight-session

Presentation on cBioPortal by Niki Schultz was very well received by attendees from pharma companies as well as academic medical centers

Open Source

-  Source code openly accessible and reusable for everyone
-  Enables pre-competitive collaboration: both academics and industry can use and enhance it
-  Transparency: verification (scientific as well as IT security) can be done by anyone, no ‘black box’

The Open Source Definition

1. Free Redistribution
2. Availability of Source Code
3. Allow Derived Works
4. Integrity of The Author's Source Code
5. No Discrimination Against Persons or Groups
6. No Discrimination Against Fields of Endeavor
7. Distribution of License
8. License Must Not Be Specific to a Product
9. License Must Not Restrict Other Software
10. License Must Be Technology-Neutral

The software engineering process in an open source community is not different from a closed commercial setting…

But the stakeholders, contributors, business models, engagement models etc. are!

Different Non-Functional Requirements for Software

-  Bioinformatician in academia: create a novel solution for a problem which has publication value
   -  Basic research: new frontiers
   -  Software should demonstrate working principle
-  Bioinformatician / IT Services in pharma/clinic: mainly applied research
   -  Software should be well tested, maintainable, extensible, scalable etc.
   -  Need for commercial support for open source software


Open Source in Precision Medicine

[Figure: categories shown: study design, biobanking, scientific compute, data visualisation, workflow / NGS, data warehousing, imaging, clinical (eCRF & apps)]

3. CBIOPORTAL: EXPERIENCES SO FAR

Center for Translational Molecular Medicine (CTMM)

-  Public-private consortium
-  Dedicated to the development of Molecular Diagnostics and Molecular Imaging technologies
-  Focusing on the translational aspects of molecular medicine
-  120 partners
   -  universities, academic medical centers, medical technology enterprises and chemical and pharmaceutical companies
-  Budget 300 M€
-  22 projects / research consortia
-  TraIT is the Translational Research IT project supporting these projects with a joint IT infrastructure


TraIT Consortium

Growing TraIT project team


Colon cancer study in TraIT

GenePrint Visualisation (from cBioPortal) in tranSMART


TM2CBIO

-  In collaboration with Netherlands Cancer Institute
-  ETL pipeline between tranSMART and cBioPortal
-  tranSMART used as data warehouse, and cBioPortal as a study-based analytics mart for cancer studies
-  Going from individual data points (e.g. mRNA intensity levels) in tranSMART to alteration events in cBioPortal
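The step from individual data points to alteration events can be illustrated with a small sketch. It assumes a z-score cutoff per gene across samples; the function name, the threshold of 2.0 and the "UP"/"DOWN" labels are hypothetical choices for illustration and are not taken from the TM2CBIO pipeline itself.

```python
from statistics import mean, pstdev

# Assumed cutoff for illustration only; not specified in the slides.
Z_THRESHOLD = 2.0


def call_expression_events(intensities, threshold=Z_THRESHOLD):
    """Turn per-sample mRNA intensities for one gene into discrete events.

    intensities: dict mapping sample ID -> intensity (the individual data points).
    Returns a dict mapping sample ID -> "UP", "DOWN" or None (no alteration).
    """
    values = list(intensities.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over all samples

    events = {}
    for sample_id, value in intensities.items():
        if sigma == 0:
            # No variation across samples, so no sample can be called altered.
            events[sample_id] = None
            continue
        z = (value - mu) / sigma
        if z >= threshold:
            events[sample_id] = "UP"
        elif z <= -threshold:
            events[sample_id] = "DOWN"
        else:
            events[sample_id] = None
    return events
```

In such a setup the warehouse keeps the raw intensities, while the analytics mart only needs the per-sample event calls for each gene.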


tranSMART Open Source History

-  February 2012: J&J releases tranSMART as open source on GitHub under GPL v3
-  December 2012: CTMM TraIT project decides to use tranSMART as core infrastructure component
-  January 2013: IMI eTRIKS starts, uses tranSMART as core infrastructure component
-  February 2013: kickoff of tranSMART Foundation, U. Michigan publishes PostgreSQL port
-  March 2014: IMI EMIF kickoff, tranSMART is used as data integration component

4. THE HYVE – SERVICES & PROJECT APPROACH

The Hyve - cBioPortal Services

-  Software Development: tailored software development (e.g. building in new functionality, developing connectors to other software)
-  Data Services: data curation, data loading (ETL), data visualization for bioinformatics, building bioinformatics pipelines (e.g. event calling)
-  Consultancy: requirements gathering, project definition and implementation, advice on application landscape
-  Service Contracts: SLAs for application maintenance and hosting, in the cloud or on premise, data service contracts

Project Approach

Translational Research Projects


[Diagram: project cycle with four phases: Definition, Pilot, Implementation + Support, Evaluation]

Phase 1 Definition

-  Obtain overview of business and scientific processes to be supported by a central data management tool / data mart, e.g. cBioPortal
-  Identify target use cases and datasets / data sources for pilot
-  Define pilot project to support those data management needs using open source tools, demonstrating these use cases for the target data sources


Phase 2 Pilot

-  Run pilot project to evaluate the functionality of the open source tools, as a means to support data integration challenges
-  Demonstrate the capability of the open source solution to e.g.:
   -  load and integrate various clinical and omics data sources and datasets
   -  support the analysis and visualization of data
-  Automation of data loading and data quality management processes


Phase 3 Implementation & support

-  Extend the implementation of the solution by integrating data from internal data sources and/or external collaborations into a shared repository of pre-clinical, clinical and biomolecular data, to enable data mining activities (e.g. biomarker research and predictive analytics)
-  Add data from relevant public study repositories and libraries (e.g. GEO, dbGaP, TCGA, CCLE etc.) to facilitate use cases involving both public and in-house data
-  Seamless integration with researchers’ tools of choice (e.g. Spotfire, R) for visualization and analysis
-  Support for surrounding departmental processes and projects, leveraging capabilities across R&D informatics


Phase 4 Evaluation

-  Assess impact of solution based on defined goals.
-  These goals are targeted with the motto ‘Think BIG, start small’.
