The CIELO Project: Towards a Research Analytics Commons

Philip R.O. Payne, PhD, FACMI
Professor and Chair, College of Medicine, Department of Biomedical Informatics
Professor, College of Public Health, Division of Health Services Management and Policy
Associate Director for Data Sciences, Center for Clinical and Translational Science
Executive-in-residence, Office of Technology Commercialization and Knowledge Transfer

Overview

1) Background and Motivation

2) Conceptualization of CIELO

3) Functional and Technical Architecture

4) Future Directions

5) Today’s Objectives

6) Discussion

1) Background and Motivation

Critical Dimensions of a Learning Healthcare System: Systems Thinking Applied to Patient Centered Research

Environment and Culture

• Instrumenting the clinical environment

• Generating hypotheses

• Creating a culture of science and innovation

Precision Medicine

• Rapid evidence generation cycle(s)

• ‘omics’

• Analytics/decision support

Data Science

• System-level analyses

• Data science

• Visualization

• Reproducible analytics

Integrated and High Performing Healthcare Research and Delivery Systems

Learning from every patient encounter

Leveraging the best science to improve care

Identifying and solving complex problems

Rapid Translation

Our Focus!

The Need for Reproducible Research

Creating a high performing healthcare research and delivery system requires both economies of scale and increased efficiencies:
• Timeliness
• Resource utilization
• Data “liquidity”

Central to this argument is a need to exchange research findings and evidence between and among stakeholders in a consumable manner:
• Design
• Data
• Analysis

Doing so allows for reproducible research with cumulative benefits

Why is it Hard to Reproduce Research?

Data sharing alone is insufficient to this task:
• How was data pre-processed?
• What analytical workflows were utilized?
• What additional parameters influenced data analysis?
• How were results “packaged” for dissemination?

Many socio-technical barriers to addressing these questions, including:
• Intellectual property and data-level concerns
• Availability of technology platforms/tools
• Documentation
• Metadata
• Standards
• Many, many other issues…

A Community Dialogue

BD2K and the Vision for a Research Commons

Phil Bourne’s Vision (Associate Director for Data Science, NIH): “To foster an ecosystem that enables biomedical research to be conducted as a digital enterprise that enhances health, lengthens life and reduces illness and disability”

Creation of a commons providing for:
• Cloud infrastructure for data and computing
• Search
• Security
• Reproducibility standards
• App store

Source: Phil Bourne, “Ask Not What the NIH Can Do For You; Ask What You Can Do For The NIH”

2) Conceptualization of CIELO

Translating a Problem into a Solution: The Problem Definition Process

Establish the Need for a Solution

Justify the Need

Contextualize the Problem

Write the Problem Statement

Adapted from: Spradlin, “Are You Solving The Right Problem?”, HBR, September 2012

CIELO: Enabling Collaborative Data Analytics in Patient-Centered Research

Project Goals:

1) Provide members of the research community with access to an open-source/-standards “app store” for data analysis and software sharing

2) Reduce time and cost of research while enhancing the reproducibility and transparency of data analysis

3) Evolve and meet emerging community needs

Blue-sky: not grounded in the realities of the present: visionary <blue-sky thinking> (Merriam-Webster Dictionary)

Process To-Date for CIELO

Ideation

MVP Development

Stakeholder Review and Requirements Gathering

MVP Re-Engineering

Process and Outcomes Measure

Team formation and proposal development

CIELO Prototype

Review and Feedback from Stakeholders

Iterative User-Centered Design

You Are Here!

Contextualizes

CIELO Conceptual Model

3) Functional and Technical Architecture

Building on Existing Tools and Approaches

Sharing of Technical Artifacts (Github/Gitlab)
• Partitioning of access
• Bundling code and data
• Data model harmonization
• Cross-linkage (URIs/APIs)

Social Networking (Activity Feeds, Discussion Forums)
• Project-level feeds
• Linkage to metadata

Metadata (Folksonomy, Semantic Search)
• Current ontologies
• Linkage to social functions

CIELO Workflow Model

Community-Defined Requirements

• Integration with analogous platforms and tools (Ex. Sage Bionetworks Synapse)
• Incorporation of data security/confidentiality controls, particularly in the context of analyses involving PHI or similarly privileged data sets
• Convergence towards common data model for submission and reuse of data sets (Ex. OMOP)
• Multi-tiered sharing model: open access, limited access, private (for defined collaborators)
• Semantic search and discovery of code and data
• Connectivity to linked open data sets
• Social networking at a project and individual level
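As a rough illustration of the multi-tiered sharing model listed above, the three tiers can be expressed as a small access check. This is only a sketch: the `Bundle` class, the `can_view` function, and the rule that limited access means "any signed-in user" are assumptions made for illustration, not a description of how CIELO implements sharing.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Tier(Enum):
    OPEN = "open"        # anyone may view
    LIMITED = "limited"  # assumed here: any signed-in user may view
    PRIVATE = "private"  # only the owner and defined collaborators


@dataclass
class Bundle:
    """Hypothetical shared unit of code and data."""
    name: str
    owner: str
    tier: Tier
    collaborators: set = field(default_factory=set)


def can_view(bundle: Bundle, user: Optional[str]) -> bool:
    """Return True if `user` (None for an anonymous visitor) may view `bundle`."""
    if bundle.tier is Tier.OPEN:
        return True
    if user is None:
        return False
    if bundle.tier is Tier.LIMITED:
        return True
    return user == bundle.owner or user in bundle.collaborators


demo = Bundle("survival-analysis", owner="alice", tier=Tier.PRIVATE,
              collaborators={"bob"})
print(can_view(demo, "bob"))    # True: bob is a defined collaborator
print(can_view(demo, "carol"))  # False: carol is not on the list
```

Keeping the tier on the bundle rather than on the user means a project could move from private to open without touching its collaborator list.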

Community-Defined Requirements: Focus for Public Beta

• Integration with analogous platforms and tools (Ex. Sage Bionetworks Synapse)
• Incorporation of data security/confidentiality controls, particularly in the context of analyses involving PHI or similarly privileged data sets
• Convergence towards common data model for submission and reuse of data sets (Ex. OMOP)
• Multi-tiered sharing model: open access, limited access, private (for defined collaborators)
• Semantic search and discovery of code and data
• Connectivity to linked open data sets
• Social networking at a project and individual level

How Will We Evaluate CIELO?

4) Future Directions

Future Directions: Shared Execution Environment (VAULT)

5) Today’s Objectives

Meeting Objectives (1)

Provide a cross-section of stakeholders with a review of current technical functionality and design decisions surrounding the CIELO platform;

Identify needs for future functional/technical extensions to the platform, with a particular emphasis on:

1) Shared analytic tool execution environments and mechanisms

2) Data model promotion/harmonization

3) Crowd-sourced feedback and rating systems

4) Minimum standards for “bundle” population (code and example data)
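One way to picture a minimum standard for “bundle” population is a check that a bundle directory holds at least one code file and one example data file. The sketch below is illustrative only: the function name `check_bundle` and the lists of file extensions are hypothetical, and the actual standard is one of the open questions for discussion.

```python
from pathlib import Path
import tempfile

# Assumed file types; a real standard would be defined by the community.
CODE_SUFFIXES = {".py", ".r", ".sas", ".ipynb"}
DATA_SUFFIXES = {".csv", ".tsv", ".json"}


def check_bundle(bundle_dir: Path) -> list:
    """Return a list of problems; an empty list means the minimum is met."""
    suffixes = {p.suffix.lower() for p in bundle_dir.rglob("*") if p.is_file()}
    problems = []
    if not suffixes & CODE_SUFFIXES:
        problems.append("no analysis code found")
    if not suffixes & DATA_SUFFIXES:
        problems.append("no example data found")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    (bundle / "analysis.py").write_text("print('hello')\n")
    print(check_bundle(bundle))   # ['no example data found']
    (bundle / "example.csv").write_text("id,value\n1,2\n")
    print(check_bundle(bundle))   # []
```

Returning a list of problems rather than a single pass/fail flag lets a submission form report everything that is missing at once.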

Meeting Objectives (2)

Identify socio-cultural barriers and opportunities as they relate to creating and sustaining a CIELO user community;

Identify opportunities to promote and fund the ongoing development and adoption of CIELO, positioning it as a solution for enhancing research credibility and reproducibility.

Meeting Deliverables (1)

Stakeholder verification/validation of current CIELO functionality;

An enumeration and prioritization of future functional needs/requirements;

An enumeration of socio-cultural barriers and opportunities as they relate to the creation and sustainability of an adopter/adapter community;

Meeting Deliverables (2)

An enumeration of communication, advocacy, and funding targets intended to position CIELO as a solution to enhancing research credibility and reproducibility;

An enumeration of targeted end-users and their communities;

A whitepaper and project plan that formalizes all of the preceding deliverables and provides a “roadmap” for future CIELO development and dissemination efforts.

6) Discussion

“Information liberation + new incentives = rocket fuel for innovation” – Aneesh Chopra (The Advisory Board Company)

Philip R.O. Payne, PhD

“Without feedback from precise measurement, invention is doomed to be rare and erratic. With it, invention becomes commonplace” – Bill Gates (2013 Gates Foundation Annual Letter)

“Data is beyond simply quantifying, it is seeing measurement as the intervention” – Carol McCall (GNS Healthcare)