Crux: flexible, structured data reporting for funding agencies

Post on 20-Jan-2016


Crux: flexible, structured data reporting for funding agencies

The Challenge

MJFF has a comprehensive, large-scale funding program (~40 programs, ~680 funded projects)

Final reports are written documents, summarized as abstracts.

e.g., Prochaintz & Harmann (Fast track 2007):

‘… the protein Engrailed is able to rescue the neurons most affected in Parkinson’s disease …’ *

What is the data supporting this claim?

* From MJFF Website

Current state-of-the-art ‘Data Sharing’

File sharing systems provide centralized, secure access to files

But this is ‘just’ a complex shared directory.

The https://brainfu.org site *

* with apologies to the brainfu designers.

How to keep track of the science easily and efficiently?

Capturing Scientific Communication

• Two primary areas of communication breakdown

1. Experimental Procedures

2. Experimental Data

Experimental Procedures

• Poorly described at the beginning (grant proposal) and end of the process (publication)

• Difficult to replicate experiments

• Difficult to design the next experimental step

• Difficult to track changes

Experimental Data

• Only summary data available during grant assessment and after publication

• Difficult to evaluate progress during grant assessments

• Greatly diminishes utility of meta-analysis

• Results are decoupled from the supporting raw data

Data Files

• Lab Notebooks, Computer Files, Images, Spreadsheets, Text Documents, etc.

• Sometimes imported into ad-hoc summary databases

• Data decoupled from experimental procedure

Inconsistency + Evolution

• Over the course of any project:

• Procedures and data formats may change

• Original requirements may change or be refined

• The data schema will be updated to reflect this, but what happens to data in previous versions of the database?

Why?

• Why is this happening?

Capturing and communicating this information using currently available methods is burdensome and haphazard.

• Why is this a problem?

Too much critical data stays in the researcher’s head and private notebooks.

What Is Crux?

• A user-friendly tool to record all critical data at the point of funding and throughout the course of research.

• Use standard terminology to describe data
  - Precise curation using community bio-ontologies

• Formats are easily understood by scientists and funders

• Data, procedures, and experiment designs are permanently linked and stored in a unified file system.

This presentation

1. Walkthrough
  a. Crux web application
  b. Experimental design tool
  c. OBI ontology curation
  d. Designs and experimental data for Emborg + Codman
  e. Upload / download of data using spreadsheets

2. Demonstration
  a. Enter a new experimental design LIVE!

3. Overview of workload associated with model construction, data entry and ontology curation

4. Overall accomplishments

5. Preliminary description of plans for year 2

How it Works – A Walkthrough

A demonstration of experiment management with Crux

*image by webtreats

1. Dashboard

• Serves as a focal point summarizing the contents of the system

• Each experiment listed separately

• Links to:
  - metadata
  - design
  - data

2. Experimental Design

• The Experimental Design Diagram

• Protocol represented as a diagram

• Clear, complete description of the experiment

From Experimental Design

to Data Design

Codman in-vivo

Dependency between measurements and parameters is provided by tracing backwards through the protocol.

Note that the model is not complete, but it works well for simple measurements and analysis.
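As a rough illustration of tracing backwards through a protocol, the sketch below walks upstream from a measurement and collects every parameter it depends on. The protocol graph, the node names (`dose`, `infusion`, `cell_count`, and so on) and the `parameters_for` helper are all hypothetical; they are not taken from Crux or from the Codman design.

```python
from collections import deque

# Hypothetical protocol graph: each node lists the nodes that feed into it.
# Names and kinds are invented for illustration only.
PROTOCOL = {
    "dose": {"kind": "parameter", "inputs": []},
    "infusion_site": {"kind": "parameter", "inputs": []},
    "infusion": {"kind": "step", "inputs": ["dose", "infusion_site"]},
    "survival_time": {"kind": "parameter", "inputs": []},
    "sacrifice": {"kind": "step", "inputs": ["infusion", "survival_time"]},
    "cell_count": {"kind": "measurement", "inputs": ["sacrifice"]},
}


def parameters_for(measurement, protocol):
    """Walk upstream from a measurement and return every parameter reached."""
    seen = set()
    found = []
    queue = deque([measurement])
    while queue:
        node = queue.popleft()
        for upstream in protocol[node]["inputs"]:
            if upstream in seen:
                continue  # already visited through another path
            seen.add(upstream)
            if protocol[upstream]["kind"] == "parameter":
                found.append(upstream)
            queue.append(upstream)
    return sorted(found)


print(parameters_for("cell_count", PROTOCOL))
```

In this toy graph, the parameters found for `cell_count` would become the independent variables recorded alongside each measured value.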

Simple Data Gathering

• Low burden on the investigator

• Spreadsheets used as the data forms

• Files and images are uploaded into Crux with the data
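One way a spreadsheet could act as a data form is sketched below: a sheet exported as CSV is checked against the columns a design expects, and each cell is converted to its declared type. The column names, the `EXPECTED_COLUMNS` mapping and the `read_data_form` function are assumptions made for this example, not the Crux import format.

```python
import csv
import io

# Hypothetical column layout derived from a design: column name -> converter.
EXPECTED_COLUMNS = {"animal_id": str, "dose_ug": float, "cell_count": int}


def read_data_form(text, expected=EXPECTED_COLUMNS):
    """Parse CSV text and return typed rows, rejecting malformed sheets."""
    reader = csv.DictReader(io.StringIO(text))
    missing = set(expected) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    # Data rows start on line 2 because line 1 holds the header.
    for line_no, raw in enumerate(reader, start=2):
        try:
            rows.append({name: convert(raw[name]) for name, convert in expected.items()})
        except (TypeError, ValueError) as err:
            raise ValueError(f"row {line_no}: {err}") from err
    return rows


sheet = "animal_id,dose_ug,cell_count\nA1,2.5,140\nA2,5.0,95\n"
for row in read_data_form(sheet):
    print(row)
```

Rejecting a sheet as soon as a column is missing or a cell fails conversion keeps the burden on the investigator low, since problems are reported with a row number at upload time.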

Experiments in the demo

Experiments in system:

• Codman
  - in-vitro
  - in-vivo

• Emborg
  - in-vitro
  - ex-vivo
  - in-vivo

Standardized Terminology

• Professional curation produces accurate and clearly understandable experiment designs

• Using NIH-funded community repositories of curation terms enables data mining and sharing

• Crux allows a curator to associate professionally-curated terms with experimental elements
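The association between curated terms and experimental elements can be pictured with a small data structure. This is only a sketch: the class names, the `annotate` method and the term identifiers (`EX:0001`, `EX:0002`) are placeholders invented here, not real OBI identifiers and not the Crux data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OntologyTerm:
    term_id: str  # placeholder identifier, not a real OBI ID
    label: str


@dataclass
class DesignElement:
    name: str
    terms: list = field(default_factory=list)

    def annotate(self, term):
        """Attach a curated term, ignoring duplicates."""
        if term not in self.terms:
            self.terms.append(term)


def elements_with_term(elements, term_id):
    """Return the names of all elements annotated with the given term."""
    return [e.name for e in elements if any(t.term_id == term_id for t in e.terms)]


assay = OntologyTerm("EX:0001", "cell counting assay")
dose = OntologyTerm("EX:0002", "dose specification")

count_step = DesignElement("cell_count")
dose_param = DesignElement("dose_ug")
count_step.annotate(assay)
dose_param.annotate(dose)

print(elements_with_term([count_step, dose_param], "EX:0001"))
```

Because terms are shared objects rather than free text, elements from different experiments that carry the same term can be found with a single lookup, which is what makes data mining across designs feasible.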

Curation In Parallel

• Reviewers and Investigators are not burdened by the curation process

• The system gets more and more useful as the curation gets more detailed

A Complete Solution

• Clear, evolvable experimental descriptions

• Automatic database creation

• Data import through spreadsheets

• Data query, report and export
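The describe, gather and query steps listed above can be sketched with an in-memory SQLite database whose table is generated from a design description. The `DESIGN` dictionary, its column names and the helper functions are hypothetical; the slides do not state how Crux actually creates its databases.

```python
import sqlite3

# Hypothetical design: table name plus column names mapped to SQL types.
DESIGN = {
    "name": "codman_in_vivo",
    "columns": {"animal_id": "TEXT", "dose_ug": "REAL", "cell_count": "INTEGER"},
}


def create_table(conn, design):
    """Create one table whose columns mirror the design."""
    table = design["name"]
    cols = ", ".join(f'"{name}" {sql_type}' for name, sql_type in design["columns"].items())
    # Identifiers come from the trusted design, never from user-entered data.
    conn.execute(f'CREATE TABLE "{table}" ({cols})')


def insert_rows(conn, design, rows):
    """Insert dict rows, using placeholders for all values."""
    table = design["name"]
    names = list(design["columns"])
    col_list = ", ".join(f'"{n}"' for n in names)
    placeholders = ", ".join("?" for _ in names)
    conn.executemany(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})',
        [tuple(row[n] for n in names) for row in rows],
    )


conn = sqlite3.connect(":memory:")
create_table(conn, DESIGN)
insert_rows(conn, DESIGN, [
    {"animal_id": "A1", "dose_ug": 2.5, "cell_count": 140},
    {"animal_id": "A2", "dose_ug": 5.0, "cell_count": 95},
    {"animal_id": "A3", "dose_ug": 7.5, "cell_count": 60},
])
high = conn.execute(
    'SELECT animal_id FROM "codman_in_vivo" WHERE dose_ug > ? ORDER BY animal_id',
    (3.0,),
).fetchall()
print(high)
```

Deriving the table from the design means a change to the design is the single place where the storage layout is decided, rather than a schema maintained by hand.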

Describe

Gather

Query

Demo Walkthrough – From Design to Data

SYSTEM DEMONSTRATION

Demo Walkthrough – Application Flow

M.J.F.F. Programs

Experimental Programs: Discovery / Translational / Clinical / (+ some I.T. development, such as this project).

The long-term objective of this system is to enable any scientist in an executive role (program officers, journal editors, foundation executives) to do more than address the question: ‘were the goals of the project fulfilled?’

We want to provide data management, sharing and analysis so that these individuals can guide and accelerate research by

(a) gaining access to all the data

(b) performing meta-analysis and knowledge synthesis over that data.

Further Development

• The 2010 system was funded as a proof-of-concept prototype

• Future funding targets beta software for use at MJFF, the Kinetics Foundation and other agencies if possible

• Developers will work closely with beta testers at MJFF and the Kinetics Foundation

• Crux will also be developed and distributed as a capability within the Biomedical Informatics Research Network (BIRN, http://www.birncommunity.org/), a community organization building infrastructure for the biomedical community.

Year 2 Plans

• Development of a system tailored to manage a well-specified set of grants pertaining to a family of experiments conforming to a well-developed experimental design (i.e. infusion studies)

• The ‘Dashboard’ display will form the central focal point and be based on a summary of designs, the evolution of these designs and the presence of data within them

• Proposed additional features will include:
  - data visualization
  - structured coordination with professional curators
  - inter-experiment querying