Post on 20-Jan-2015
The Path to Open Science with Illustrations from Computational
Biology
Philip E. Bourne
University of California San Diego
pbourne@ucsd.edu
http://www.sdsc.edu/pb
Relevant Work from Us: http://www.sdsc.edu/pb/SummaryScholarComm.pdf
MURPHA Sept 8, 2011
My Perspective …
• Background in both IT and science (chemistry, computational biology)
• My lab. distributes for free data equivalent to ¼ the Library of Congress every month
• I am a supporter of open access (provided there is a business model) and editor in chief of PLoS Computational Biology
• I am Co-founder of SciVee Inc.
• I am becoming increasingly interested in scholarly communication
I Readily Acknowledge Each Discipline is Different
My Objective…
• To Excite You to the Changes that Are Taking Place and Get You Thinking on How You Might Participate
What is Open Science
• Open science is the idea that scientific data and knowledge of all kinds should be openly shared as early as is practical in the discovery process.
• Which implies:
– Free and unrestricted access to scientific output: ideas, data, software, the process itself, the knowledge generated …
Open Science Can Accelerate the Scientific Process…
For some people the change may be too slow to save their life
Josh Sommer – A Remarkable Young Man
Co-founder & Executive Director, the Chordoma Foundation
http://sagecongress.org/Presentations/Sommer.pdf
Chordoma
• A rare form of brain cancer
• No known drugs
• Treatment – surgical resection followed by intense radiation therapy
http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG
Adapted: http://sagecongress.org/Presentations/Sommer.pdf
If I have seen further it is only by standing on the shoulders of giants
Isaac Newton
From Josh’s point of view the climb up just takes too long
> 15 years and > $850M to be more precise
http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation
Other Reasons for Open Science
We Cannot Possibly Read a Fraction of the Papers We Should
Why Open Science
Renear & Palmer 2009 Science 325:828-832
Hence We Are Scanning More Reading Less
Renear & Palmer 2009 Science 325:828-832
We Need Tools That Can Automatically Scan the Literature
and Make Sense of It
Automatic Knowledge Discovery for Those with No Time to Read
Immunology Literature
Cardiac Disease Literature
Shared Function
Open Science Does Not Just Mean the Final Publication, But the
Scientific Process Itself
The Scientific Process
Research [Grants]
Journal Article
Conference Paper
Poster Session
Reviews
Blogs
Community Service/Data Curation
The Truth About the Scientific eLaboratory
• I generate way more negative than positive data, but where is it?
• Content management is a mess
– Slides, posters…..
– Data, lab notebooks ….
– Collaborations, Journal clubs …
• Software is open but where is it?
• Farewell is for the data too
Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 4(7): e1000136
We Need Better Tools to Manage the Scientific Enterprise
Many Great Tools Out There
We Need Scientist Management Tools
Taverna
Our Own Experiment in Capturing the Scientific Process to Make it Open
• It's hard and embarrassing
• We have a working prototype using Wings
• I can feel the potential productivity gains
• It's been a lot of fun and will enable us to improve our processes regardless of the workflow system itself
Yes The Workflow is Real
Problems with Publishing Workflows
• Workflows are not linear
• Workflow : paper is not 1:1
• Confidentiality
• Peer review
• Infrastructure
• Community acceptance
• Reward system
The Problem at this Time is There is Little Reward for Such
Activities
The Not so Hidden Truth About Science
• Scientists place more emphasis on writing and less on reading
• We are H factor obsessed, but interested in other metrics
• We are driven by (in order):
– Grants
– Papers
– Teaching
– Community service
Are There Killer Apps Out There That Could be A Game Changer for Improving Science as Well as
the Reward Process?
Data – Knowledge Integration Perhaps?
Publishing Limitations
• A paper is an artifact of a previous era
• It is not the logical end product of eScience, hence:
– Work is omitted
– Article vs supplement is a mess
– Visualization may be limited
– Interaction and enquiry are non-existent
– Rich media can help, but are rarely used
Funding Agencies Are Imposing Data Sharing Policies
• From the NSF:
• Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. See Award & Administration Guide (AAG) Chapter VI.D.4.
0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB
Here is What I Want
1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers
The Knowledge and Data Cycle
PLoS Comp. Biol. 2005 1(3) e34
Interactive PDFs etc..
Article of the Future
The Embracing of Rich Media Perhaps?
Yes YouTube Can Increase the Rate of Discovery
Unleash the full power of the Internet
Pubcast – Video Integrated with the Full Text of the Paper
Postercasts
The Semantic Web Perhaps?
Unimaginable Connections Made Automatically Through RDF Descriptions
http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html
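A minimal sketch of the idea, not taken from the talk: if a paper and a database entry are both described as subject-predicate-object statements that point at the same identifier, the link between them can be found mechanically. All identifiers below (paper:A, pdb:XXXX, protein:P1 and so on) are hypothetical placeholders, and plain Python tuples stand in for real RDF triples.

```python
from collections import defaultdict

# Hypothetical statements as (subject, predicate, object) tuples.
# Real RDF would use URIs; these short labels are illustrative only.
literature = [
    ("paper:A", "mentions", "protein:P1"),
    ("paper:B", "mentions", "protein:P2"),
]
structures = [
    ("pdb:XXXX", "describes", "protein:P1"),
    ("pdb:YYYY", "describes", "protein:P3"),
]


def link_by_shared_object(left, right):
    """Pair subjects from two statement sets that point at the same object."""
    # Index the right-hand set by object so each lookup is a dictionary access.
    index = defaultdict(list)
    for subject, _predicate, obj in right:
        index[obj].append(subject)

    links = []
    for subject, _predicate, obj in left:
        for other in index.get(obj, []):
            links.append((subject, obj, other))
    return links


links = link_by_shared_object(literature, structures)
for paper, shared, entry in links:
    print(f"{paper} and {entry} are connected through {shared}")
```

Neither data set refers to the other; the connection appears only because both use the same identifier for the protein.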
Living Documents
The Journal Has A Copy of Record that Provides a Reward
The App Model
General References
• What Do I Want from the Publisher of the Future http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000787
• Fourth Paradigm: Data Intensive Scientific Discovery http://research.microsoft.com/enus/collaboration/fourthparadigm/
What Are Your Ideas To Accelerate the Rate of Scientific
Discovery?
References to Exemplars
• Semantic Biochemical Journal - 2010: Using Utopia
• Article of the Future, Cell, 2009:
• Prospect, Royal Society of Chemistry, 2009:
• Adventures in Semantic Publishing, Oxford U, 2009:
• The Structured Digital Abstract, Seringhaus/Gerstein, 2008
• CWA Nanopublications – 2010
Questions?
pbourne@ucsd.edu