Analysis and Workflows Lesson 12: Analysis and Workflows CC image by Marc_Smith on Flickr.
Workflows and challenges
Data-driven Papers and Grand Challenges
Anita de Waard, Disruptive Technologies Director, Elsevier Labs
August 26, 2010
What is the problem?
1. Researchers can’t keep track of their data.
2. Data is not stored in a way that is easy for authors.
3. For readers, article text is not linked to the underlying data.
The Vision
Work done with Ed Hovy, Phil Bourne, Gully Burns and Cartic Ramakrishnan
1. Research: Each item in the system has metadata (including provenance) and relations to other data items added to it.
2. Workflow: All data items created in the lab are added to a (lab-owned) workflow system.
3. Authoring: A paper is written in an authoring tool which can pull data with provenance from the workflow tool in the appropriate representation into the document.
4. Editing and review: Once the co-authors agree, the paper is ‘exposed’ to the editors, who in turn expose it to reviewers. Reports are stored in the authoring/editing system, and the paper gets updated until it is validated.
5. Publishing and distribution: When a paper is published, a collection of validated information is exposed to the world. It remains connected to its related data item, and its heritage can be traced.
6. User applications: distributed applications run on this ‘exposed data’ universe.
[Figure labels: metadata (on each data item); Review, Edit, Revise; Some other publisher. Sample paper text: “Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-”]
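A minimal Python sketch of the idea behind steps 1, 2 and 5 above: every item carries metadata and links to the items it was derived from, so the heritage of a published paper can be traced back to lab data. The names `DataItem`, `WorkflowStore`, `derived_from` and the example ids are assumptions made for illustration; the slides do not specify any data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataItem:
    """One item in a (hypothetical) lab-owned workflow system."""
    item_id: str
    kind: str
    metadata: Dict[str, str] = field(default_factory=dict)
    derived_from: List[str] = field(default_factory=list)  # provenance links


class WorkflowStore:
    """Keeps items by id and answers provenance questions about them."""

    def __init__(self) -> None:
        self._items: Dict[str, DataItem] = {}

    def add(self, item: DataItem) -> None:
        self._items[item.item_id] = item

    def heritage(self, item_id: str) -> List[str]:
        """Return the ids of all ancestors, nearest first, without repeats."""
        seen: List[str] = []
        queue = list(self._items[item_id].derived_from)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            parent = self._items.get(current)
            if parent is not None:
                queue.extend(parent.derived_from)
        return seen


# Hypothetical example: a raw measurement, a figure built from it,
# and a paper that embeds the figure.
store = WorkflowStore()
store.add(DataItem("raw-1", "measurement", {"createdBy": "lab member"}))
store.add(DataItem("fig-2", "figure", {"createdBy": "lab member"}, ["raw-1"]))
store.add(DataItem("paper-1", "paper", {"status": "published"}, ["fig-2"]))

lineage = store.heritage("paper-1")
print(lineage)
```

Following the links from the paper reaches the figure first and then the raw measurement, which is the kind of "click on fig 2 to see underlying data" path the slide describes.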
What is needed to get there?
- Workflow tools: Linked-data-based workflow tools for all sciences: scalable, safe, and user-friendly
- Authoring and reviewing tools: that enable use of rich and provenance-tracked elements
- Metadata standards: Standards that allow exchange of information on any knowledge item created in a lab, including provenance/privacy/IPR rights
- Semantic/Linked Data XML repositories.
- Publishing systems that are application servers
- Social change: Scientists store, track and annotate their work.
[Slide labels: tool builders; standards bodies; publishers; institutes, funding bodies, individuals]
A. Workflow tools are emerging
http://wings.isi.edu/
http://MyExperiment.org
http://VisTrails.org
B. Authoring ‘ecosystems’: e.g., SWAN
Slide by Tim Clark
[Figure: SWAN Semantic Relationships. Node labels: person, group, publication, PDFs, MSWORD file, Excel file, hypothesis, Claim, comment, concept, gene. Relation labels: makes, hasEvidence, describes, annotates, discussedIn, authoredBy, authorOf, shareWith. Visibility labels: Private, Public.]
C. Example of Metadata: Harvard’s Annotation Ontology
Slide by Tim Clark
[Figure: RDF graph of an atomic annotation on an image. Labels on the slide: rdf:Type, foaf:person, Atomic, pav:createdOn (June 1, 2010), pav:createdBy (http://www.ht.org/foaf.rdf#me), ann:annotates (http://anyurl.com/sf_pat01.html), ann:context, onDocument, InitEndCornerSelector, ImageSelector, rdfs:SubClassOf, init (304, 507), end (380, 618), hasTag, Tag, “Linear skull fracture”, hasTopic, tag FMA:skull.]
Other annotations on the same document:
1. Atomic annotation on image (tag: “hematoma”)
2. General annotation (tag: “injury”)
Other annotations on similar documents:
1. General annotation (tag: “skull fracture”)
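To make the annotation example more concrete, here is a small Python sketch that holds the values shown on the slide in a record and flattens them into subject/predicate/object triples. It is an illustration only: the class names, the subject id `ex:annotation1`, and the pairing of each property with a value are assumptions based on a reading of the slide labels, not the Annotation Ontology's actual schema or an RDF serialization.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class CornerSelector:
    """Rectangular image region given by two opposite corners (pixels)."""
    init: Tuple[int, int]
    end: Tuple[int, int]

    def size(self) -> Tuple[int, int]:
        """Width and height of the selected region."""
        return (abs(self.end[0] - self.init[0]),
                abs(self.end[1] - self.init[1]))


@dataclass
class ImageAnnotation:
    annotates: str            # URL of the annotated document
    created_by: str           # URI identifying the annotator
    created_on: str           # date string as shown on the slide
    selector: CornerSelector  # which part of the image is meant
    tags: List[str] = field(default_factory=list)

    def to_triples(self, subject: str) -> List[Tuple[str, str, str]]:
        """Flatten the record into (subject, predicate, object) triples."""
        triples = [
            (subject, "ann:annotates", self.annotates),
            (subject, "pav:createdBy", self.created_by),
            (subject, "pav:createdOn", self.created_on),
            (subject, "ann:context",
             f"corners {self.selector.init} to {self.selector.end}"),
        ]
        triples.extend((subject, "hasTag", tag) for tag in self.tags)
        return triples


# Values copied from the slide; "ex:annotation1" is a hypothetical id.
example = ImageAnnotation(
    annotates="http://anyurl.com/sf_pat01.html",
    created_by="http://www.ht.org/foaf.rdf#me",
    created_on="June 1, 2010",
    selector=CornerSelector(init=(304, 507), end=(380, 618)),
    tags=["Linear skull fracture"],
)
triples = example.to_triples("ex:annotation1")
for triple in triples:
    print(triple)
```

Keeping the selector separate from the annotation mirrors the slide's split between what is said (tag, creator, date) and where on the image it applies.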
D. Linked Data at Elsevier
<ce:section id=#123>
this says: mice like cheese
said @anita on May 31 2010
but we all know she was jetlagged then
immutable, $$, proprietary
dynamic, personal, task-driven, open?
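One way to read the slide above is that the article XML stays fixed while remarks about a section live in a separate layer that points at the section id. The Python sketch below illustrates that reading only; `Remark`, `AnnotationLayer`, and the second remark's author are hypothetical, and nothing here describes an actual Elsevier system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Remark:
    author: str
    text: str
    said_on: Optional[str] = None  # date string, if the slide gives one


@dataclass
class AnnotationLayer:
    """Remarks stored outside the article, keyed by section id."""
    remarks: Dict[str, List[Remark]] = field(default_factory=dict)

    def add(self, section_id: str, remark: Remark) -> None:
        self.remarks.setdefault(section_id, []).append(remark)

    def about(self, section_id: str) -> List[Remark]:
        """Return a copy, so callers cannot alter the stored remarks."""
        return list(self.remarks.get(section_id, []))


layer = AnnotationLayer()
# First remark uses the values shown on the slide.
layer.add("123", Remark("@anita", "this says: mice like cheese", "May 31 2010"))
# The slide does not say who added the follow-up; the author is hypothetical.
layer.add("123", Remark("@another_reader",
                        "but we all know she was jetlagged then"))

for remark in layer.about("123"):
    print(remark.author, "|", remark.text)
```

Because the remarks only reference the section id, the article itself never has to change when someone adds or disputes a statement about it.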
F. Social Change. Some next steps:
• 2010 - 2011: Try to gather resources, current leaders, etc. for ‘Future of Research Communication’ effort
  – Fall 2010: Develop virtual community (with Harvard)
  – August 2011: Dagstuhl Workshop:
    • Involve key people (include funding bodies, libraries, institutions) to see where bottlenecks are
    • Write white paper, implement
• 2011: ICCS ‘Executable Paper Challenge’?
Scope: Tools and processes to:
- Improve the process of creating, reviewing and editing scientific content
- Interpret, visualize or connect science knowledge
- Provide tools/ideas for measuring the impact of these improvements.
June 2008: 71 submissions from 15 countries.
August 2008: 10 semi-finalist teams, access to:
- 500,000 full text articles
- Plus EMTREE, EmBase, Scopus
- Created tool/demo
- Presented to the Judges
- Wrote a paper (accepted for JWeb Semantics)
April 2009: Judges selected 4 Finalist teams.
And the winners were: