Delivering on the promise of a chemistry data repository for the world
Antony Williams
Going Native Panel Discussion at the Microsoft eScience Workshop
ORCID: 0000-0002-2668-4821
A Question to Start…
• Who in the room has an ORCID?
New Horizons….
• Let’s map together all historical chemistry data and build systems to integrate it
• Heck, let’s integrate chemistry and biology data and add in disease data too
• Let’s model the data and see if we can extract new relationships – quantitative and qualitative
• Let’s take what we learn from historical data and build better solutions for modern data
• Let’s make it all available on the web…
What about this….
• We’re going to map the world
• We’re going to take photos of as many places as we can and link them together
• We’ll let people annotate and curate the map
• Then let’s make it available free on the web
• We’ll make it available for decision making
• Put it on Mobile Devices, give it away…
Chemistry data is of value?
• Reference databases generate hundreds of millions of dollars/euros per year
• So much data generated that could go public
• Maybe 5% of all data generated is published
• There is no “Journal of Failed Experiments”
• Funding agencies start to demand Open Data
• Scientists want funding but also recognition
A shift to Openness
Open Data is here…
Chemistry data is of value?
• …so who will fund and build the platforms?
Going Native… speaka da lingo
Chemists clearly benefit from accessing data
What we found…
• Data quality on the internet can be very poor
• Everyone wants access to high quality data but very few are willing to contribute
• The primary concerns for contributors:
  • It needs to be easy
  • Data licensing
  • Recognition for contributions
Recognition: need to have Impact
Quantitating scientists?
National Information Standards Organization and “Altmetrics”
http://www.niso.org/apps/group_public/download.php/13295/niso_altmetrics_white_paper_draft_v4.pdf
Research Outputs
• Blogs
• Research datasets
• Scientific software
• Posters and presentations at conferences
• Electronic theses and dissertations
• Performances in film and audio
• Lectures, online classes and teaching activities
Recognizing Contribution
• In order to encourage participation maybe we need to provide recognition of impact
• How do we measure impact for:
  • Performing peer review?
  • Contributions to more “public platforms”?...
Christmas Curating Wikipedia
Wikipedia Chemboxes
• http://en.wikipedia.org/wiki/Glucose
Three days of discussion
• If you want to understand Wikipedia definitely Go Native and get involved!
Does ONE bond matter???
A short intro to chirality
Educating chemists in data
• Chemists are more likely to know basic HTML than the data formats used in chemistry
• Even the international standards for data interchange and standardization are often unknown to them
• Standards are ideal for computers to handle
Can we MAKE Quality Data?
• We are building systems for everyone to validate and standardize their data
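As a rough illustration of what "validate and standardize" can mean for a single identifier, the sketch below checks CAS Registry Numbers. It is not the system described in this talk: the choice of CAS numbers and the function names `standardize_cas` and `is_valid_cas` are assumptions made purely for the example. The check digit rule used is the standard one, where the final digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10.

```python
import re

# CAS Registry Numbers have the form NNNNNNN-NN-N:
# 2 to 7 digits, then 2 digits, then a single check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")

# Dash-like characters that often replace the plain hyphen in copied text.
UNICODE_DASHES = ("\u2010", "\u2011", "\u2012", "\u2013", "\u2014", "\u2212")


def standardize_cas(raw: str) -> str:
    """Standardize: trim whitespace and map dash variants to '-'."""
    cleaned = raw.strip()
    for dash in UNICODE_DASHES:
        cleaned = cleaned.replace(dash, "-")
    return cleaned


def is_valid_cas(raw: str) -> bool:
    """Validate: check the format and the check digit of a CAS number."""
    match = CAS_PATTERN.match(standardize_cas(raw))
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit


if __name__ == "__main__":
    for candidate in ["7732-18-5", " 50-99-7 ", "7732-18-4", "water"]:
        print(repr(candidate), is_valid_cas(candidate))
```

A check like this only catches transcription errors in the identifier itself. It says nothing about whether the identifier is attached to the right structure, which is the harder curation problem.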
Where to host research data?
• Containers for chemical compounds, chemical reactions, analytical data, tabular data, etc.
• Algorithms for data validation and standardization
• Domain specific search technologies
• A platform for modeling data
• Progressing the RSC Data Repository…
Compounds
Reactions
Analytical data
Generating models from data
New Horizons….are here
• Let’s map together all historical chemistry data and build systems to integrate it
• Heck, let’s integrate chemistry and biology data and add in disease data too
• Let’s model the data and see if we can extract new relationships – quantitative and qualitative
• Let’s take what we learn from historical data and build better solutions for modern data
• Let’s make it all available on the web…
So we DON’T have to do this…
ORIGINAL FIGURE
EXTRACTED FIGURE
The path forward
• Mesh and aggregate published data
• Encourage deposition of RESEARCH data – that will never be published
• Provide open APIs for data access
• Educate chemists in digital literacy
• Funding agencies should mandate data access
• Collaboration is key – don’t do it alone
Thank you
Email: [email protected]
ORCID: 0000-0002-2668-4821
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
Slides: www.slideshare.net/AntonyWilliams