Delivering on the promise of a chemistry data repository for the world

Antony Williams, Going Native Panel Discussion at the Microsoft eScience Workshop, ORCID: 0000-0002-2668-4821

description

This presentation was given as part of the Microsoft eScience panel discussion in São Paulo, Brazil. The topic of the panel was Going Native, a reference to a quote from Jim Gray along the lines of “in order to really understand the computing needs of a scientist you have to go native”. Jim himself did this, immersing himself in astronomy to build what would become the WorldWide Telescope. Bridging the gap between experimental scientists and the computing that underpins their discoveries is an ongoing challenge for eScience. The panelists explored what it means to go native, gave examples of where they have seen this work well, and shared lessons learned from working in this way.

Transcript of Delivering on the promise of a chemistry data repository for the world

Page 1: Delivering on the promise of a chemistry data repository for the world

Delivering on the promise of a chemistry data repository for the world

Antony Williams

Going Native Panel Discussion at the Microsoft eScience Workshop

0000-0002-2668-4821

Page 2: Delivering on the promise of a chemistry data repository for the world

A Question to Start…

• Who in the room has an ORCID?
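For readers who have not met an ORCID iD before: it is a 16-character identifier whose last character is a checksum computed with ISO 7064 MOD 11-2, so a mistyped iD can usually be caught before it is stored. The following is a minimal sketch of that check. The function names are illustrative and not part of any ORCID library.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the hyphenated or compact iD has a correct checksum."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15].upper()
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == check


if __name__ == "__main__":
    # The iD shown on the title slide of this talk.
    print(is_valid_orcid("0000-0002-2668-4821"))
```

A check like this only confirms that the iD is well formed. It does not confirm that the iD is registered or that it belongs to a particular person.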

Page 3: Delivering on the promise of a chemistry data repository for the world

New Horizons….

• Let’s map together all historical chemistry data and build systems to integrate it

• Heck, let’s integrate chemistry and biology data and add in disease data too

• Let’s model the data and see if we can extract new relationships – quantitative and qualitative

• Let’s take what we learn from historical data and build better solutions for modern data

• Let’s make it all available on the web…

Page 4: Delivering on the promise of a chemistry data repository for the world
Page 5: Delivering on the promise of a chemistry data repository for the world

What about this….

• We’re going to map the world

• We’re going to take photos of as many places as we can and link them together

• We’ll let people annotate and curate the map

• Then let’s make it available free on the web

• We’ll make it available for decision making

• Put it on Mobile Devices, give it away…

Page 6: Delivering on the promise of a chemistry data repository for the world

Chemistry data is of value?

• Reference databases generate hundreds of millions of dollars/euros per year

• So much data generated that could go public

• Maybe 5% of all data generated is published

• There is no “Journal of Failed Experiments”

• Funding agencies start to demand Open Data

• Scientists want funding but also recognition

Page 7: Delivering on the promise of a chemistry data repository for the world

A shift to Openness

Page 8: Delivering on the promise of a chemistry data repository for the world

Open Data is here…

Page 9: Delivering on the promise of a chemistry data repository for the world

Chemistry data is of value?

• Reference databases generate hundreds of millions of dollars/euros per year

• So much data generated that could go public

• Maybe 5% of all data generated is published

• There is no “Journal of Failed Experiments”

• Funding agencies start to demand Open Data

• Scientists want funding but also recognition

• …so who will fund and build the platforms?

Page 10: Delivering on the promise of a chemistry data repository for the world

Going Native… speaka da lingo

Chemists clearly benefit from accessing data

Page 11: Delivering on the promise of a chemistry data repository for the world
Page 12: Delivering on the promise of a chemistry data repository for the world

What we found…

• Data quality on the internet can be very poor

• Everyone wants access to high quality data but very few are willing to contribute

• The primary concerns for contributors

  • It needs to be easy

  • Data licensing

  • Recognition for contributions

Page 13: Delivering on the promise of a chemistry data repository for the world

Recognition: need to have Impact

Page 14: Delivering on the promise of a chemistry data repository for the world

Quantitating scientists?

Page 15: Delivering on the promise of a chemistry data repository for the world

National Information Standards Organization and “Altmetrics”

http://www.niso.org/apps/group_public/download.php/13295/niso_altmetrics_white_paper_draft_v4.pdf

Page 16: Delivering on the promise of a chemistry data repository for the world

Research Outputs

• Blogs

• Research datasets

• Scientific software

• Posters and presentations at conferences

• Electronic theses and dissertations

• Performances in film and audio

• Lectures, online classes and teaching activities

Page 17: Delivering on the promise of a chemistry data repository for the world

Recognizing Contribution

• In order to encourage participation maybe we need to provide recognition of impact

• How do we measure impact for:

  • Performing peer review?

  • Contributions to more “public platforms”?...

Page 18: Delivering on the promise of a chemistry data repository for the world

Christmas Curating Wikipedia

Page 19: Delivering on the promise of a chemistry data repository for the world

Wikipedia Chemboxes

• http://en.wikipedia.org/wiki/Glucose

Page 20: Delivering on the promise of a chemistry data repository for the world

Three days of discussion

Page 21: Delivering on the promise of a chemistry data repository for the world

Three days of discussion

• If you want to understand Wikipedia definitely Go Native and get involved!

Page 22: Delivering on the promise of a chemistry data repository for the world

Does ONE bond matter???

Page 23: Delivering on the promise of a chemistry data repository for the world

A short intro to chirality

Page 24: Delivering on the promise of a chemistry data repository for the world

A short intro to chirality

Page 25: Delivering on the promise of a chemistry data repository for the world

Educating chemists in data

• Chemists are more likely to know basic HTML than the data formats used in chemistry

• Even international standards for data interchange and standardization are unknown

• Standards are ideal for computers to handle

Page 26: Delivering on the promise of a chemistry data repository for the world

Can we MAKE Quality Data?

• We are building systems for everyone to validate and standardize their data
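To make "validate and standardize" concrete, here is a small sketch of the idea applied to one narrow case, a flat molecular formula. It is not the system described in the talk. It checks that a formula is well formed and rewrites it in Hill order (carbon first, then hydrogen, then the remaining elements alphabetically; fully alphabetical when there is no carbon). The element set and the function names are assumptions made for illustration only.

```python
import re
from collections import Counter

# Illustrative subset of element symbols (an assumption for this sketch);
# a real validator would use the full periodic table.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I", "Na", "K"}

# One element symbol followed by an optional count, e.g. "C6" or "Cl".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Validate a flat formula (no brackets or charges) and count its atoms."""
    counts = Counter()
    pos = 0
    while pos < len(formula):
        match = _TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos} in {formula!r}")
        symbol, digits = match.groups()
        if symbol not in KNOWN_ELEMENTS:
            raise ValueError(f"unknown element symbol {symbol!r} in {formula!r}")
        count = int(digits) if digits else 1
        if count == 0:
            raise ValueError(f"zero count for {symbol!r} in {formula!r}")
        counts[symbol] += count
        pos = match.end()
    if not counts:
        raise ValueError("empty formula")
    return counts


def hill_formula(formula: str) -> str:
    """Standardize a valid formula to Hill order."""
    counts = parse_formula(formula)
    if "C" in counts:
        rest = sorted(s for s in counts if s not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        order = sorted(counts)
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in order)


if __name__ == "__main__":
    # Glucose written in a non-standard element order.
    print(hill_formula("H12O6C6"))
```

Two depositors who write the same formula in different orders end up with the same standardized string, which is what makes records comparable. Real chemical standardization works on structures rather than formulas and is considerably more involved.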

Page 27: Delivering on the promise of a chemistry data repository for the world

Where to host research data?

• Containers for chemical compounds, chemical reactions, analytical data, tabular data, etc.

• Algorithms for data validation and standardization

• Domain specific search technologies

• A platform for modeling data

• Progressing the RSC Data Repository…

Page 28: Delivering on the promise of a chemistry data repository for the world

Compounds

Page 29: Delivering on the promise of a chemistry data repository for the world

Reactions

Page 30: Delivering on the promise of a chemistry data repository for the world

Analytical data

Page 31: Delivering on the promise of a chemistry data repository for the world

Generating models from data

Page 32: Delivering on the promise of a chemistry data repository for the world

New Horizons….are here

• Let’s map together all historical chemistry data and build systems to integrate it

• Heck, let’s integrate chemistry and biology data and add in disease data too

• Let’s model the data and see if we can extract new relationships – quantitative and qualitative

• Let’s take what we learn from historical data and build better solutions for modern data

• Let’s make it all available on the web…

Page 33: Delivering on the promise of a chemistry data repository for the world

So we DON’T have to do this…

Page 34: Delivering on the promise of a chemistry data repository for the world

ORIGINAL FIGURE

EXTRACTED FIGURE

Page 35: Delivering on the promise of a chemistry data repository for the world

The path forward

• Mesh and aggregate published data

• Encourage deposition of RESEARCH data – that will never be published

• Provide open APIs for data access

• Educate chemists in digital literacy

• Funding agencies should mandate data access

• Collaboration is key – don’t do it alone

Page 36: Delivering on the promise of a chemistry data repository for the world

Thank you

Email: [email protected]

ORCID: 0000-0002-2668-4821

Twitter: @ChemConnector

Personal Blog: www.chemconnector.com

SLIDES: www.slideshare.net/AntonyWilliams