How I failed to do Open Notebook Cheminformatics

Egon Willighagen (@egonwillighagen), Department of Bioinformatics - BiGCaT. 14 July 2014, Jean-Claude Bradley Memorial Symposium, #jcbms

description

I always believed that with Open Data, Open Source, and Open Standards (ODOSOS) I was doing the right thing; that it was enough for a better science. However, I have come to realize that these features are not enough. They certainly aid Open collaborations, though they are not sufficient even there, and they fail badly at the "scientific method." While ODOSOS makes work reproducible, it lacks the context scholars need to understand what the work solved. That is, it spells out in detail how some scientific question is answered, but not what that question was. As such, it fails to follow the established practices in scholarly research. In this presentation I will show how I should have done some of my research, and ponder the reasons why I did not.

Transcript of How I failed to do Open Notebook Cheminformatics

Page 1: How I failed to do Open Notebook Cheminformatics

How I failed to do Open Notebook Cheminformatics

Egon Willighagen (@egonwillighagen), Department of Bioinformatics - BiGCaT

14 July 2014, Jean-Claude Bradley Memorial Symposium, #jcbms

Page 2: How I failed to do Open Notebook Cheminformatics

Jean-Claude Bradley

Page 3: How I failed to do Open Notebook Cheminformatics

“Open Notebook Science”

First response: jealousy

Page 4: How I failed to do Open Notebook Cheminformatics

How I failed

• I did Open Science
  – Strong focus on reproducibility
  – Open Source, Open Data, Open Standards (ODOSOS)
• I did share notes...
• I wrote up the stories (in a blog)...

Page 5: How I failed to do Open Notebook Cheminformatics

Realization

• Scholars need notebooks
  – They need exact instructions
  – Just giving the outcome and tools is not enough
• This applies to cheminformatics too

Page 6: How I failed to do Open Notebook Cheminformatics

First notes were during education

Page 7: How I failed to do Open Notebook Cheminformatics

ODOSOS

• Software
  – The Chemistry Development Kit
    • based on CompChem, Jmol and JChemPaint
  – Bioclipse, Jmol, ...
• Data
  – Blue Obelisk Data Repository
  – RDF translations of knowledge bases
• Standards
  – eNanoMapper ...

Page 8: How I failed to do Open Notebook Cheminformatics

Scribbling...

Page 9: How I failed to do Open Notebook Cheminformatics

Scribbling...

Page 10: How I failed to do Open Notebook Cheminformatics

Scribbling...

Page 11: How I failed to do Open Notebook Cheminformatics

Why important?

• Going back to the original (raw data)
• Pedagogical effect
• Education (howto's)
• Machines care about negative data
• If we want to progress, we need to understand not just global patterns, but the fine details too

Page 12: How I failed to do Open Notebook Cheminformatics

Why cheminformatics too?

• Where is the latest RDF of solubility data? Of the melting point data?

• The trust problem applies to algorithms as much as data

• What if...

Page 13: How I failed to do Open Notebook Cheminformatics

What I have in mind...

Wikipedia, CC-BY-SA, http://en.wikipedia.org/wiki/Curtin%E2%80%93Hammett_principle

Page 14: How I failed to do Open Notebook Cheminformatics

Possible ONS of cheminformatics

• Is this set of atom types covering ChEBI?

• Can we map this metabolomics data to pathways?

• How many CAS registry numbers can I resolve for this data set?
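A question like the last one could be written down as a notebook entry that keeps the question, the method, and the raw outcome together. The following is only a minimal sketch under stated assumptions: it checks which identifiers are well-formed CAS registry numbers using the check digit, as a possible first step before any actual resolution against a lookup service (not shown). The function names, the entry layout, and the sample list are hypothetical and not taken from the talk.

```python
import re

# A CAS registry number has 2-7 digits, then 2 digits, then 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(identifier):
    """Return True if the string is well formed and its check digit matches."""
    match = CAS_PATTERN.match(identifier.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... from right to left; the sum modulo 10
    # must equal the check digit.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(digits), start=1))
    return total % 10 == check_digit


def notebook_entry(question, identifiers):
    """Hypothetical entry layout: question, method, and raw outcome together."""
    valid = [i for i in identifiers if is_valid_cas(i)]
    invalid = [i for i in identifiers if not is_valid_cas(i)]
    return {
        "question": question,
        "method": "CAS check-digit validation (no structure lookup)",
        "valid": valid,
        "invalid": invalid,  # negative results are recorded, not discarded
    }


# Hypothetical sample: water, ethanol, and one deliberately broken number.
sample = ["7732-18-5", "64-17-5", "7732-18-4"]
entry = notebook_entry("How many CAS registry numbers are well formed?", sample)
print(len(entry["valid"]), "of", len(sample), "pass the check")
```

Keeping the failed identifiers in the entry reflects the earlier point that machines care about negative data.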

Page 15: How I failed to do Open Notebook Cheminformatics

Conclusion

Some patience is needed, but I will start Open Notebook Science.

(And I will push this concept with my Bart and Cristian too.)

Page 16: How I failed to do Open Notebook Cheminformatics

Benchmarking / metrics