Beyond the PDF demo-presentation

How difficult is it to reproduce an experiment, even when the inputs, results, and outputs are available online? In this work we quantify it.

Page 1: Beyond the PDF demo-presentation

Beyond the PDF 2: Demo and Supporting Material

Daniel Garijo Verdejo

Ontology Engineering Group. Laboratorio de Inteligencia Artificial

Departamento de Inteligencia Artificial

Facultad de Informática

Universidad Politécnica de Madrid

Page 2: Beyond the PDF demo-presentation

The TB-Drugome

Have you ever heard about the TB-Drugome?

http://funsite.sdsc.edu/drugome/TB/

• Workflow presented at the previous Beyond the PDF workshop.

• Built to detect whether any approved drug could be a candidate to treat tuberculosis (TB).

• All materials used in the experiment are available on the website.

• Can it be reproduced? How much effort would it cost?

Page 3: Beyond the PDF demo-presentation

The Original Workflow

Main Functionality

Page 5: Beyond the PDF demo-presentation

Differences with reproduced results?

Original

Drug Name             Connections  Connections (solved structures)
Alitretinoin          98           14
Levothyroxine         63           14
Methotrexate          48           10
Estradiol             38           10
Rifampin              34           6
4-Hydroxytamoxifen    33           10
Amantadine            32           0
Raloxifene            28           10
Propofol              24           3
Indinavir             23           2
Ritonavir             22           7
Darunavir             22           5
Lopinavir             22           4
Penicillamine         20           5
Nelfinavir            20           3

Reproduced

Drug Name             Connections  Connections (solved structures)
Tretinoin             257          46
Levothyroxine         173          36
Methotrexate          156          32
4-Hydroxytamoxifen    115          25
Estradiol             98           20
Amantadine            79           1
Rifampin              78           13
Raloxifene            75           18
Propofol              54           5
Indinavir             51           14
Penicillamine         44           10
Daunorubicin          44           12
Triclosan             42           5
Darunavir             40           15
Desoxycorticosterone  39           12
Diethylstilbestrol    39           7
Amprenavir            38           14
Tadalafil             36           7
Pemetrexed            35           17
Lopinavir             35           10
Saquinavir            34           13
Indomethacin          34           8
Bepridil              33           14
Testosterone          32           14
Bexarotene            32           10
Imatinib              32           12
Ritonavir             32           10
Nelfinavir            28           9
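One way to make the differences on this slide concrete is to compare the two rankings programmatically. The sketch below is illustrative only: it transcribes the total connection counts from the two tables and assumes that the first table is the "Original" run and the second the "Reproduced" run, following the order of the slide labels. The function names are hypothetical and are not part of the TB-Drugome workflow.

```python
# Total connection counts transcribed from the two tables on this slide.
# Assumption: the first table is the "Original" run, the second the "Reproduced" run.
ORIGINAL = {
    "Alitretinoin": 98, "Levothyroxine": 63, "Methotrexate": 48,
    "Estradiol": 38, "Rifampin": 34, "4-Hydroxytamoxifen": 33,
    "Amantadine": 32, "Raloxifene": 28, "Propofol": 24, "Indinavir": 23,
    "Ritonavir": 22, "Darunavir": 22, "Lopinavir": 22,
    "Penicillamine": 20, "Nelfinavir": 20,
}
REPRODUCED = {
    "Tretinoin": 257, "Levothyroxine": 173, "Methotrexate": 156,
    "4-Hydroxytamoxifen": 115, "Estradiol": 98, "Amantadine": 79,
    "Rifampin": 78, "Raloxifene": 75, "Propofol": 54, "Indinavir": 51,
    "Penicillamine": 44, "Daunorubicin": 44, "Triclosan": 42,
    "Darunavir": 40, "Desoxycorticosterone": 39, "Diethylstilbestrol": 39,
    "Amprenavir": 38, "Tadalafil": 36, "Pemetrexed": 35, "Lopinavir": 35,
    "Saquinavir": 34, "Indomethacin": 34, "Bepridil": 33,
    "Testosterone": 32, "Bexarotene": 32, "Imatinib": 32,
    "Ritonavir": 32, "Nelfinavir": 28,
}


def compare(original, reproduced):
    """Return (shared, only_original, only_reproduced) drug name lists."""
    shared = [d for d in original if d in reproduced]
    only_original = [d for d in original if d not in reproduced]
    only_reproduced = [d for d in reproduced if d not in original]
    return shared, only_original, only_reproduced


def rank_shift(original, reproduced):
    """Map each shared drug to (position in original, position in reproduced).

    Positions are 1-based row numbers in the tables as transcribed above.
    """
    pos_o = {d: i for i, d in enumerate(original, start=1)}
    pos_r = {d: i for i, d in enumerate(reproduced, start=1)}
    return {d: (pos_o[d], pos_r[d]) for d in original if d in reproduced}


if __name__ == "__main__":
    shared, only_o, only_r = compare(ORIGINAL, REPRODUCED)
    print(f"shared drugs: {len(shared)}")
    print(f"only in original: {only_o}")
    print(f"only in reproduced: {len(only_r)} drugs")
    for drug, (a, b) in rank_shift(ORIGINAL, REPRODUCED).items():
        print(f"{drug}: row {a} -> row {b}")
```

Under these assumptions, 14 of the 15 drugs in the first table also appear in the second, while the second table lists 14 drugs that the first does not.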

Page 6: Beyond the PDF demo-presentation

Measuring the reproducibility effort

Task                                                 Time (days)
Familiarization with workflow and running software   20
SMAP steps                                           4
SMAP result sorter steps                             1
Merger steps                                         0.5
Get significant results                              0.5
FATCAT URL checker                                   1
FATCAT step                                          0.5
Remove significant pairs                             0.5
Create clip files                                    1
Create ideal ligands                                 1
Ideal ligand checker                                 1
Autodock Vina                                        2
Data visualization steps                             2
TOTAL                                                35
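The effort figures above can be checked with a few lines of arithmetic. The sketch below takes the per-task days from the table, confirms that they add up to the reported total, and computes the share spent on familiarization. The dictionary and function names are illustrative choices, not part of the original material.

```python
# Per-task effort in days, transcribed from the table above.
EFFORT_DAYS = {
    "Familiarization with workflow and running software": 20,
    "SMAP steps": 4,
    "SMAP result sorter steps": 1,
    "Merger steps": 0.5,
    "Get significant results": 0.5,
    "FATCAT URL checker": 1,
    "FATCAT step": 0.5,
    "Remove significant pairs": 0.5,
    "Create clip files": 1,
    "Create ideal ligands": 1,
    "Ideal ligand checker": 1,
    "Autodock Vina": 2,
    "Data visualization steps": 2,
}


def total_days(effort):
    """Sum the effort of all tasks, in days."""
    return sum(effort.values())


def share(effort, task):
    """Fraction of the total effort spent on one task."""
    return effort[task] / total_days(effort)


if __name__ == "__main__":
    total = total_days(EFFORT_DAYS)
    fam = share(EFFORT_DAYS, "Familiarization with workflow and running software")
    print(f"total: {total} days")
    print(f"familiarization: {fam:.1%} of the total")
```

The tasks sum to the 35 days reported in the table. Familiarization alone accounts for 20 of those 35 days, roughly 57% of the effort.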

Page 7: Beyond the PDF demo-presentation

Linked Data Publication

Workflow executed with the Wings workflow engine

Provenance and specification are published according to the PROV standard and the Open Provenance Model. All data are published automatically following the Linked Data principles.

The workflow is part of a corpus (LINK) which has been exposed as Linked Data, along with its provenance and specifications.

LINK: Querying the LD workflow templates

LINK: Exposing workflows as Wikipages automatically
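To give a feel for what a PROV description of one workflow step could look like, the sketch below prints a small Turtle document using core PROV-O terms (`prov:Activity`, `prov:Entity`, `prov:used`, `prov:wasGeneratedBy`, `prov:wasAssociatedWith`). It is a hand-written illustration, not the output of Wings: the `http://example.org/tb-drugome/` namespace and the resource names (`smapRun1`, `smapInput`, `smapOutput`, `wings`) are hypothetical, and the actual published data may be modelled differently.

```python
# Minimal sketch: describe one workflow step with core PROV-O terms.
# The example.org namespace and all resource names are hypothetical.
PROV = "http://www.w3.org/ns/prov#"
EX = "http://example.org/tb-drugome/"


def prov_turtle(activity, used, generated, agent):
    """Return a Turtle string linking one activity to its input, output and agent."""
    lines = [
        f"@prefix prov: <{PROV}> .",
        f"@prefix ex: <{EX}> .",
        "",
        f"ex:{agent} a prov:SoftwareAgent .",
        f"ex:{used} a prov:Entity .",
        f"ex:{activity} a prov:Activity ;",
        f"    prov:used ex:{used} ;",
        f"    prov:wasAssociatedWith ex:{agent} .",
        f"ex:{generated} a prov:Entity ;",
        f"    prov:wasGeneratedBy ex:{activity} .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(prov_turtle("smapRun1", "smapInput", "smapOutput", "wings"))
```

A description in this style records which input a step consumed, which output it produced, and which software agent carried it out, which is the kind of information needed to retrace an execution.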
