The Road to Reproducible Computational Research

Post on 16-Apr-2017


U n i v e r s i t y L O G O

Testing and Developing Tools to Promote the Reproducibility of Computational Research

Andrey Moskalenko

Center for Theoretical and Computational Materials Science

Daniel Wheeler | Faical Yannick P. Congo

Reproducible Research

• Main Areas:
• Computational
• Experimental

Table of Contents

• Context of the Project
• Simulation Management
• Sumatra and CoRR
• Benchmark Phase Field Problem
• Conclusion


Simulation Management

The Goal

Computational Research Now


Currently available tools

Workflow Tools

Wrapping Tools

Execution Control

Version Control

Robust

Command line

Web integration

Highly collaborative

Not suitable for capturing execution context

Suitable for recording stable automated executions

Provides log, search and view of execution history

Capture entire simulation context

Version environments

Collaborative

Not collaborative with current tools

Not robust or ubiquitous

Not suitable for log, search and view of history

Suitable for building pipelines of distinct tasks

Enables a clear division of tasks for non-experts

Black box design for each section of the pipeline

Monolithic in nature encouraging isolated ecosystem of tools
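To make "log, search and view of execution history" concrete, here is a minimal sketch of what an execution-control record might hold. It is not Sumatra's or CoRR's API. The names RunRecord, RunLog, run, search and to_json are hypothetical and exist only for this illustration.

```python
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class RunRecord:
    """One entry in the execution history (hypothetical structure)."""
    label: str
    parameters: dict
    tags: list = field(default_factory=list)
    started: float = field(default_factory=time.time)
    duration: float = 0.0
    outcome: str = ""


class RunLog:
    """Keeps records in memory; a real tool would persist them."""

    def __init__(self):
        self.records = []

    def run(self, label, func, parameters, tags=()):
        # Record the inputs before running, then the timing and result.
        record = RunRecord(label=label, parameters=dict(parameters),
                           tags=list(tags))
        start = time.perf_counter()
        result = func(**parameters)
        record.duration = time.perf_counter() - start
        record.outcome = repr(result)
        self.records.append(record)
        return result

    def search(self, tag):
        # "Search": return every record carrying the given tag.
        return [r for r in self.records if tag in r.tags]

    def to_json(self):
        # "View": dump the whole history in a readable form.
        return json.dumps([asdict(r) for r in self.records], indent=2)
```

A run would then be logged with something like log.run("coarse", simulate, {"dt": 1.0}, tags=["benchmark"]), where simulate stands for any user function. Note what the record leaves out: it stores parameters and results but nothing about the software environment.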


Sumatra and CoRR

- What is it good for?

- What are the limitations?


Ease of access

Record structure

User interface

Sumatra and CoRR

- What is it good for?

- What are the limitations?

- Autonomous
- Local and cloud storage
- Continuously recording
- Compatible
- Click-and-run

Sumatra and CoRR

dt = 1
equation = f()
while elapsed_time is less than desired_duration:
    result1 = equation.solve(dt = dt, solver = LinearPCG)
    result2 = equation.solve(dt = small_dt, solver = LinearPCG)
    if result1 does not meet tolerance * result2:
        decrease dt and solve again
    else:
        increase dt and solve again
Extract data
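The time-stepping loop above can be sketched as runnable Python. This version makes several assumptions that are not on the slide: it swaps the phase-field equation and the LinearPCG solver for the toy problem dy/dt = -y solved with backward Euler, it reads small_dt as two half steps of dt / 2, and the shrink and grow factors are arbitrary. It only illustrates the accept/reject logic.

```python
import math


def implicit_euler_step(y, dt, rate=1.0):
    """One backward-Euler step for the toy problem dy/dt = -rate * y."""
    return y / (1.0 + rate * dt)


def adaptive_solve(y0=1.0, duration=5.0, dt=1.0, tolerance=1e-3,
                   shrink=0.5, grow=1.2):
    """Integrate to `duration`, adapting dt by comparing two step sizes."""
    elapsed, y = 0.0, y0
    history = [(elapsed, y)]
    while elapsed < duration:
        dt = min(dt, duration - elapsed)      # do not overshoot the end
        # result1: a single step of size dt
        coarse = implicit_euler_step(y, dt)
        # result2: the same interval covered by two steps of dt / 2
        half = implicit_euler_step(y, dt / 2)
        fine = implicit_euler_step(half, dt / 2)
        if abs(coarse - fine) > tolerance * abs(fine):
            dt *= shrink                      # reject: decrease dt, retry
            continue
        elapsed += dt                         # accept the finer result
        y = fine
        history.append((elapsed, y))
        dt *= grow                            # try a larger step next
    return history
```

Calling adaptive_solve() returns (time, value) pairs, which stands in for the "Extract data" step. The final value can be compared with math.exp(-5.0) as a sanity check.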


Environment

Workflow Definition

Jupyter Notebook aka IPython Notebook

Libraries

GitHub

Cluster


Analysis – phase-field model

1 Performance evaluation

2 Test CoRR and Sumatra functionality

3 Results


Analysis – phase-field model

Results


Why is reproducibility a difficult task?

• Versions and updates
• Legality
• Hardware
• Python libraries and dependencies
• Time drain
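The "versions" and "Python libraries and dependencies" items are the ones a script can most easily track. Below is a minimal sketch, using only the Python standard library, of taking an environment snapshot and comparing two of them. The function names snapshot_environment and diff_environments are hypothetical and are not part of Sumatra or CoRR.

```python
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Record interpreter, platform and the versions of named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None             # package is not installed
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        "packages": versions,
    }


def diff_environments(old, new):
    """Return {key: (old_value, new_value)} for every entry that differs."""
    changes = {}
    for key in ("python", "implementation", "system", "machine"):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    old_pkgs = old.get("packages", {})
    new_pkgs = new.get("packages", {})
    for name in sorted(set(old_pkgs) | set(new_pkgs)):
        if old_pkgs.get(name) != new_pkgs.get(name):
            changes[name] = (old_pkgs.get(name), new_pkgs.get(name))
    return changes
```

Storing such a snapshot next to each result and diffing it before a rerun shows whether a library update or a different machine might explain a change in output. It does not address the legality or time-drain items.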


Conclusion

1 Could you reproduce our phase-field results?

2 Problem: CHiMaD benchmark problem. Solution: CoRR

3 More work to be done in both areas


Acknowledgements

1 Mentors: Daniel Wheeler, Ph.D., and Faical Yannick P. Congo, Ph.D.

2 MML Thermodynamics and Kinetics group

3 Anushka Dasgupta

4 All who made NIST SURF possible