UMich CI Days: Scaling a code in the human dimension

Scaling a Code in the Human Dimension Matthew Turk Columbia University


Presentation from 2013 UMich CyberInfrastructure Days. Abstract: As scientists’ needs for computational techniques and tools grow, they cease to be supportable by software developed in isolation. In many cases, these needs are being met by communities of practice, where software is developed by domain scientists to reach pragmatic goals and satisfy distinct and enumerable scientific goals. We describe challenges encountered by communities of scientist developers, as well as present techniques that have been successful in growing and engaging communities of practice, specifically in the yt and Enzo communities of computational astrophysics.

Transcript of UMich CI Days: Scaling a code in the human dimension

Page 1: UMich CI Days: Scaling a code in the human dimension

Scaling a Code in the Human Dimension

Matthew Turk, Columbia University

Page 2: UMich CI Days: Scaling a code in the human dimension
Page 3: UMich CI Days: Scaling a code in the human dimension

Barracuda

Page 4: UMich CI Days: Scaling a code in the human dimension

Barracuda Battery

Page 5: UMich CI Days: Scaling a code in the human dimension

Barracuda Battery

Cobra

Page 6: UMich CI Days: Scaling a code in the human dimension

Barracuda Battery

Cobra Quiver

Page 7: UMich CI Days: Scaling a code in the human dimension

Barracuda Battery

Cobra Quiver

Kangaroo

Page 8: UMich CI Days: Scaling a code in the human dimension

Barracuda Battery

Cobra Quiver

Kangaroo Mob

Page 9: UMich CI Days: Scaling a code in the human dimension

Barracuda Battery

Cobra Quiver

Kangaroo Mob

Anecdote

Page 10: UMich CI Days: Scaling a code in the human dimension
Page 11: UMich CI Days: Scaling a code in the human dimension
Page 12: UMich CI Days: Scaling a code in the human dimension
Page 13: UMich CI Days: Scaling a code in the human dimension

What is yt?

Page 14: UMich CI Days: Scaling a code in the human dimension

astro-ph/1011.3514
astro-ph/1112.4482

yt-project.org

Page 15: UMich CI Days: Scaling a code in the human dimension
Page 16: UMich CI Days: Scaling a code in the human dimension
Page 17: UMich CI Days: Scaling a code in the human dimension

There are many simulation codes.

Page 18: UMich CI Days: Scaling a code in the human dimension

There are many simulation codes, but there is only one sky.

Page 19: UMich CI Days: Scaling a code in the human dimension

Fully-Supported Semi-Supported In-Progress

Enzo, FLASH, Orion, Nyx

Raw Data, Piernik

Chombo, Athena

ART, RAMSES

GDF

Cactus, Gadget, GAMER, PENCIL

Page 20: UMich CI Days: Scaling a code in the human dimension

yt is designed to address physical, not computational, entities and questions.

Page 21: UMich CI Days: Scaling a code in the human dimension

yt is supposed to get out of the way.

Page 22: UMich CI Days: Scaling a code in the human dimension

from yt.mods import *
pf = load("galaxy0030/galaxy0030")
p = SlicePlot(pf, 2, "Density", "c", (200, "kpc"))
p.show()

Page 23: UMich CI Days: Scaling a code in the human dimension

from yt.mods import *
pf = load("galaxy0030/galaxy0030")
p = SlicePlot(pf, 2, "Density", "c", (200, "kpc"))
p.show()

pf = load("galaxy0030/galaxy0030")

Load from disk, determine IO format, parse parameters, set up mesh, initialize IO, and create geometric objects.
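The "determine IO format" step can be pictured as a registry of frontends, each of which inspects the path and either claims it or not. The following is a minimal sketch under that assumption; the class names, the `is_valid` hook, and the file-name rules are hypothetical illustrations and are not taken from yt's internals.

```python
import os


class Frontend:
    """Hypothetical base class: one subclass per simulation code."""
    name = "base"

    @classmethod
    def is_valid(cls, path):
        raise NotImplementedError


class EnzoLike(Frontend):
    name = "enzo-like"

    @classmethod
    def is_valid(cls, path):
        # Assumption for illustration: this format keeps a ".hierarchy"
        # file next to the parameter file.
        return os.path.exists(path + ".hierarchy")


class FlashLike(Frontend):
    name = "flash-like"

    @classmethod
    def is_valid(cls, path):
        # Assumption for illustration: this format is recognised by a
        # file-name pattern alone.
        return "_hdf5_plt_cnt_" in os.path.basename(path)


FRONTENDS = [EnzoLike, FlashLike]


def detect_frontend(path):
    """Return the single frontend that claims `path`, or raise ValueError."""
    matches = [f for f in FRONTENDS if f.is_valid(path)]
    if len(matches) != 1:
        raise ValueError("%d frontends matched %r" % (len(matches), path))
    return matches[0]
```

Requiring exactly one match means an ambiguous or unrecognised output fails loudly instead of being parsed with the wrong reader.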

Page 24: UMich CI Days: Scaling a code in the human dimension

from yt.mods import *
pf = load("galaxy0030/galaxy0030")
p = SlicePlot(pf, 2, "Density", "c", (200, "kpc"))
p.show()

p = SlicePlot(pf, 2, "Density", "c", (200, "kpc"))

Identify appropriate subregions of data, mask out overlapping data, convert to CGS, concatenate, pixelize, and return plot.
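Three of the steps listed above (masking overlapping data, converting to CGS, and pixelizing) can be sketched in a few lines of plain Python. This is a simplified illustration and not yt's implementation: the patch layout, the `pixelize` function, and the `density_to_cgs` factor are all assumptions made for the example.

```python
def pixelize(patches, density_to_cgs, n_pix, domain=1.0):
    """Deposit 2D patches onto an n_pix x n_pix image.

    Each patch is (x0, y0, dx, values), where values[j][i] is the cell whose
    lower-left corner is (x0 + i*dx, y0 + j*dx) in code units. Patches are
    applied from coarse to fine, so finer data overwrites (masks out) the
    coarse cells it overlaps.
    """
    image = [[None] * n_pix for _ in range(n_pix)]
    pix = domain / n_pix
    # Largest cell size first, so refined patches are written last.
    for x0, y0, dx, values in sorted(patches, key=lambda p: -p[2]):
        ny, nx = len(values), len(values[0])
        for row in range(n_pix):
            for col in range(n_pix):
                # Sample each pixel at its centre.
                x = (col + 0.5) * pix
                y = (row + 0.5) * pix
                i = int((x - x0) // dx)
                j = int((y - y0) // dx)
                if 0 <= i < nx and 0 <= j < ny:
                    # Convert from code units to CGS on the way out.
                    image[row][col] = values[j][i] * density_to_cgs
    return image
```

A coarse patch covering the whole domain plus a finer patch over one quadrant yields an image where the quadrant shows only the fine values; pixels covered by no patch remain `None`.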

Page 25: UMich CI Days: Scaling a code in the human dimension
Page 26: UMich CI Days: Scaling a code in the human dimension
Page 27: UMich CI Days: Scaling a code in the human dimension
Page 28: UMich CI Days: Scaling a code in the human dimension

10-12 g/cc

Page 29: UMich CI Days: Scaling a code in the human dimension
Page 30: UMich CI Days: Scaling a code in the human dimension

Low-Level Data Handling

Objects and Physical Quantities

Canned Analysis Tasks

Page 31: UMich CI Days: Scaling a code in the human dimension

(advanced) Low-Level Data Handling

(intermediate) Objects and Physical Quantities

(beginner) Canned Analysis Tasks
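The three layers can be illustrated with a toy example in which each layer builds only on the one beneath it. This is a sketch for illustration; the function and class names and the sample cell data are hypothetical and do not correspond to yt's API.

```python
import math


# Low-level data handling (advanced): raw per-cell records.
def read_cells():
    """Stand-in for file IO: returns (x, y, z, density, volume) tuples."""
    return [
        (0.0, 0.0, 0.0, 2.0, 0.5),
        (0.5, 0.0, 0.0, 4.0, 0.5),
        (2.0, 0.0, 0.0, 8.0, 0.5),
    ]


# Objects and physical quantities (intermediate): select by geometry.
class Sphere:
    def __init__(self, center, radius, cells):
        self.center = center
        self.radius = radius
        self._cells = cells

    def selected(self):
        """Cells whose centres lie within the sphere."""
        return [c for c in self._cells
                if math.dist(c[:3], self.center) <= self.radius]

    def total_mass(self):
        """Sum of density times volume over the selected cells."""
        return sum(rho * vol for _, _, _, rho, vol in self.selected())


# Canned analysis tasks (beginner): one call, no knowledge of the layers below.
def mass_within(center, radius):
    return Sphere(center, radius, read_cells()).total_mass()
```

A beginner calls only `mass_within`; an intermediate user builds a `Sphere` and asks it other questions; an advanced user works with the raw records directly.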

Page 32: UMich CI Days: Scaling a code in the human dimension

"Community"?

Page 33: UMich CI Days: Scaling a code in the human dimension

"Users"

Traditional View of Scientific Development

"Developers"

Page 34: UMich CI Days: Scaling a code in the human dimension

"Users" "Developers"

Most Scientific Development

Page 35: UMich CI Days: Scaling a code in the human dimension

"Devusers"

Community of Practice

Page 36: UMich CI Days: Scaling a code in the human dimension

"Developers"

Inspection and verification

Tracking modifications

Sharing information

Doing new and interesting things

Page 37: UMich CI Days: Scaling a code in the human dimension

"Users"

Uncritical acceptance of code?

Page 38: UMich CI Days: Scaling a code in the human dimension

"Users"

"These are the people we give the code to that don't care how it works."

Page 39: UMich CI Days: Scaling a code in the human dimension
Page 40: UMich CI Days: Scaling a code in the human dimension

Challenges

Page 41: UMich CI Days: Scaling a code in the human dimension

Academic Reward Structure

Page 42: UMich CI Days: Scaling a code in the human dimension

Academic Reward Structure

de facto & de jure

Page 43: UMich CI Days: Scaling a code in the human dimension

de facto & de jure

○ Utilization of developed tools

○ Respect from community

○ Project involvements

○ Invitations and opportunities to speak

Page 44: UMich CI Days: Scaling a code in the human dimension

de facto & de jure

○ Funding

○ Publications

○ Citation count

○ Influence

Page 45: UMich CI Days: Scaling a code in the human dimension

Traditional astrophysics does not favor tool builders.

Page 46: UMich CI Days: Scaling a code in the human dimension

Documentation,

testing,

outreach,

infrastructure development.

Chores

Page 47: UMich CI Days: Scaling a code in the human dimension

Chores

Tasks not fully aligned with the reward structure present great motivational challenges.

Page 48: UMich CI Days: Scaling a code in the human dimension

Co-opetition

○ Funding

○ Publications

○ Citation count

○ Influence

Page 49: UMich CI Days: Scaling a code in the human dimension

The "citation economy" for community codes is broken, and this disproportionately impacts new and junior contributors.

Page 50: UMich CI Days: Scaling a code in the human dimension

The "citation economy" for community codes is broken, and this disproportionately impacts new and junior contributors.

(It's bad for us, but even worse for infrastructure.)

Page 51: UMich CI Days: Scaling a code in the human dimension

How developer community engagement, cohesion, excitement, and energy are affected by funded improvements remains unclear.

Page 52: UMich CI Days: Scaling a code in the human dimension

Feelings of “ownership” are tricky in academia, particularly for young researchers.

Page 53: UMich CI Days: Scaling a code in the human dimension

Feelings of “ownership” are tricky in academia, particularly for young researchers.

How can progress, credit, and intellectual priority be balanced?

Page 54: UMich CI Days: Scaling a code in the human dimension

How do the relationships between distributed software developers interact with academic relationships?

Page 55: UMich CI Days: Scaling a code in the human dimension

Strategies

Page 56: UMich CI Days: Scaling a code in the human dimension

Design the community you want.

Page 57: UMich CI Days: Scaling a code in the human dimension

Design the community you want.

Diversity. Tone. Enthusiasm. Congeniality.

Page 58: UMich CI Days: Scaling a code in the human dimension

Design the community you want.

This is an investment.

Page 59: UMich CI Days: Scaling a code in the human dimension

Design the community you want.

Diversity. Tone. Enthusiasm. Congeniality.

Page 60: UMich CI Days: Scaling a code in the human dimension

Growing diversity in scientific software development is a key challenge.

Page 61: UMich CI Days: Scaling a code in the human dimension

Technical & Social

Page 62: UMich CI Days: Scaling a code in the human dimension

Technical & Social

Page 63: UMich CI Days: Scaling a code in the human dimension

Reduce barrier to entry

Test on every push

Review every changeset

Page 64: UMich CI Days: Scaling a code in the human dimension

Reduce barrier to entry

Test on every push

Review every changeset

Everything comes in the box: version control, extensions, sample data, dependencies, and tutorials.

Page 65: UMich CI Days: Scaling a code in the human dimension

[Diagram: CVCS, with one Repository serving Individuals, versus DVCS, with many Repos]

Page 66: UMich CI Days: Scaling a code in the human dimension

Reduce barrier to entry

Test on every push

Review every changeset

Shining Panda for unit tests & small data answer tests, ReadTheDocs.org, and an auto-deployed ReST blog.
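An "answer test" compares a freshly computed result against a stored reference from an earlier, trusted run. The sketch below shows the idea with a JSON file as the store; the `check_answer` function, its storage layout, and the tolerance are assumptions for illustration and are not the project's actual test harness.

```python
import json
import math


def check_answer(name, result, store_path, rel_tol=1e-12):
    """Compare `result` (a list of floats) with the stored answer for `name`.

    On the first run for a given name the result is recorded as the
    reference; later runs must reproduce it within `rel_tol`.
    """
    try:
        with open(store_path) as f:
            store = json.load(f)
    except FileNotFoundError:
        store = {}

    if name not in store:
        # No reference yet: record this result as the answer.
        store[name] = list(result)
        with open(store_path, "w") as f:
            json.dump(store, f)
        return True

    reference = store[name]
    return len(reference) == len(result) and all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(reference, result)
    )
```

Run on every push, a check like this flags changes that alter numerical results even when no unit test fails.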

Page 67: UMich CI Days: Scaling a code in the human dimension

Reduce barrier to entry

Test on every push

Review every changeset

Pull requests and mentoring of new developers, through IRC, mailing list, and code comments.

Page 68: UMich CI Days: Scaling a code in the human dimension

An upstream path...

Page 69: UMich CI Days: Scaling a code in the human dimension

Happy Application

Itch-Scratching

Pull Request Submission

Code Review & Mentoring

Participation

Page 70: UMich CI Days: Scaling a code in the human dimension

Happy Application

Itch-Scratching

Pull Request Submission

Code Review & Mentoring

Participation

Page 71: UMich CI Days: Scaling a code in the human dimension

Accept contributions of data, scripts, images, projects

Page 72: UMich CI Days: Scaling a code in the human dimension
Page 73: UMich CI Days: Scaling a code in the human dimension

Communication

All project business is conducted openly.

Page 74: UMich CI Days: Scaling a code in the human dimension

Immediate

Page 75: UMich CI Days: Scaling a code in the human dimension

Immediate

Page 76: UMich CI Days: Scaling a code in the human dimension

Immediate

Page 77: UMich CI Days: Scaling a code in the human dimension

Low-Latency

Page 78: UMich CI Days: Scaling a code in the human dimension

High-Latency

Page 79: UMich CI Days: Scaling a code in the human dimension

Technical & Social

Page 80: UMich CI Days: Scaling a code in the human dimension

Culture self-propagates. So, it must be seeded directly.

Page 81: UMich CI Days: Scaling a code in the human dimension

The traditional "Open Source" community is not always the right role model.

Page 82: UMich CI Days: Scaling a code in the human dimension

The traditional "Open Source" community is not always the right role model.

Diversity. Tone. Enthusiasm. Congeniality.

Page 83: UMich CI Days: Scaling a code in the human dimension

Foster a community of peers, not a community of elites.

Page 84: UMich CI Days: Scaling a code in the human dimension

H

R

T

Page 85: UMich CI Days: Scaling a code in the human dimension

H

R

T

Fitzpatrick & Collins-Sussman

Page 86: UMich CI Days: Scaling a code in the human dimension

Humility

R

T

Page 87: UMich CI Days: Scaling a code in the human dimension

Humility

Respect

T

Page 88: UMich CI Days: Scaling a code in the human dimension

Humility

Respect

Trust

Page 89: UMich CI Days: Scaling a code in the human dimension

Humility

Page 90: UMich CI Days: Scaling a code in the human dimension

I think there might be a bug in ...

It's like that for a good reason. Don't touch it.

Page 91: UMich CI Days: Scaling a code in the human dimension

I think there might be a bug in ...

It behaves that way because ...

Page 92: UMich CI Days: Scaling a code in the human dimension

Respect

Page 93: UMich CI Days: Scaling a code in the human dimension

I've noticed something is acting strangely with ...

You're probably doing it wrong.

Page 94: UMich CI Days: Scaling a code in the human dimension

I've noticed something is acting strangely with ...

Can you tell us how you'd expect it to act?

Page 95: UMich CI Days: Scaling a code in the human dimension

Trust

Page 96: UMich CI Days: Scaling a code in the human dimension

Letting go...

Page 97: UMich CI Days: Scaling a code in the human dimension

By emphasizing pride over ownership, we've tried to help projects move between people without smothering through control.

Page 98: UMich CI Days: Scaling a code in the human dimension

By emphasizing pride over ownership, we've tried to help projects move between people without smothering through control.

This is really difficult.

Page 99: UMich CI Days: Scaling a code in the human dimension

Successes

Page 100: UMich CI Days: Scaling a code in the human dimension
Page 101: UMich CI Days: Scaling a code in the human dimension
Page 102: UMich CI Days: Scaling a code in the human dimension
Page 103: UMich CI Days: Scaling a code in the human dimension

Developed by working astrophysicists.

Page 104: UMich CI Days: Scaling a code in the human dimension

Developed by working astrophysicists.

!

Page 105: UMich CI Days: Scaling a code in the human dimension
Page 106: UMich CI Days: Scaling a code in the human dimension
Page 107: UMich CI Days: Scaling a code in the human dimension

Usage on XSEDE Nautilus

Szczepanski et al., 2012

Page 108: UMich CI Days: Scaling a code in the human dimension
Page 109: UMich CI Days: Scaling a code in the human dimension
Page 110: UMich CI Days: Scaling a code in the human dimension

Moving beyond computational astrophysics.

Page 111: UMich CI Days: Scaling a code in the human dimension

Moving beyond astrophysics.

Page 112: UMich CI Days: Scaling a code in the human dimension

Our first major litmus test, a 3.0 release with major infrastructure overhauls, approaches.

Page 113: UMich CI Days: Scaling a code in the human dimension

"... it seems likely that significant software contributions to existing scientific software projects are not likely to be rewarded through the traditional reputation economy of science. Together these factors provide a reason to expect the over-production of independent scientific software packages, and the underproduction of collaborative projects in which later academics build on the work of earlier ones."

Howison & Herbsleb (2011)

Page 114: UMich CI Days: Scaling a code in the human dimension

First Workshop on Sustainable Software for Science: Practices and Experiences

SC13: This Sunday!

wssspe.researchcomputing.org.uk

Page 115: UMich CI Days: Scaling a code in the human dimension

Thank you.

Page 116: UMich CI Days: Scaling a code in the human dimension

(Short) Bibliography

"What Makes Computational Open Source Libraries Successful?" by Bangerth & Heister

"The Art of Community" by Jono Bacon

"Producing Open Source Software" by Karl Fogel

"Team Geek" by Brian Fitzpatrick & Ben Collins-Sussman

"Organizing Simulation Code Collectives" by Mikaela Sundberg

"Scientific Software Production" by James Howison & James Herbsleb

"Your Community is your Best Feature" by Gina Trapani

"The Proof of the Pudding" by John Allsopp

"Standing Out in the Crowd" by Skud

"Why you should write buggy software" by Brian Granger

Page 117: UMich CI Days: Scaling a code in the human dimension

Tom Abel, Kenza Arraki, David Collins, Brian Crosby, Andrew Cunningham, Hilary Egan, Nathan Goldbaum, William Grey, Markus Haider, Cameron Hummels, Christian Karch, Steffen Klemer, Kacper Kowalik, Mike Kuhlen, Eve Lee, Sam Leitner, Yuan Li, Chris Malone, Josh Moloney, Chris Moody, Andrew Myers, Jill Naiman, Kaylea Nelson, Jeff Oishi, Jean-Claude Passy, Mark Richardson, Thomas Robitaille, Anna Rosen, Doug Rudd, Anthony Scopatz, Devin Silvia, Sam Skillman, Stephen Skory, Britton Smith, Geoffrey So, Casey Stark, Elizabeth Tasker, Stephanie Tonnesen, Sebastian Trujillo-Gomez, Matthew Turk, Rick Wagner, Andrew Wetzel, John Wise, John ZuHone