ICSME 2016 keynote: An ecosystemic and socio-technical view on software maintenance and evolution

An Ecosystemic and Socio-Technical View on Software Maintenance & Evolution
Tom Mens (@tom_mens)
COMPLEXYS Research Institute, University of Mons, Belgium


Page 2
Page 3
Page 4

Until 1999: PhD @VUB
1999-2003: Postdoc @VUB
2003-now: (full) professor

Page 5

Research topics (timeline figure):
- OO design & refactoring
- MDSE, model transformation
- Empirical research of software ecosystems
Timeline labels: 1994-2004, 1998-2004, 2004, 2008, 2010-now

Page 6

Research Collaborators

Page 7

Research Context

2012-2017 ongoing research project “Ecological Studies of Open Source Software Ecosystems”

- Interdisciplinary research
- Use ideas from biological ecology to understand and improve evolution of software ecosystems

A software ecosystem is a collection of software projects that are developed and evolve together in the same environment.
Mircea Lungu (PhD, 2008)

Page 8

Software Ecosystem Examples

Gnome, CRAN, Debian, Ubuntu, KDE, JavaScript, Ruby

Page 9

When things go wrong…

Page 10

CRAN

Credits: http://www.designandanalytics.com/cran-gephi/

Package dependency graph

> 9K active packages, > 21K dependencies in April 2016

Page 11

CRAN

• Increasing number of R packages hosted on GitHub
  “non-transparent nature of the CRAN submission / rejection process”
  “CRAN […] is revealing some limitations of the current design. One such problem is the general lack of dependency versioning in the infrastructure.”

• Problems with breaking dependencies
  “It is more and more of a pain if the package I’m depending on breaks”
  “One recent example was the forced roll-back of the ggplot2 update to version 0.9.0, because the introduced changes caused several other packages to break.”

Decan et al. “When GitHub Meets CRAN: An Analysis of Inter-Repository Package Dependency Problems.” SANER 2016

Page 12

JavaScript
> 317K packages, > 728K dependencies in June 2016

Page 13

JavaScript

• Deliberate desire to distribute micropackages
• Lots of dependencies on micropackages

Example: isarray (150 direct, 77K transitive in-deps in Aug 2016)

    var toString = {}.toString;
    module.exports = Array.isArray || function (arr) {
      return toString.call(arr) == '[object Array]';
    };

David Haney’s code blog, March 2016
http://www.haneycodes.net/npm-left-pad-have-we-forgotten-how-to-program/

Page 14

Example: leftpad

Page 15

Example: leftpad

• Package leftpad

    function leftpad (str, len, ch) {
      str = String(str);
      var i = -1;
      if (!ch && ch !== 0) ch = ' ';
      len = len - str.length;
      while (++i < len) { str = ch + str; }
      return str;
    }

• What happened?
  – Its developer unpublished all his modules from npm

“This impacted many thousands of projects. [...] We began observing hundreds of failures per minute, as dependent projects – and their dependents, and their dependents... – all failed when requesting the now-unpublished package.”

http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm

Page 16

Departure of a central contributor

• All bug handling became concentrated in 1 contributor
• Contributor suddenly left project, being dissatisfied
• Lasting negative impact on bug handling performance

Zanetti et al. “The rise and fall of a central contributor: Dynamics of social organization and performance in the Gentoo community.” CHASE 2013

Page 17

Strict policy and tools for ensuring backward compatibility
• “Prime Directive: When evolving the Component API from release to release, do not break existing clients”

Bogart et al. “How to break an API: Cost negotiation and community values in three software ecosystems.” FSE 2016

Page 18

May lead to stagnation and drive away developers
– Coordination around synchronized yearly releases slows down development

“If you have hip things, then you get people who create new APIs on top of that […] These things don’t happen on the Eclipse platform anymore.”
“you have to be very patient and know who to talk with […] in order to get your patches accepted, and I think it’s very intimidating for some new people to come on.”

Bogart et al. “How to break an API: Cost negotiation and community values in three software ecosystems.” FSE 2016

Page 19

Socio-Technical View

• Software ecosystems suffer from problems because of technical factors, social reasons, or both.

• A socio-technical view is therefore essential for software ecosystem evolution research.

Page 20

Socio-Technical View

• Socio-technical analyses can benefit from mixed method research
  – Combine quantitative and qualitative methods into a single study
    • Empirical analysis of objective data
    • User surveys and interviews
  – Exploiting their complementarity increases confidence in the findings

Johnson et al. Mixed methods research: A research paradigm whose time has come. Educational Researcher 33(7): 14–26, 2004

Page 21

Software Ecosystem (SECO) Research Challenges

Understanding SECOs
• How are SECOs structured?
• What are their tools, habits, values, boundaries?
• How do they emerge and evolve over time?
• What are the mechanisms driving their dynamics?
• How do different SECOs compare?
• How to face technical challenges?

Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015

Page 22

Software Ecosystem Research Challenges

Supporting SECO communities
• How can they be made more sustainable and resilient?
• How can we predict their evolution?
• How can we improve the SECO?
  – In terms of productivity, quality, diversity, maintainability, survival, popularity, …

Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015

Page 23

Supporting SECOs: Increasing resilience & sustainability

Can the SECO
• resist major disturbances?
• return to a stable equilibrium after a major disturbance?

Possible approach:
• Estimate, predict and reduce the risk associated with a low bus factor

Page 24

Bus factor: Social view

Specific activity concentrated in few persons.
Examples:
– A single person responsible for bug handling in Gentoo
– Only one developer knows some part of the code
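
The social view above can be illustrated with a rough sketch. It is a simplified greedy heuristic, not the algorithm of any tool or paper cited in this talk: it counts how many of the most "loaded" contributors would have to leave before more than half of the files lose their main owner. The `file_owner` mapping, the 50% threshold and all names are assumptions made for illustration.

```python
from collections import Counter


def bus_factor(file_owner, threshold=0.5):
    """Greedy estimate: how many top owners must leave before more than
    `threshold` of the files have lost their (single) main owner."""
    total = len(file_owner)
    if total == 0:
        return 0
    # Number of files each contributor is the main owner of.
    owned = Counter(file_owner.values())
    orphaned = 0
    removed = 0
    # Remove contributors from the most to the least "loaded".
    for _, count in owned.most_common():
        orphaned += count
        removed += 1
        if orphaned / total > threshold:
            break
    return removed


# Hypothetical ownership data: file -> main contributor.
files = {
    "bugs/triage.py": "alice",
    "bugs/assign.py": "alice",
    "bugs/close.py": "alice",
    "docs/readme.md": "bob",
    "build/make.py": "carol",
}
print(bus_factor(files))  # alice alone owns 3 of 5 files, so this prints 1
```

A result of 1 corresponds to the Gentoo situation described earlier: a single departure is enough to orphan most of the activity.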

Page 25

Bus factor: Technical view

Too many software components depend on a single software component.
– Makes components more brittle to future changes
– npm leftpad example
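
To make the technical view concrete, the sketch below computes the set of packages that would be affected, directly or transitively, if one package disappeared. It is a minimal illustration on a made-up registry; the package names and the `dependencies` mapping are hypothetical and do not come from npm data.

```python
from collections import defaultdict, deque


def transitive_dependents(dependencies, target):
    """Return every package that depends on `target`, directly or indirectly.

    `dependencies` maps a package to the packages it depends on.
    """
    # Invert the graph: package -> packages that depend on it directly.
    dependents = defaultdict(set)
    for pkg, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(pkg)

    seen = set()
    queue = deque([target])
    while queue:  # breadth-first traversal of the inverted graph
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in seen and parent != target:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical miniature registry (package names are made up).
registry = {
    "pad-util": [],
    "text-format": ["pad-util"],
    "cli-table": ["text-format"],
    "build-tool": ["cli-table", "pad-util"],
    "logger": [],
}
impacted = transitive_dependents(registry, "pad-util")
print(sorted(impacted))  # ['build-tool', 'cli-table', 'text-format']
```

The size of this set is one possible proxy for the "transitive in-deps" figure quoted for isarray.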

Page 26

Bus factor

Active area of research

At least 4 GitHub projects compute (social) bus factor.

Cosentino et al. “Assessing the bus factor of Git repositories.” SANER 2015

Avelino et al. “A novel approach for estimating truck factors.” ICPC 2016

Page 27

Bus factor

Experimental support on GitHub
https://libraries.io/bus-factor

Page 28

Bus factor

https://dependencyci.com

Page 29

Supporting SECOs: Improving quality

By increasing technical wealth through reducing technical debt

“a concept in programming that reflects the extra development work that arises when code that is easy to implement in the short run is used instead of applying the best overall solution” (Ward Cunningham, 1992)

http://legacycoderocks.libsyn.com/technical-wealth-with-declan-wheelan

Page 30

Implementation of SQALE model in SonarQube

Page 31

Supporting SECOs: Improving quality

Social view: Reducing social debt
“Unforeseen project cost connected to sub-optimal organizational-social structures”

Page 32

Supporting SECOs: Improving quality

Reducing social debt by removing community smells
– Organisational silo
  • High decoupling and lack of communication between tasks
– Black cloud
  • Lack of people able to bridge the knowledge and experience gap between distinct communities
– Prima-donnas
  • Seemingly condescending and egotistical behaviour, irreceptiveness to collaboration
– Sharing villainy
  • Lack of knowledge exchange incentives
– Organisational skirmish
  • Misalignment of organisational cultures between distinct communities
– …

Page 33

Interdisciplinary research

“Many challenges we face are not solvable by people remaining in their single discipline silos”…

www.newscientist.com/article/mg20928002-100-open-your-mind-to-interdisciplinary-research/

Page 34

Interdisciplinary research

“bringing […] disciplines together in the long term is what provides the big, big breakthroughs”

Page 35

Interdisciplinary research: Social Network Analysis (SNA)

Page 36

Social Network Analysis

Social network centrality measures

Degree: Number of in- or outgoing dependencies of a node.

Betweenness: Quantifies the number of times a node acts as a bridge along the shortest path between two other nodes.

Closeness: The more central a node, the lower its total distance from all other nodes.

Eigenvector centrality and PageRank: Measure the influence of a node in a network.
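
Two of the measures above are simple enough to sketch with the standard library only. The code below computes degree and closeness for a small undirected, unweighted, connected network; betweenness and PageRank are left out for brevity. The example network and its node names are hypothetical.

```python
from collections import deque


def degree(graph):
    """Number of neighbours of each node in an undirected graph."""
    return {node: len(neigh) for node, neigh in graph.items()}


def closeness(graph):
    """(n - 1) / sum of shortest-path distances to all other nodes.

    Assumes a connected, unweighted, undirected graph.
    """
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        result[source] = (len(graph) - 1) / total if total else 0.0
    return result


# Hypothetical communication network: "hub" talks to everyone,
# "d" is only reachable through "c".
network = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}
print(degree(network))
print(closeness(network))
```

In this toy network "hub" has both the highest degree (3) and the highest closeness (0.8), which is the kind of signal used to spot a central contributor.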

Page 37

Social Network Analysis

Social network centrality measures

Page 38

Social Network Analysis

Can be used to
– detect social debt
– identify social bus factor
– predict software failures
– … and many more …

Page 39

Social Network Analysis

Social bus factor in Gentoo Linux
– All bug handling became concentrated in one contributor
– Measured by significant increase of centralization and performance.

Zanetti et al. “The rise and fall of a central contributor: Dynamics of social organization and performance in the Gentoo community.” CHASE 2013

Page 40

Social Network Analysis

Social bus factor in Gentoo Linux
– Contributor suddenly left the project, being dissatisfied
– Sentiment analysis showed correlation with negative emotions
– Lasting negative impact on the bug handling performance of the community.

Zanetti et al. “The rise and fall of a central contributor: Dynamics of social organization and performance in the Gentoo community.” CHASE 2013

Page 41

Social Network Analysis

Use of SNA to better predict software failures
– By combining program dependency information with social network information

Bird et al. “Putting it All Together: Using Socio-Technical Networks to Predict Failures.” ISSRE 2009

Pinzger et al. “Can developer-module networks predict failures?” FSE 2008

Page 42

Mirroring hypothesis

Conway’s law
Software structure tends to mirror the organisational/social structure

A.k.a. socio-technical congruence
Alignment between technical dependencies and social coordination in a project
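
One common way to quantify this alignment is as the fraction of required coordination that actually happened. The sketch below assumes that two developers need to coordinate when they work on the same module or on modules linked by a dependency; this rule, the function names and the project data are all assumptions for illustration, not a definition taken from the talk.

```python
from itertools import combinations


def coordination_requirements(assignments, module_deps):
    """Pairs of developers who should coordinate: they work on the same
    module or on two modules linked by a technical dependency."""
    linked = set()
    for mod, deps in module_deps.items():
        for dep in deps:
            linked.add(frozenset((mod, dep)))
    required = set()
    for dev_a, dev_b in combinations(sorted(assignments), 2):
        for mod_a in assignments[dev_a]:
            for mod_b in assignments[dev_b]:
                if mod_a == mod_b or frozenset((mod_a, mod_b)) in linked:
                    required.add(frozenset((dev_a, dev_b)))
    return required


def congruence(required, actual):
    """Fraction of required coordination that actually took place."""
    if not required:
        return 1.0
    return len(required & actual) / len(required)


# Hypothetical project data.
assignments = {"ann": {"ui"}, "ben": {"core"}, "cy": {"db"}}
module_deps = {"ui": {"core"}, "core": {"db"}, "db": set()}
actual_talks = {frozenset(("ann", "ben"))}

required = coordination_requirements(assignments, module_deps)
print(congruence(required, actual_talks))  # 0.5: ben and cy never talked
```

A value of 1.0 would mean the communication structure fully mirrors the technical dependencies.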

Page 43

Mirroring hypothesis

Conway’s law

• Evidence in favor: commercial “in-house” development

• Evidence against: “community-based” development

More modular software => emergent “complex network” structure?

MacCormack et al. “Exploring the duality between product and organizational architectures: A test of the mirroring hypothesis.” Research Policy, 2012.

Colfer et al. “The mirroring hypothesis: Theory, evidence and exceptions.” Harvard Business School, 2010.

Page 44

Interdisciplinary research: Complex Systems

Page 45

Interdisciplinary research: Complexity Theory

Page 46

Interdisciplinary research: Complex Systems

“A new approach to science that investigates how relationships between parts give rise to the collective behaviors of a system and how the system interacts and forms relationships with its environment.”

Emergence: process whereby larger entities, patterns, and regularities arise through interactions among smaller or simpler entities that themselves do not exhibit such properties.

Page 47

Complexity Theory: Network Theory

Citation from Mitchell’s book:

“network thinking is providing novel ways to think about difficult problems such as how to do efficient search on the Web, […] how to manage large organisations, how to preserve ecosystems, […] and, more generally, what kind of resilience and vulnerabilities are intrinsic to natural, social, and technological networks, and how to exploit and protect such systems.”

Page 48

Complexity Theory: Network Theory

Some characteristics of complex networks:

Small-world property
• Low average path length between any two nodes
• Highly-clustered components linked through hubs

Skewed distributions (power law behaviour)
• Few nodes with very high in-degree (resp. out-degree), many nodes with very small in-degree (resp. out-degree)
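
The skew described above can be checked on any dependency graph by tabulating its in-degree distribution. The sketch below does this for a tiny, made-up registry in which one package is required by almost all others; the package names are hypothetical and the data is far too small to show a real power law.

```python
from collections import Counter


def in_degree_distribution(dependencies):
    """Map each in-degree value to the number of packages having it.

    `dependencies` maps a package to the packages it depends on.
    """
    # Start every known package at zero so unused packages are counted.
    in_degree = Counter({pkg: 0 for pkg in dependencies})
    for deps in dependencies.values():
        for dep in deps:
            in_degree[dep] += 1
    return Counter(in_degree.values())


# Hypothetical registry: "core" is required by every other package.
registry = {
    "core": [],
    "a": ["core"],
    "b": ["core"],
    "c": ["core"],
    "d": ["core", "a"],
    "e": ["core"],
}
distribution = in_degree_distribution(registry)
for in_deg in sorted(distribution):
    print(in_deg, distribution[in_deg])
```

Here four packages have in-degree 0, one has in-degree 1 and a single hub has in-degree 5, a miniature version of the "few hubs, many leaves" pattern.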

Page 49

Complexity Theory: Network Theory

Some characteristics of complex networks:

Scale-freeness
• Observed degree distribution is very similar regardless of the scale of the observation

Scale-free networks are resilient
• Robust to deletion of random (non-hub) nodes
• Vulnerable to the deletion of hubs
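
The contrast between losing a leaf and losing a hub can be seen on a toy network. The sketch below measures the size of the largest connected component after deleting a given set of nodes; the hub-and-spoke graph is hypothetical and only meant to illustrate the effect, not to model a real scale-free network.

```python
def largest_component(graph, removed=()):
    """Size of the largest connected component after deleting `removed`."""
    removed = set(removed)
    seen = set()
    best = 0
    for start in graph:
        if start in removed or start in seen:
            continue
        # Depth-first traversal of the component containing `start`.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in graph[node]:
                if nxt not in removed and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


# Hypothetical hub-and-spoke network: two hubs, each with three leaves.
edges = [("h1", "h2"), ("h1", "a"), ("h1", "b"), ("h1", "c"),
         ("h2", "d"), ("h2", "e"), ("h2", "f")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

print(largest_component(graph))                  # 8: intact network
print(largest_component(graph, removed=["a"]))   # 7: a leaf disappears
print(largest_component(graph, removed=["h1"]))  # 4: a hub disappears
```

Deleting a leaf barely changes the network, while deleting one hub cuts the largest component in half and isolates its leaves.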

Page 50

Complexity Theory: Network Theory

Examples of complex networks exhibiting these characteristics
– World-Wide Web
– (Technical) software dependency graphs
– Social networks (e.g. Facebook)
– (Socio-technical) software ecosystems

Page 51

Complexity Theory: Network Theory

Examples of software system dependency networks

Page 52

Network Theory: Possible applications for SECOs
• Provide prediction/forecasting models
  – of how SECOs emerge
  – of how SECOs grow/evolve
• Estimate the resilience and sustainability of SECOs after major disturbances
• Assess the risk of deleting hub nodes (bus factor!)

Page 53

Network Theory: Possible applications for SECOs

How do SECOs emerge and grow?

A popular model is preferential attachment: over time, nodes with higher degree receive more links than nodes with lower degree.

Extensions of this model have been proposed to simulate the growth of complex software systems by mimicking the principle of coupling & cohesion.

Barabasi et al. Emergence of Scaling in Random Networks. Science 286, 1999

Li et al. Multi-Level Formation of Complex Software Systems. Entropy 18(178), 2016
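
Preferential attachment is easy to simulate. The sketch below is a simplified variant in which every new node creates exactly one link to an existing node, chosen with probability proportional to that node's current degree. It is not the extended model of Li et al.; the network size and the seed are arbitrary assumptions.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; each newcomer links to one
    existing node chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in `endpoints` once per incident edge, so a
    # uniform draw from this list is a degree-proportional draw.
    endpoints = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges


edges = preferential_attachment(2000)
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

print("max degree:", max(degree.values()))
print("nodes with degree 1:", sum(1 for d in degree.values() if d == 1))
```

Running it should show a few nodes with a high degree next to a large majority of nodes with degree 1, the skewed pattern expected from this growth rule.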

Page 54

Network Theory: Possible applications for SECOs

Page 55

Interdisciplinary research: Ecology and natural ecosystems

Page 56

Ecology and natural ecosystems

Biodiversity of species
E.g. hosts – parasites / plants – pollinators

Mutual dependency and functional redundancy

Disappearing species may be compensated by others if there is sufficient diversity in both layers.

Page 57

Ecology and natural ecosystems

Diversity metrics
• species richness = number of different species in the ecosystem
• species evenness (entropy) = relative abundance of the population of each species in the ecosystem
• Shannon diversity index (relative entropy) = specialisation of a given species in relation to the species in the other level
• Simpson index = degree of concentration when individuals are classified into species
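
As a rough illustration, the sketch below computes the usual single-level textbook versions of these metrics from a list of abundances: richness, Shannon entropy, Pielou evenness and the Simpson concentration index. These standard formulas are an assumption on my part and may differ from the two-level (bipartite) variants the slide alludes to; the commit counts are made up.

```python
import math


def diversity(abundances):
    """Return richness, Shannon entropy, evenness and Simpson index
    for a list of per-species (or per-category) counts."""
    counts = [c for c in abundances if c > 0]
    total = sum(counts)
    shares = [c / total for c in counts]
    richness = len(counts)
    shannon = -sum(p * math.log(p) for p in shares)
    # Pielou evenness: entropy relative to its maximum, ln(richness).
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    # Simpson index: probability that two random individuals match.
    simpson = sum(p * p for p in shares)
    return richness, shannon, evenness, simpson


# Hypothetical commit counts per contributor in two projects.
balanced = [10, 10, 10, 10]
concentrated = [37, 1, 1, 1]
print(diversity(balanced))
print(diversity(concentrated))
```

Both projects have the same richness (4 contributors), but the concentrated one has a much lower evenness and a much higher Simpson index, which is how such metrics expose a dominant contributor.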


Page 58

Software Ecosystems

Diversity in software ecosystems

Mutual dependency and functional redundancy

Disappearance of projects or contributors may be compensated by others.

Page 59

Software Ecosystems: Diversity

Are software project teams diverse?
– In terms of code ownership, types of activity, gender balance, seniority, …

How does this diversity affect …
– defect-proneness?
– productivity?
– …

Page 60

Software Ecosystems: Diversity

Success story of diversity measures: Assess defect-proneness in software projects
• More focused developers introduce fewer defects.
• Modules receiving narrowly focused activity are more likely to contain defects.

Posnett et al. Dual Ecological Measures of Focus in Software Development. ICSE 2013

Page 61

Software Ecosystems: Gender Diversity

Effect of gender diversity on productivity?
Women underrepresented in programming
– industry: 16-18% female developers
– open source: ~10%
– social coding platforms:
  • GitHub: ~9%
  • StackOverflow: ~7%

Vasilescu et al. Gender and tenure diversity in GitHub teams. CHI 2015
A Data Set for Social Diversity Studies of GitHub Teams (MSR’15)

Page 62

Software Ecosystems: Gender Diversity

Success story of diversity measures:
– Gender and tenure diversity are positive and significant predictors of productivity
– Teams that are more balanced in terms of gender and seniority have higher productivity rates

Vasilescu et al. Gender and tenure diversity in GitHub teams. CHI 2015

Page 63

Interdisciplinary research: Survival Analysis

Statistical technique used in many disciplines to analyze the time until the occurrence of an event of interest
• Medicine
  – Effect of treatment or medicine to cure disease
  – Effect of disease on patient mortality
• Sociology
  – Factors influencing marriage or divorce
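
A standard building block of survival analysis is the Kaplan-Meier estimator, which handles subjects that are still "alive" when observation stops (right-censoring). The sketch below applies it to software projects, treating abandonment as the event of interest. The project durations are invented for illustration and do not come from any study cited in this talk.

```python
def kaplan_meier(observations):
    """Kaplan-Meier survival curve.

    `observations` is a list of (duration, event_observed) pairs, where
    event_observed is False for right-censored subjects (still alive
    when observation stopped). Returns [(time, survival_probability)].
    """
    event_times = sorted({t for t, observed in observations if observed})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d, _ in observations if d >= t)
        events = sum(1 for d, observed in observations if d == t and observed)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical projects: (years of activity, abandoned?).
projects = [(1, True), (2, True), (2, False), (3, True), (5, False)]
for time, prob in kaplan_meier(projects):
    print(time, round(prob, 3))
```

For this data the estimated probability that a project is still active drops to 0.8 after one year, 0.6 after two and 0.3 after three.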

Page 64

Interdisciplinary research: Survival Analysis

Page 65

Interdisciplinary research: Survival Analysis

Success story: OSS project survival

Factors positively influencing survival:
– #contributors
– Project age

Basis for prediction models

Samoladas et al. Survival analysis on the duration of open source projects. IST 2010

Page 66

SECO Research Challenges continued…

Understanding SECOs
• How do different SECOs compare?
• How to face technical challenges?
  – Big data
  – Privacy versus reproducibility

Serebrenik et al. “Challenges in Software Ecosystems Research.” IWSECO-WEA 2015

Page 67

Research Challenge: Comparing SECOs

• Each software ecosystem
  – has specific habits, expectations, change policies
  – uses specific tools
• Taking into account these differences is important
  – to support SECO maintenance and evolution
  – to generalise research findings across SECOs

Bogart et al. “How to break an API: Cost negotiation and community values in three software ecosystems.” FSE 2016

Decan et al. “On the topology of package dependency networks – A comparison of three programming language ecosystems.” WEA 2016

Page 68

Research Challenge: Big Data

4V: Volume, Velocity, Variety, Veracity

Page 69

Research Challenge

Privacy Reproducibility

Page 70

Research Challenge: Privacy vs reproducibility

How to preserve privacy of individuals?
– EU 2016/679 regulation on the protection of natural persons with regard to the processing of personal data and on the free movement of such data
  “The principles of data protection should apply to any information concerning an identified or identifiable natural person.”
– Appropriate anonymisation and privacy-preserving data mining techniques needed

Fung et al. Privacy-preserving data publishing: A survey of recent developments. ACM Computing Surveys 2010

Malik et al. Privacy preserving data mining techniques: Current scenario and future prospects. IC3T 2012

Page 71

Research Challenge: Privacy vs reproducibility

• Increase/ensure reproducible research results
  – Awareness is increasing
  – Solutions are being put into place
  – Big data problems remain an issue
• How to reconcile privacy with reproducibility?

Gonzalez-Barahona et al. On the reproducibility of empirical software engineering studies based on data retrieved from development repositories. Emp. Softw. Eng. 2012

Page 72

Wrap-up

Research on SECO evolution requires
– A socio-technical view
– Mixed method research
– Interdisciplinary research

Many technical challenges need to be faced

Are you willing to take up the challenge?

Page 73