Agent-Based Modeling and Simulation of Collaborative Social Networks Research in Progress Greg Madey...

Agent-Based Modeling and Simulation of Collaborative Social Networks. Research in Progress. Greg Madey, Yongqin Gao (Computer Science & Engineering, University of Notre Dame); Vincent Freeh (Computer Science, North Carolina State University); Renee Tynan, Chris Hoffman (Department of Management, University of Notre Dame). Supported in part by the National Science Foundation - Digital Society & Technology Program. AMCIS2003, Tampa, FL, August 2003.

Page 1: Agent-Based Modeling and Simulation of Collaborative Social Networks

Research in Progress

Greg Madey, Yongqin Gao
Computer Science & Engineering, University of Notre Dame

Vincent Freeh
Computer Science, North Carolina State University

Renee Tynan, Chris Hoffman
Department of Management, University of Notre Dame

Supported in part by the National Science Foundation - Digital Society & Technology Program

AMCIS2003
Tampa, FL
August 2003

Page 2: Outline

• Definitions: agents, models, simulations, collaborative social networks, computer experiments
• Phenomenon: Free/Open Source Software (F/OSS)
• Conceptual models
  – ER model
  – BA model
  – BA model with constant fitness
  – BA model with dynamic fitness
• Experiments and results
• Summary
• Some discussion questions

Page 3: Agent-Based Modeling and Simulation

• Conceptual models of a phenomenon
• Simulations are computer implementations of the conceptual models
• Agents in models and simulations are distinct entities (instantiated objects)
  – Tend to be simple, but with large numbers of them (thousands, or more) - i.e., swarm intelligence
  – Contrasted with higher level “intelligent agents”
• Foundations in complexity theory
  – Self-organization
  – Emergence

Page 4: Collaborative Social Networks

• Research-paper co-authorship, small world phenomenon, e.g., Erdos number (Barabasi 2001, Newman 2001)
• Movie actors, small world phenomenon, e.g., Kevin Bacon number (Watts 1999, 2003)
• Interlocking corporate directorships
• Open-source software developers (Madey et al, AMCIS 2002)
• Collaborators are nodes in a graph, and collaborative relationships are the edges of the graph

Page 5: Classical Scientific Method

1. Observe the world
   a) Identify a puzzling phenomenon
2. Generate a falsifiable hypothesis (K. Popper)
3. Design and conduct an experiment with the goal of disproving the hypothesis
   a) If the experiment “fails”, then the hypothesis is accepted (until replaced)
   b) If the experiment “succeeds”, then reject the hypothesis, but additional insight into the phenomenon may be obtained and steps 2-3 repeated

Page 6: The Computer Experiment

Page 7: Agent-Based Simulation as a Component of the Scientific Method

[Diagram labels: Modeling (Hypothesis); Agent-Based Simulation (Experiment); Observation]

Page 8: Agent-Based Simulation as a Component of the Scientific Method

[Diagram labels: Modeling (Hypothesis); Agent-Based Simulation (Experiment); Observation. Annotations: Social Network Model of F/OSS; Grow Artificial SourceForge; Analysis of SourceForge Data]

Page 9: Open Source Software (OSS)

• Free …
  – to view source
  – to modify
  – to share
  – of cost
• Examples
  – Apache
  – Perl
  – GNU
  – Linux
  – Sendmail
  – Python
  – KDE
  – GNOME
  – Mozilla
  – Thousands more

[Image labels: Linux, GNU, Savannah]

Page 10: Free Open Source Software (F/OSS)

• Development
  – Mostly volunteer
  – Global teams
  – Virtual teams
  – Self-organized - often peer-based meritocracy
  – Self-managed - but often a “charismatic” leader
  – Often large numbers of developers, testers, support help, end user participation
  – Rapid, frequent releases
  – Mostly unpaid

Page 11: F/OSS Developers

Linus Torvalds - Linux
Larry Wall - Perl
Richard Stallman - GNU, GNU Manifesto
Eric Raymond - Cathedral and Bazaar

Page 12: F/OSS: A Puzzling Phenomenon

• Contradicts traditional wisdom:
  – Software engineering
  – Coordination, large numbers
  – Motivation of developers
  – Quality
  – Security
  – Business strategy
• Almost everything is done electronically and available in digital form
• Opportunity for IS Research -- large amounts of online data available
• Research issues:
  – Understanding motives
  – Understanding processes
  – Intellectual property
  – Digital divide
  – Self-organization
  – Government policy
  – Impact on innovation
  – Ethics
  – Economic models
  – Cultural issues
  – International factors

Page 13: SourceForge

• VA Software
• Part of OSDN
• Started 12/1999
• Collaboration tools
• 58,685 Projects
• 80,000 Developers
• 590,00 Registered Users

Page 14: Savannah

• Uses SourceForge Software
• Free Software Foundation
• 1,508 Projects
• 15,265 Registered Users

Page 15: F/OSS: Importance

• Major component of e-Technology infrastructure, with major presence in
  – e-Commerce
  – e-Science
  – e-Government
  – e-Learning
• Apache has over 65% market share of Internet Web servers
• Linux on over 7 million computers
• Most Internet e-mail runs on Sendmail
• Tens of thousands of quality products
• Part of product offerings of companies like IBM, Apple
  – Apache in WebSphere, Linux on mainframe, FreeBSD in OSX
• Corporate employees participating on OSS projects

Page 16: Free/Open Source Software

• Seems to challenge traditional economic assumptions
• Model for software engineering
• New business strategies
  – Cooperation with competitors
  – Beyond trade associations, shared industry research, and standards processes: shared product development!
• Virtual, self-organizing and self-managing teams
• Social issues, e.g., digital divide, international participation
• Government policy issues, e.g., US software industry, impact on innovation, security, intellectual property

Page 17: Observations

• Web mining
• Web crawler (scripts)
  – Python
  – Perl
  – AWK
  – Sed
• Monthly
• Since Jan 2001
• ProjectID
• DeveloperID
• Almost 2 million records
• Relational database

PROJ|DEVELOPER
8001|dev378
8001|dev8975
8001|dev9972
8002|dev27650
8005|dev31351
8006|dev12509
8007|dev19395
8007|dev4622
8007|dev35611
8008|dev8975
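The PROJ|DEVELOPER records above are enough to build the developer collaboration network. The following is a minimal sketch, assuming each row is one project membership and that two developers are linked when they share a project; the function names `parse_records` and `developer_edges` are illustrative and not from the slides.

```python
from collections import defaultdict
from itertools import combinations

# Sample rows as read from the slide; the first row is the column header.
RECORDS = """PROJ|DEVELOPER
8001|dev378
8001|dev8975
8001|dev9972
8002|dev27650
8005|dev31351
8006|dev12509
8007|dev19395
8007|dev4622
8007|dev35611
8008|dev8975"""


def parse_records(text):
    """Return {project id: set of developer ids} from 'PROJ|DEVELOPER' rows."""
    members = defaultdict(set)
    for line in text.strip().splitlines()[1:]:  # skip the header row
        proj, dev = line.split("|")
        members[proj].add(dev)
    return members


def developer_edges(members):
    """Link every pair of developers who share at least one project."""
    edges = set()
    for devs in members.values():
        for a, b in combinations(sorted(devs), 2):
            edges.add((a, b))
    return edges


members = parse_records(RECORDS)
edges = developer_edges(members)
print(len(members), "projects,", len(edges), "developer-developer links")
```

Storing each link as a sorted pair keeps the graph undirected and avoids counting the same collaboration twice.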

Page 18: Models of the F/OSS Social Network (Alternative Hypotheses)

• General model features
  – Agents are nodes on a graph (developers or projects)
  – Behaviors: Create, join, abandon and idle
  – Edges are relationships (joint project participation)
  – Growth of network: random or types of preferential attachment, formation of clusters
  – Fitness
  – Network attributes: diameter, average degree, degree distribution, clustering coefficient
• Four specific models
  – ER (random graph) - (1960)
  – BA (preferential attachment) - (1999)
  – BA ( + constant fitness) - (2001)
  – BA ( + dynamic fitness) - (2003)
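The network attributes listed above can be computed directly from an adjacency map. This is a minimal sketch, assuming an undirected graph stored as a dict of neighbor sets; the function names and the four-node toy graph are illustrative, and the clustering coefficient shown is the mean of the local coefficients, which may differ from the definition used in the study.

```python
from collections import deque


def average_degree(adj):
    """Mean number of neighbors per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj):
    """Mean local clustering: share of a node's neighbor pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        ordered = list(nbrs)
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if ordered[j] in adj[ordered[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def diameter(adj):
    """Longest shortest path over reachable pairs, found by BFS from each node."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


# Toy graph: a triangle a-b-c with a tail node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(average_degree(toy), clustering_coefficient(toy), diameter(toy))
```

Running BFS from every node is quadratic in the worst case, so a network the size of SourceForge would likely need sampling or a graph library instead.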

Page 19: Agent-Based Modeling and Simulation of Collaborative Social Networks Research in Progress Greg Madey Yongqin Gao Computer Science & Engineering University.

15850 dev[46]dev[83] 15850 dev[46]

dev[48]

15850 dev[46]dev[56]

15850 dev[46]dev[58]

6882 dev[58]dev[47]

6882 dev[47]dev[79]

6882 dev[47]dev[52]

6882 dev[47]dev[55]

7028 dev[46]dev[99]

7028 dev[46]dev[51]

7028 dev[46]dev[57]

7597 dev[46]dev[45]

7597 dev[46]dev[72]

7597 dev[46]dev[55]

7597 dev[46]dev[58]

7597 dev[46]dev[61]

7597 dev[46]dev[64]7597 dev[46]

dev[67]

7597 dev[46]dev[70]

9859 dev[46]dev[49]9859 dev[46]

dev[53]

9859 dev[46]dev[54]

9859 dev[46]dev[59]

dev[46]

dev[83] dev[56]

dev[48]

dev[52]

dev[79]

dev[72]

dev[51]

dev[57]

dev[55]

dev[99]

dev[47]

dev[58]

dev[53]

dev[58]

dev[65]

dev[45]

dev[70]

dev[67]

dev[59]

dev[54]

dev[49]

dev[64]

dev[61]

Project 6882

Project 9859

Project 7597

Project 7028

Project 15850

F/OSS Developers - Collaboration Social NetworkDevelopers are nodes / Projects are links

24 Developers5 Projects

2 Linchpin Developers1 Cluster
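Linchpin developers and clusters can be found from project membership alone. The sketch below assumes a linchpin is a developer who belongs to more than one project, and a cluster is a set of developers connected through shared projects; both definitions are assumptions, since the slide does not spell them out. The toy membership data is hypothetical.

```python
from collections import defaultdict


def linchpins(members):
    """Developers who belong to more than one project (assumed definition)."""
    count = defaultdict(int)
    for devs in members.values():
        for dev in devs:
            count[dev] += 1
    return {dev for dev, n in count.items() if n > 1}


def clusters(members):
    """Group developers into connected components using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for devs in members.values():
        ordered = sorted(devs)
        for dev in ordered:
            find(dev)  # register every developer, including solo ones
        for dev in ordered[1:]:
            parent[find(dev)] = find(ordered[0])

    groups = defaultdict(set)
    for dev in list(parent):
        groups[find(dev)].add(dev)
    return list(groups.values())


# Hypothetical data: developer "b" ties projects p1 and p2 together.
toy_members = {"p1": {"a", "b"}, "p2": {"b", "c"}, "p3": {"d"}}
print(linchpins(toy_members), clusters(toy_members))
```

Removing a linchpin and recomputing the clusters is one way to see how much of the network's connectivity depends on that developer.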

Page 20: Computer Experiments

• Agent-based simulations
• Java programs using Swarm class library
  – Validation (docking) exercises using Java/Repast
• Grow artificial SourceForges (Epstein & Axtell, 1996)
  – Parameterized with observed data, e.g., developer behaviors
    • Join rates
    • New project additions
    • Leave projects
  – Evaluation of four models (hypotheses)
  – Verification/validation
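The idea of growing an artificial SourceForge can be sketched in a few lines. The original simulations were Java programs using Swarm; the Python below is only an illustration of the four behaviors (create, join, abandon, idle). The rates are placeholders rather than the observed values, the arrival of one new developer per step is an assumption, and the uniformly random join corresponds to ER-style growth only.

```python
import random

ACTIONS = ["create", "join", "abandon", "idle"]


def simulate(steps, rates=(0.1, 0.4, 0.1, 0.4), seed=0):
    """Grow a toy project network; returns {project id: set of developer ids}."""
    rng = random.Random(seed)
    projects = {}
    developers = []
    for t in range(steps):
        developers.append(t)  # assumed: one new developer agent arrives per step
        for dev in developers:
            action = rng.choices(ACTIONS, weights=rates)[0]
            if action == "create":
                projects[len(projects)] = {dev}
            elif action == "join" and projects:
                # random choice of project, i.e. no preferential attachment
                projects[rng.choice(list(projects))].add(dev)
            elif action == "abandon":
                mine = [p for p, m in projects.items() if dev in m]
                if mine:
                    projects[rng.choice(mine)].discard(dev)
            # "idle": the agent does nothing this step
    return projects


grown = simulate(30, seed=1)
print(len(grown), "projects after 30 steps")
```

Swapping the random join for a size-weighted or fitness-weighted choice is the step that turns this ER-style sketch into the BA variants evaluated later in the deck.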

Page 21: Four Cycles of Modeling & Simulation

[Diagram labels: Modeling (Hypothesis); Agent-Based Simulation (Experiment); Observation. Annotations: Social Network Models (ER => BA => BA+Fitness => BA+Dynamic Fitness); Grow Artificial SourceForge; Analysis of SourceForge Data (Degree Distribution, Average Degree, Diameter, Clustering Coefficient, Cluster Size Distribution)]

Page 22: ER model – degree distribution

• Degree distribution is binomial in the ER model, while it is a power law in the empirical data

• Fit fails
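The mismatch can be stated with the standard forms of the two distributions. The symbols here are assumptions introduced for illustration: n is the number of nodes, p the edge probability of the random graph, gamma the power-law exponent, and c a constant.

```latex
% ER random graph G(n, p): the degree of a node is binomially distributed
\[
  P_{\mathrm{ER}}(k) = \binom{n-1}{k}\, p^{k} (1-p)^{\,n-1-k}
\]
% Power law: a straight line of slope -gamma on a log-log plot
\[
  P(k) \propto k^{-\gamma}
  \quad\Longrightarrow\quad
  \log P(k) = -\gamma \log k + c
\]
```

A binomial distribution is concentrated around its mean, so it cannot reproduce the heavy tail of highly connected nodes that a power law implies.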

Page 23: ER model - diameter

• Average degree is decreasing while it is increasing in empirical data
• Diameter is increasing while it is decreasing in empirical data

• Fit fails

Page 24: ER model – clustering coefficient

• Clustering coefficient is relatively low, around 0.4, while it is around 0.7 in empirical data.
• Clustering coefficient is decreasing while it is increasing in empirical data

• Fit fails

Page 25: ER model – cluster distribution

• Cluster distribution in the ER model also follows a power law, with R² of 0.6667 (0.9953 without the major cluster), while R² in empirical data is 0.7457 (0.9797 without the major cluster)
• The actual distribution is different from empirical data
• The later models (BA and further models) have similar behaviors

• Fit fails

Page 26: BA model – degree distribution

• Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data).
• For developer distribution: simulated data has R² of 0.9798 and empirical data has R² of 0.9712.
  – Fit succeeds
• For project distribution: simulated data has R² of 0.6650 and empirical data has R² of 0.9815.
  – Fit fails
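An R² value for a power-law fit can be obtained by ordinary least squares on the log-log degree distribution. The slides do not say how R² was computed, so the sketch below is one common approach under that assumption; `power_law_r2` is an illustrative name and the sample counts are synthetic.

```python
import math


def power_law_r2(degree_counts):
    """Fit log10(count) = slope * log10(degree) + c; return (slope, R^2).

    degree_counts maps each degree k > 0 to a positive node count.
    """
    xs = [math.log10(k) for k in degree_counts]
    ys = [math.log10(c) for c in degree_counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy)  # squared correlation for a one-variable fit
    return slope, r2


# Synthetic counts that follow count = 1000 * k^-2 exactly.
exact = {1: 1000.0, 2: 250.0, 4: 62.5, 5: 40.0, 10: 10.0}
slope, r2 = power_law_r2(exact)
print(slope, r2)
```

A high R² on a log-log plot is a weak test on its own, which is one reason the deck compares the simulated and empirical values side by side instead of reading either in isolation.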

Page 27: BA model – diameter and CC

• Small diameter and high clustering coefficient like empirical data
• Diameter and clustering coefficient are both decreasing like empirical data

• Fit succeeds

Page 28: BA model with constant fitness

• Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data).
• For developer distribution: simulated data has R² of 0.9742 and empirical data has R² of 0.9712.
  – Fit succeeds
• For project distribution: simulated data has R² of 0.7253 and empirical data has R² of 0.9815.
  – Fit fails
• Diameter and CC are similar to simple BA model.
  – Fit succeeds
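For reference, the attachment rules being compared can be written as follows, assuming the constant-fitness variant is the standard fitness extension of BA in which each node carries a fixed multiplier. The symbols are introduced here for illustration: k_i is the degree of node i, eta_i its fitness, and Pi_i the probability that a new link attaches to it.

```latex
% Plain BA: attachment probability proportional to degree
\[
  \Pi_i = \frac{k_i}{\sum_j k_j}
\]
% BA with constant fitness: degree weighted by a fixed per-node fitness
\[
  \Pi_i = \frac{\eta_i\, k_i}{\sum_j \eta_j\, k_j}
\]
```

Setting every eta_i to the same value recovers plain BA, which is consistent with the two models giving similar diameter and clustering results.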

Page 29: Discovery: BA with dynamic fitness

• Problem with BA with constant fitness
  – Intuition: Project fitness might change with time.
• Data mining observation: project “life cycle” property - fitness generally decreases with time
• New model not in the literature
  – Hypothesis: BA with dynamic fitness of projects
  – Computer experiment
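A project-selection rule with time-varying fitness might look like the sketch below. The slides say only that fitness generally decreases with time, so the exponential decay, the parameters `eta0` and `tau`, and the function names are all assumptions made for illustration.

```python
import math
import random


def fitness(age, eta0=1.0, tau=12.0):
    """Assumed decay law: fitness falls exponentially with project age."""
    return eta0 * math.exp(-age / tau)


def choose_project(sizes, births, now, rng):
    """Pick a project with probability proportional to size * current fitness.

    sizes:  {project id: current number of members}
    births: {project id: step at which the project was created}
    """
    ids = list(sizes)
    weights = [sizes[p] * fitness(now - births[p]) for p in ids]
    return rng.choices(ids, weights=weights)[0]


# Two equally sized projects; "old" was created 24 steps before "new".
rng = random.Random(0)
sizes = {"old": 5, "new": 5}
births = {"old": 0, "new": 24}
picks = [choose_project(sizes, births, 24, rng) for _ in range(2000)]
print(picks.count("new"), "joins to the new project out of", len(picks))
```

With equal sizes, the younger project attracts most new members, which is the life-cycle effect the dynamic-fitness hypothesis is meant to capture.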

Page 30: BA model with dynamic fitness

• Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data).
• For developer distribution: simulated data has R² of 0.9695 and empirical data has R² of 0.9712.
  – Fit succeeds (as before)
• For project distribution: simulated data has R² of 0.8051 and empirical data has R² of 0.9815.
  – Fit is better, but more work needed

Page 31: Summary

• Why Agent-Based Modeling and Simulation?
  – Can be used as components of the Scientific Method
  – A research approach for studying socio-technical systems
• Case study: F/OSS - Collaboration Social Networks
  – SourceForge conceptual models: ER, BA, BA with constant fitness and BA with dynamic fitness.
  – Simulations
    • Computer experiments that tested conceptual models
    • Provided insight into the phenomenon under study and guided data mining of collected observations

Page 32: Discussion

• “The social sciences are, in fact, the ‘hard’ sciences”, Herbert Simon (1987)

• Computational social science: agent-based modeling and simulation

• Kuhn’s periods of “Normal Science” punctuated by “Paradigm shifts”

• Karl Popper’s “theory-testing through falsification”

• Relevant literature on the role of simulation in the process of scientific discovery

Page 33: Thank you