Statistical Models for Social Networks with Application to HIV...
Statistical Models for Social Networks
with Application to HIV Epidemiology
Mark S. Handcock
Department of Statistics
University of Washington
Joint work with
Pavel Krivitsky
Martina Morris
and the
U. Washington Network Modeling Group
Supported by NIH NIDA Grant DA012831 and NICHD Grant HD041877
NIPS 2007, December 4, 2007
Network modeling from a statistical perspective
Networks are widely used to represent data on relations between interacting actors or nodes.
The study of social networks is multi-disciplinary
– plethora of terminologies
– varied objectives, multitude of frameworks
Understanding the structure of social relations has been the focus of the social sciences
– social structure: a system of social relations tying distinct social entities to one another
– interest in understanding how social structures form and evolve
Attempt to represent the structure in social relations via networks
– the data are conceptualized as a realization of a network model
The data are of at least three forms:
– individual-level information on the social entities
– relational data on pairs of entities
– population-level data
Deep literatures available
Social networks community (Heider 1946; Frank 1972; Holland and Leinhardt 1981)
Statistical Networks Community (Frank and Strauss 1986; Snijders 1997)
Spatial Statistics Community (Besag 1974)
Statistical Exponential Family Theory (Barndorff-Nielsen 1978)
Graphical Modeling Community (Lauritzen and Spiegelhalter 1988, ...)
Machine Learning Community (Jordan, Jensen, Xing, ...)
Physics and Applied Math (Newman, Watts, ...)
Example of Social Relationships between Monks
Expressed “liking” between 18 monks within an isolated monastery ⇒ Sampson (1969)
A directed relationship aggregated over a 12 month period before the breakup of the cloister.
Sampson identified three groups plus: (T)urks, (L)oyal Opposition, (O)utcasts and (W)averers
[Figure: network of expressed “liking” among the 18 monks]
Examples of Friendship Relationships
The National Longitudinal Study of Adolescent Health ⇒ www.cpc.unc.edu/projects/addhealth
– “Add Health” is a school-based study of the health-related behaviors of adolescents in grades 7 to 12.
Each student nominated up to 5 boys and 5 girls as their friends
160 schools: Smallest has 69 adolescents in grades 7–12
[Figures: friendship networks from the Add Health study. Nodes are labeled by grade (7 to 12). Legend for race: White (non-Hispanic), Black (non-Hispanic), Hispanic (of any race), Asian / Native Am / Other (non-Hispanic), Race NA. Legend for grade: Grade 7 through Grade 12, Grade NA.]
Features of Many Social Networks
Mutuality of ties
Individual heterogeneity in the propensity to form ties
Homophily by actor attributes ⇒ Lazarsfeld and Merton, 1954; Freeman, 1996; McPherson et al., 2001
– higher propensity to form ties between actors with similar attributes
– e.g., age, gender, geography, major, socio-economic status
– attributes may be observed or unobserved
Transitivity of relationships
– friends of friends have a higher propensity to be friends
Balance of relationships ⇒ Heider (1946)
– people feel comfortable if they agree with others whom they like
Context is important ⇒ Simmel (1908)
– triad, not the dyad, is the fundamental social unit
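One simple way to look for homophily in observed data is to compare how often ties occur within attribute groups versus across them. The sketch below is only a descriptive illustration, not a method from the talk; the function name `tie_density_by_match`, the grade labels, and the toy network are all made up.

```python
def tie_density_by_match(y, attr):
    """Return (density among same-attribute ordered pairs,
               density among different-attribute ordered pairs)."""
    n = len(y)
    ties = {True: 0, False: 0}
    pairs = {True: 0, False: 0}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue                      # no self-ties
            same = attr[i] == attr[j]
            pairs[same] += 1
            ties[same] += y[i][j]
    return ties[True] / pairs[True], ties[False] / pairs[False]

# Hypothetical directed friendship nominations among four students.
grade = ["7", "7", "8", "8"]
y = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]

within, between = tie_density_by_match(y, grade)
```

A within-group density well above the between-group density is consistent with homophily on that attribute, although it does not by itself separate homophily from other mechanisms such as transitivity.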
The Choice of Models depends on the objectives
Primary interest in the nature of relationships:
– How the behavior of individuals depends on their location in the social network
– How the qualities of the individuals influence the social structure
Secondary interest is in how network structure influences processes that develop over a network
– spread of HIV and other STDs
– diffusion of technical innovations
– spread of computer viruses
Tertiary interest in the effect of interventions on network structure and processes that develop over a network
Perspectives to keep in mind
Network-specific versus Population-process
– Network-specific: interest focuses only on the actual network under study
– Population-process: the network is part of a population of networks and the latter is the focus of interest
- the network is conceptualized as a realization of a social process
Statistical Models for Social Networks
Notation
A social network is defined as a set of $n$ social “actors” and a social relationship between each pair of actors.
$$Y_{ij} = \begin{cases} 1 & \text{relationship from actor } i \text{ to actor } j \\ 0 & \text{otherwise} \end{cases}$$
call $Y \equiv [Y_{ij}]_{n \times n}$ a sociomatrix
– an $N = n(n-1)$ binary array
The basic problem of stochastic modeling is to specify a distribution for $Y$, i.e., $P(Y = y)$
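As a minimal sketch of this notation, a directed network on $n$ actors can be stored as an $n \times n$ 0/1 array with an empty diagonal, which leaves $N = n(n-1)$ free entries. The helper name `sociomatrix` and the tie list are made up for illustration.

```python
def sociomatrix(n, ties):
    """Return the n x n sociomatrix Y with Y[i][j] = 1 if there is a tie i -> j."""
    y = [[0] * n for _ in range(n)]
    for i, j in ties:
        if i == j:
            raise ValueError("self-ties are not part of the sociomatrix")
        y[i][j] = 1
    return y

# Hypothetical directed ties among n = 4 actors (0-indexed).
ties = [(0, 1), (1, 0), (1, 2), (3, 2)]
Y = sociomatrix(4, ties)

n = len(Y)
N = n * (n - 1)               # number of ordered pairs of distinct actors
n_ties = sum(map(sum, Y))     # total number of ties, y_{++}
```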
A Framework for Network Modeling
Let $\mathcal{Y}$ be the sample space of $Y$, e.g. $\{0,1\}^N$
Any model-class for the multivariate distribution of $Y$ can be parametrized in the form:
$$P_\eta(Y = y) = \frac{\exp\{\eta \cdot g(y)\}}{\kappa(\eta, \mathcal{Y})}, \qquad y \in \mathcal{Y}$$
Besag (1974), Frank and Strauss (1986)
$\eta \in \Lambda \subset \mathbb{R}^q$: $q$-vector of parameters
$g(y)$: $q$-vector of network statistics ⇒ $g(Y)$ are jointly sufficient for the model
For a “saturated” model-class, $q = 2^N - 1$ when $\mathcal{Y} = \{0,1\}^N$ (one parameter per network, less one for normalization)
$\kappa(\eta, \mathcal{Y})$: distribution normalizing constant
$$\kappa(\eta, \mathcal{Y}) = \sum_{y \in \mathcal{Y}} \exp\{\eta \cdot g(y)\}$$
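For very small networks the normalizing constant can be computed by brute-force enumeration of the sample space, which makes the framework concrete. The sketch below assumes directed networks on $n = 3$ actors and an illustrative choice of statistics (tie count and mutual-pair count). The parameter values and function names are hypothetical, and the approach is infeasible beyond a handful of actors because the sample space has $2^{n(n-1)}$ elements.

```python
import itertools
import math

def all_networks(n):
    """Yield every directed 0/1 sociomatrix on n actors (no self-ties)."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in itertools.product((0, 1), repeat=len(pairs)):
        y = [[0] * n for _ in range(n)]
        for (i, j), b in zip(pairs, bits):
            y[i][j] = b
        yield y

def g(y):
    """Illustrative statistics: (number of ties, number of mutual pairs)."""
    n = len(y)
    edges = sum(map(sum, y))
    mutual = sum(y[i][j] * y[j][i] for i in range(n) for j in range(i + 1, n))
    return (edges, mutual)

def kappa(eta, n):
    """Normalizing constant: sum of exp{eta . g(y)} over the sample space."""
    return sum(math.exp(sum(e * s for e, s in zip(eta, g(y))))
               for y in all_networks(n))

def prob(y, eta):
    """P_eta(Y = y) for a network y on len(y) actors."""
    weight = math.exp(sum(e * s for e, s in zip(eta, g(y))))
    return weight / kappa(eta, len(y))

eta = (-1.0, 0.5)   # hypothetical parameter values
n = 3               # 2**6 = 64 networks, small enough to enumerate
total = sum(prob(y, eta) for y in all_networks(n))
```

Here `total` sums the probabilities over the whole sample space, so it should equal 1 up to floating-point error.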
Simple model-classes for social networks
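Homogeneous Bernoulli graph (Rényi-Erdős model)
$Y_{ij}$ are independent and equally likely, with log-odds $\eta = \operatorname{logit}[P_\eta(Y_{ij} = 1)]$
$$P_\eta(Y = y) = \frac{e^{\eta \sum_{i,j} y_{ij}}}{\kappa(\eta, \mathcal{Y})}, \qquad y \in \mathcal{Y}$$
where $q = 1$, $g(y) = \sum_{i,j} y_{ij}$, $\kappa(\eta, \mathcal{Y}) = [1 + \exp(\eta)]^N$
– homogeneity means it is unlikely to be proposed as a model for real phenomena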
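The closed form for the normalizing constant can be checked numerically: grouping networks by their tie count $m$ gives $\sum_m \binom{N}{m} e^{\eta m} = [1 + e^{\eta}]^N$. The sketch below also simulates one network and recovers $\eta$ as the logit of the observed density, which is the maximum likelihood estimate in this one-parameter model. The network size, log-odds value, and seed are arbitrary choices for illustration.

```python
import math
import random

def kappa_by_counting(eta, N):
    """Sum exp(eta * m) over all networks, grouped by their tie count m."""
    return sum(math.comb(N, m) * math.exp(eta * m) for m in range(N + 1))

def sample_bernoulli_graph(n, eta, rng):
    """Draw each ordered pair independently with P(Y_ij = 1) = expit(eta)."""
    p = 1.0 / (1.0 + math.exp(-eta))
    return [[1 if i != j and rng.random() < p else 0 for j in range(n)]
            for i in range(n)]

def mle_eta(y):
    """MLE of eta: the logit of the observed density (needs 0 < density < 1)."""
    n = len(y)
    density = sum(map(sum, y)) / (n * (n - 1))
    return math.log(density / (1.0 - density))

n, eta = 30, -1.5              # hypothetical size and log-odds
N = n * (n - 1)
closed_form = (1.0 + math.exp(eta)) ** N
rng = random.Random(0)
y = sample_bernoulli_graph(n, eta, rng)
eta_hat = mle_eta(y)
```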
Dyad-independence models with attributes
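$Y_{ij}$ are independent but depend on dyadic covariates $x_{k,ij}$
$$P_\eta(Y = y) = \frac{e^{\sum_{k=1}^{q} \eta_k g_k(y)}}{\kappa(\eta, \mathcal{Y})}, \qquad y \in \mathcal{Y}$$
$$g_k(y) = \sum_{i,j} x_{k,ij}\, y_{ij}, \qquad k = 1, \ldots, q$$
$$\kappa(\eta, \mathcal{Y}) = \prod_{i,j} \Big[1 + \exp\Big(\sum_{k=1}^{q} \eta_k x_{k,ij}\Big)\Big]$$
Of course,
$$\operatorname{logit}[P_\eta(Y_{ij} = 1)] = \sum_{k} \eta_k x_{k,ij}$$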
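The product form of the normalizing constant, and the resulting logistic form for each tie, can be verified on a tiny example. The sketch below assumes three actors, an intercept covariate, and a same-group indicator. The group labels, coefficients, and function names are hypothetical.

```python
import itertools
import math

groups = ["a", "a", "b"]            # hypothetical actor attribute
n = len(groups)
pairs = [(i, j) for i in range(n) for j in range(n) if i != j]

def x(i, j):
    """Dyadic covariates x_{k,ij}: an intercept and a same-group indicator."""
    return (1.0, 1.0 if groups[i] == groups[j] else 0.0)

eta = (-2.0, 1.5)                   # hypothetical coefficients

def linear_predictor(i, j):
    """sum_k eta_k x_{k,ij} for the ordered pair (i, j)."""
    return sum(e * xk for e, xk in zip(eta, x(i, j)))

# Closed form: kappa is a product over ordered pairs.
kappa_product = math.prod(1.0 + math.exp(linear_predictor(i, j))
                          for i, j in pairs)

# Brute force: sum exp{sum_k eta_k g_k(y)} over all 2**6 networks.
kappa_sum = 0.0
for bits in itertools.product((0, 1), repeat=len(pairs)):
    g = [sum(x(i, j)[k] * b for (i, j), b in zip(pairs, bits))
         for k in range(len(eta))]
    kappa_sum += math.exp(sum(e * gk for e, gk in zip(eta, g)))

# Each tie follows a logistic regression on its covariates.
p_tie = {(i, j): 1.0 / (1.0 + math.exp(-linear_predictor(i, j)))
         for i, j in pairs}
```

Because the ties are independent, fitting such a model to data reduces to an ordinary logistic regression of the tie indicators on the dyadic covariates.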
Some history of exponential family models for social networks
Holland and Leinhardt (1981) proposed a general dyad independence model
– Also a homogeneous version they refer to as the “$p_1$” model
$$P_\eta(Y = y) = \frac{\exp\{\rho \sum_{i<j} y_{ij} y_{ji} + \phi\, y_{++} + \sum_i \alpha_i y_{i+} + \sum_j \beta_j y_{+j}\}}{\kappa(\rho, \alpha, \beta, \phi)}$$
where $\eta = (\rho, \alpha, \beta, \phi)$.
– $\phi$ controls the expected number of edges
– $\rho$ represents the expected tendency toward reciprocation
– $\alpha_i$: productivity of node $i$; $\beta_j$: attractiveness of node $j$
Much related work and generalizations
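The sufficient statistics in the exponent of the p1 model can be computed directly from a directed adjacency matrix. This is a minimal sketch, assuming `y` is a 0/1 matrix with `y[i, j] = 1` for a tie from i to j and a zero diagonal; the function names are illustrative.

```python
import numpy as np

def p1_sufficient_statistics(y):
    """Return (mutual dyads, edge count y_++, out-degrees y_i+, in-degrees y_+j)."""
    y = np.asarray(y)
    mutual = int(np.sum(y * y.T) // 2)   # sum_{i<j} y_ij * y_ji
    edges = int(y.sum())                 # y_++
    out_degrees = y.sum(axis=1)          # y_i+
    in_degrees = y.sum(axis=0)           # y_+j
    return mutual, edges, out_degrees, in_degrees

def p1_log_weight(y, rho, phi, alpha, beta):
    """Exponent of the p1 model, i.e. the log of the unnormalized probability."""
    mutual, edges, out_deg, in_deg = p1_sufficient_statistics(y)
    return rho * mutual + phi * edges + alpha @ out_deg + beta @ in_deg

y = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 0, 0]])
print(p1_sufficient_statistics(y))
```

The normalizing constant κ(ρ, α, β, φ) is deliberately left out; under dyad independence it factorizes over dyads, but the sketch only evaluates the exponent.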
![Page 43: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/43.jpg)
Classes of statistics used for modeling
Actor Markov statistics
⇒ Frank and Strauss (1986)
– motivated by notions of “symmetry” and “homogeneity”
– Yij in Y that do not share an actor are conditionally independent given the rest of the network
⇒ analogous to nearest neighbor ideas in spatial statistics
Degree distribution: dk(y) = proportion of actors of degree k in y.
k-star distribution: sk(y) = proportion of k-stars in the graph y. (In particular, s1 = proportion of edges that exist between pairs of actors.)
Triangles: t1(y) = proportion of triads that form a complete sub-graph in y.
Figure: Some configurations for non-directed graphs (triangle = transitive triad on nodes i, j, h; two-star centered at i with j1, j2; three-star centered at i with j1, j2, j3)
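The Markov statistics above can be sketched for a small non-directed graph as follows. This is an illustrative sketch, assuming a symmetric 0/1 adjacency matrix with zero diagonal; the k-star function returns raw counts because the slides do not state which denominator turns them into proportions.

```python
from math import comb
import numpy as np

def degree_distribution(adj):
    """d_k(y): proportion of actors with degree k, for k = 0, ..., n-1."""
    n = adj.shape[0]
    return np.bincount(adj.sum(axis=1), minlength=n) / n

def k_star_count(adj, k):
    """Number of k-stars: each node of degree d is the center of C(d, k) stars."""
    total = sum(comb(int(d), k) for d in adj.sum(axis=1))
    # A 1-star is an edge, and each edge is seen from both of its endpoints.
    return total // 2 if k == 1 else total

def triangle_proportion(adj):
    """t_1(y): proportion of triads that form a complete sub-graph."""
    n = adj.shape[0]
    triangles = int(np.trace(adj @ adj @ adj)) // 6
    return triangles / comb(n, 3)

# Triangle on nodes 0, 1, 2 plus a pendant edge 0-3.
adj = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]])
print(degree_distribution(adj), k_star_count(adj, 2), triangle_proportion(adj))
```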
![Page 49: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/49.jpg)
Other statistics motivated by conditional independence
⇒ Pattison and Robins (2002), Butts (2005)
⇒ Snijders, Pattison, Robins and Handcock (2004)
– Yuj and Yiv in Y are conditionally independent given the rest of the network if they could not produce a cycle in the network
New specifications for ERGMs
Figure 2: Partial conditional dependence when four-cycle is created (nodes i, v, u, j)
(see Figure 2). This partial conditional independence assumption states that two possible edges with four distinct nodes are conditionally dependent whenever their existence in the graph would create a four-cycle. One substantive interpretation is that the possibility of a four-cycle establishes the structural basis for a “social setting” among four individuals (Pattison and Robins, 2002), and that the probability of a dyadic tie between two nodes (here, i and v) is affected not just by the other ties of these nodes but also by other ties within such a social setting, even if they do not directly involve i and v.

A four-cycle assumption is a natural extension of modeling based on triangles (three-cycles), and was first used by Lazega and Pattison (1999) in an examination of whether such larger cycles could be observed in an empirical setting to a greater extent than could be accounted for by parameters for configurations involving at most 3 nodes. Let us consider the four-cycle assumption alongside the Markov dependence. Under the Markov assumption, Yiv is conditionally dependent on each of Yiu, Yuv, Yij and Yjv, because these edge indicators share a node. So if yiu = yjv = 1 (the precondition in the four-cycle partial conditional dependence), then all five of these possible edges can be mutually dependent, and hence the exponential model (4) could contain a parameter corresponding to the count of such configurations. We term this configuration, given by

yiv = yiu = yij = yuv = yjv = 1,

a two-triangle (see Figure 3). It represents the edge yij = 1 as part of the triadic setting yij = yiv = yjv = 1 as well as the setting yij = yiu = yju = 1.

Motivated by this approach, we introduce here a generalization of triadic structures in the form of graph configurations that we term k-triangles. For a non-directed graph, a k-triangle with base (i, j) is defined by the presence of a base edge i − j together with the presence of at least k other nodes adjacent to both i and j. We denote a ‘side’ of a k-triangle as any edge that is not the base. The integer k is called the order of the k-triangle. Thus a k-triangle is a combination of k individual triangles, each sharing the same edge i − j. The concept of a k-triangle can be seen as a triadic analogue of a
![Page 51: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/51.jpg)
This produces statistics of the form:
edgewise shared partner distribution: espk(y) = proportion of edges between actors with exactly k shared partners, k = 0, 1, . . .
k-triangle distribution: tk(y) = proportion of k-triangles in the graph y.
Figure: The actors in the non-directed (i, j) edge have 5 shared partners (a k-triangle for k = 5, i.e., a 5-triangle, with base edge (i, j) and shared partners h1, . . . , h5)
dyadwise shared partner distribution: dspk(y) = proportion of dyads with exactly k shared partners, k = 0, 1, . . .
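Both shared partner distributions can be read off the squared adjacency matrix, whose (i, j) entry counts the common neighbours of i and j. The following is a minimal sketch under that observation, assuming a symmetric 0/1 adjacency matrix with zero diagonal; the function name is illustrative.

```python
import numpy as np

def shared_partner_distributions(adj):
    """Return (esp, dsp), each indexed by k = 0, ..., n-2 shared partners."""
    adj = np.asarray(adj, dtype=int)
    n = adj.shape[0]
    shared = adj @ adj                      # shared[i, j] = number of common neighbours
    upper = np.triu_indices(n, k=1)         # each unordered dyad exactly once
    partners = shared[upper]
    is_edge = adj[upper].astype(bool)

    # dsp_k: proportion of all dyads with exactly k shared partners.
    dsp = np.bincount(partners, minlength=n - 1) / partners.size

    # esp_k: the same proportion, restricted to dyads that are edges.
    n_edges = int(is_edge.sum())
    edge_counts = np.bincount(partners[is_edge], minlength=n - 1)
    esp = edge_counts / n_edges if n_edges > 0 else np.zeros(n - 1)
    return esp, dsp

# Triangle on nodes 0, 1, 2 plus a pendant edge 0-3.
adj = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]])
esp, dsp = shared_partner_distributions(adj)
print(esp, dsp)
```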
![Page 52: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/52.jpg)
Structural Signatures
– identify social constructs or features
– based on intuitive notions or partial appeal to substantive theory
Clusters of edges are often transitive:
Recall t1(y) is the proportion of triangles amongst triads

$$t_1(y) = \frac{1}{\binom{g}{3}} \sum_{\{i,j,k\} \in \binom{g}{3}} y_{ij}\, y_{ik}\, y_{jk}$$

A closely related quantity is the proportion of triangles amongst 2-stars

$$C(y) = \frac{3 \times t_1(y)}{s_2(y)}$$

Also called mean clustering coefficient
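The ratio of triangles to 2-stars can be sketched as below. One assumption to flag: the code works with raw counts of triangles and 2-stars rather than the proportions t1(y) and s2(y), which gives the usual value in [0, 1] because every triangle contains three 2-stars.

```python
import numpy as np

def clustering_coefficient(adj):
    """3 * (number of triangles) / (number of 2-stars); 0.0 if there are no 2-stars."""
    adj = np.asarray(adj, dtype=int)
    degrees = adj.sum(axis=1)
    two_stars = int((degrees * (degrees - 1) // 2).sum())   # sum over nodes of C(d, 2)
    if two_stars == 0:
        return 0.0
    triangles = int(np.trace(adj @ adj @ adj)) // 6          # closed 3-walks / 6
    return 3 * triangles / two_stars

# Triangle on nodes 0, 1, 2 plus a pendant edge 0-3: one triangle, five 2-stars.
adj = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]])
print(clustering_coefficient(adj))
```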
![Page 56: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/56.jpg)
Example: A simple model-class with transitivity
n = 50 actors, N = 1225 pairs, $10^{369}$ graphs

$$P(Y = y) = \frac{\exp\{\eta_1 E(y) + \eta_2 C(y)\}}{\kappa(\eta_1, \eta_2)}, \qquad y \in \mathcal{Y}$$

where
E(y) is the density of edges (0 – 1)
C(y) is the triangle percent (0 – 100)
If we set the density of the graph to have about 50 edges then the expected triangle percent is 3.8%
Suppose we set the triangle percent large to reflect transitivity in the graph: 38%
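The counts quoted for this example can be reproduced with a few lines of arithmetic. This is only a check of the numbers on the slide: the number of pairs is C(50, 2), the number of non-directed graphs is 2 raised to that power, and 50 edges correspond to a density of roughly 4%.

```python
from math import comb, log10

n_actors = 50
n_pairs = comb(n_actors, 2)             # number of unordered pairs of actors
log10_n_graphs = n_pairs * log10(2)     # log10 of 2**n_pairs possible graphs
density_50_edges = 50 / n_pairs         # density when the graph has 50 edges

print(n_pairs)
print(f"about 10^{log10_n_graphs:.1f} graphs")
print(f"density with 50 edges: {density_50_edges:.3f}")
```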
![Page 57: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/57.jpg)
How can we tell if the model is useful?
Does this model capture transitivity and density in a flexible way?
By construction, graphs from this model have average density 4% and average triangle percent 38%
If the model is a good representation of transitivity and density we expect the graphs drawn from the model to be close to these values.
What do graphs produced by this model look like?
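One way to look at graphs from such a model is to simulate them. The following is a minimal Metropolis sketch that toggles one dyad at a time, not the sampler used for the talk. The values of η1 and η2 are not given on the slides, so the ones in the usage lines are placeholders, and the graph is kept much smaller than n = 50 so the example runs quickly.

```python
from math import comb
import numpy as np

def density(adj):
    """E(y): fraction of dyads that are edges (0 to 1)."""
    n = adj.shape[0]
    return adj.sum() / 2 / comb(n, 2)

def triangle_percent(adj):
    """C(y): 100 * 3 * triangles / 2-stars (0 to 100); 0 if there are no 2-stars."""
    degrees = adj.sum(axis=1)
    two_stars = int((degrees * (degrees - 1) // 2).sum())
    if two_stars == 0:
        return 0.0
    triangles = int(np.trace(adj @ adj @ adj)) // 6
    return 100.0 * 3 * triangles / two_stars

def sample_graph(n, eta1, eta2, n_steps, rng):
    """Metropolis chain targeting P(y) proportional to exp{eta1*E(y) + eta2*C(y)}."""
    adj = np.zeros((n, n), dtype=int)
    score = eta1 * density(adj) + eta2 * triangle_percent(adj)
    for _ in range(n_steps):
        i, j = rng.choice(n, size=2, replace=False)
        adj[i, j] = adj[j, i] = 1 - adj[i, j]          # propose toggling dyad (i, j)
        new_score = eta1 * density(adj) + eta2 * triangle_percent(adj)
        if np.log(rng.random()) < new_score - score:
            score = new_score                           # accept the toggle
        else:
            adj[i, j] = adj[j, i] = 1 - adj[i, j]       # reject: undo the toggle
    return adj

rng = np.random.default_rng(2007)
graph = sample_graph(n=10, eta1=-20.0, eta2=0.05, n_steps=500, rng=rng)  # placeholder etas
print(density(graph), triangle_percent(graph))
```

Repeating the draw many times and plotting density against triangle percent gives a picture of where the model actually puts its probability mass, which is the question the slide raises.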
![Page 59: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/59.jpg)
Figure: Distribution of Graphs from this model. Horizontal axis: density of the graph (ticks from 0.01 to 0.07). Vertical axis: percent transitive triples in the graph (ticks from 0 to 100). Marked point: target density and transitivity.
![Page 60: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/60.jpg)
Curved Exponential Family Models
Suppose that η is modeled as a function of a lower dimensional parameter: θ ∈ R^p

$$P(Y = y) = \frac{\exp\{\eta(\theta)\cdot g(y)\}}{\kappa(\theta, \mathcal{Y})}, \qquad y \in \mathcal{Y}$$

Hunter and Handcock (2004)
Suppose we focus on a model for network degree distribution and clustering

$$\log\left[P_\theta(Y = y)\right] = \eta(\phi)\cdot d(y) + \nu C(y) - \log c(\phi, \nu, \mathcal{Y}) \qquad (1)$$

where d(y) = {d_1(y), . . . , d_{n−1}(y)} are the network degree distribution counts.
Any degree distribution can be specified by n − 1 or fewer independent parameters.
![Page 63: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/63.jpg)
Statistical Inference for η
Base inference on the loglikelihood function,
$$\ell(\eta) = \eta\cdot g(y_{\text{obs}}) - \log \kappa(\eta)$$

$$\kappa(\eta) = \sum_{\text{all possible graphs } z} \exp\{\eta\cdot g(z)\}$$
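The sum defining κ(η) runs over every graph on the node set, so it can only be evaluated directly for very small networks. The following is a minimal sketch, assuming undirected graphs and the two statistics g(y) = (number of edges, number of triangles); the function names are illustrative and are not taken from the statnet software.

```python
from itertools import combinations, product
from math import exp, log


def graph_stats(edges, n):
    """Return g(y) = (edge count, triangle count) for an undirected graph.

    `edges` is an iterable of pairs (i, j) with i < j.
    """
    edge_set = set(edges)
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set
    )
    return (len(edge_set), triangles)


def all_graphs(n):
    """Yield every undirected graph on n labelled nodes as a tuple of edges."""
    dyads = list(combinations(range(n), 2))
    for mask in product((0, 1), repeat=len(dyads)):
        yield tuple(d for d, bit in zip(dyads, mask) if bit)


def kappa(eta, n):
    """Normalizing constant: sum of exp{eta . g(z)} over all graphs z."""
    total = 0.0
    for z in all_graphs(n):
        g = graph_stats(z, n)
        total += exp(sum(e * s for e, s in zip(eta, g)))
    return total


def loglik(eta, y_obs, n):
    """Log-likelihood l(eta) = eta . g(y_obs) - log kappa(eta)."""
    g_obs = graph_stats(y_obs, n)
    return sum(e * s for e, s in zip(eta, g_obs)) - log(kappa(eta, n))


if __name__ == "__main__":
    # A triangle on nodes 0, 1, 2 in a 4-node network (illustrative data).
    y_obs = ((0, 1), (0, 2), (1, 2))
    print(loglik((-1.0, 0.5), y_obs, n=4))
```

With n nodes there are 2^(n(n−1)/2) undirected graphs (64 for n = 4, about 3.5 × 10^13 for n = 10), which is why the likelihood has to be approximated for networks of realistic size.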
![Page 64: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/64.jpg)
Mean-value representation of the model
Let Pν(K = k) be the PMF of K, the number of ties that a randomly chosen node in the network has.
An alternative parameterization: (φ, ρ) where the mapping is:

$$\rho = E_{\phi,\rho}[C(Y)] = \sum_{y \in \mathcal{Y}} C(y) \exp\left[\eta(\phi)\cdot d(y) + \nu C(y) - \log c(\phi, \nu, \mathcal{Y})\right] \ge 0 \qquad (2)$$

$$P_\nu(K = k) = E_{\phi,\rho}[d_k(Y)], \qquad k = 0, \ldots, n-1 \qquad (3)$$

– ρ is the mean clustering coefficient over networks in Y.
– ν controls the parametrization of the degree distribution
![Page 65: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/65.jpg)
Illustrations of good models within this model-class
village-level structure
– n = 50
– mean clustering coefficient = 15%
– degree distribution: Yule with scaling exponent 3.
larger-level structure
– n = 1000
– mean clustering coefficient = 15%
– degree distribution: Yule with scaling exponent 3.
Attribute mixing
– Two-sex populations
– mean clustering coefficient = 15%
– degree distribution: Yule with scaling exponent 3.
![Page 66: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/66.jpg)
Figure: Network plots. Panel titles (each appears twice): "Yule with zero clustering coefficient conditional on degree" and "Yule with clustering coefficient 15%".
![Page 67: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/67.jpg)
Figure: Network plots. Panels: Heterosexual Yule with no correlation (tripercent = 3); Heterosexual Yule with strong correlation (tripercent = 60.6); Heterosexual Yule with modest correlation (tripercent = 0); Heterosexual Yule with negative correlation (tripercent = 0).
![Page 68: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/68.jpg)
Application to a Protein-Protein Interaction Network
Two proteins are said to interact if their amino acid chains were experimentally identified to bind to each other.
The network is for E. coli and is drawn from the “Database of Interacting Proteins (DIP)” http://dip.doe-mbi.ucla.edu
For simplicity we focus on proteins that interact with themselves and have at least one other interaction: 108 proteins and 94 interactions.
![Page 69: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/69.jpg)
Figure: A protein-protein interaction network for E. coli. The nodes represent proteins and the ties indicate that the two proteins are known to interact with each other.
![Page 70: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/70.jpg)
Statistical Inference and Simulation
Simulate using a Metropolis-Hastings algorithm (Handcock 2002).
Here base inference on the likelihood function
For computational reasons, approximate the likelihood via Markov Chain Monte Carlo (MCMC)
Use maximum likelihood estimates (Geyer and Thompson 1992)

| Parameter | est. | s.e. |
|---|---|---|
| Scaling decay rate (φ) | 3.034 | 0.3108 |
| Correlation Coefficient (ν) | 1.176 | 0.1457 |

Table: MCMC maximum likelihood parameter estimates for the protein-protein interaction network.
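As an illustration of the simulation step, here is a minimal sketch of an edge-toggle Metropolis-Hastings sampler for an exponential family graph model. It assumes an undirected graph and the statistics g(y) = (edges, triangles), which are not the statistics of the fitted model in the table; it is a generic textbook sampler, not the algorithm of Handcock (2002) or the statnet implementation.

```python
import random
from itertools import combinations
from math import exp


def graph_stats(adj):
    """Return g(y) = (edge count, triangle count) from a 0/1 adjacency matrix."""
    n = len(adj)
    edges = sum(adj[i][j] for i, j in combinations(range(n), 2))
    triangles = sum(
        adj[a][b] and adj[a][c] and adj[b][c]
        for a, b, c in combinations(range(n), 3)
    )
    return (edges, triangles)


def mh_sample(eta, n, n_steps, rng, adj=None):
    """Run n_steps edge-toggle proposals targeting P(Y = y) ∝ exp{eta . g(y)}.

    Starts from the empty graph unless an adjacency matrix is supplied.
    """
    if adj is None:
        adj = [[0] * n for _ in range(n)]
    for _ in range(n_steps):
        i, j = rng.sample(range(n), 2)  # pick a dyad uniformly at random
        sign = -1 if adj[i][j] else 1   # removing or adding the tie
        # Toggling (i, j) changes the triangle count by the number of
        # partners that i and j share.
        shared = sum(
            adj[i][k] and adj[j][k] for k in range(n) if k != i and k != j
        )
        delta = (sign, sign * shared)   # change statistics g(y') - g(y)
        log_ratio = sum(e * d for e, d in zip(eta, delta))
        # The proposal is symmetric, so accept with probability min(1, ratio).
        if log_ratio >= 0 or rng.random() < exp(log_ratio):
            adj[i][j] = adj[j][i] = 1 - adj[i][j]
    return adj


if __name__ == "__main__":
    rng = random.Random(2007)
    y = mh_sample(eta=(-1.5, 0.2), n=20, n_steps=20000, rng=rng)
    print(graph_stats(y))
```

Only the change statistics are needed at each step, so the normalizing constant never has to be computed. In practice the chain is run for a burn-in period and thinned before the sampled statistics are used.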
![Page 71: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/71.jpg)
Approximating the loglikelihood
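Suppose Y_1, Y_2, . . . , Y_m are i.i.d. draws from P_{η0}(Y = y) for some η0.
Using the law of large numbers, the difference in log-likelihoods is

$$\begin{aligned}
\ell(\eta) - \ell(\eta_0) &= (\eta - \eta_0)\cdot g(y_{\text{obs}}) + \log\frac{\kappa(\eta_0)}{\kappa(\eta)} \\
&= -\log E_{\eta_0}\left(\exp\{(\eta - \eta_0)\cdot(g(Y) - g(y_{\text{obs}}))\}\right) \\
&\approx -\log \frac{1}{m}\sum_{i=1}^{m} \exp\{(\eta - \eta_0)\cdot(g(Y_i) - g(y_{\text{obs}}))\} \\
&\equiv \tilde{\ell}(\eta) - \tilde{\ell}(\eta_0).
\end{aligned}$$

Simulate Y_1, Y_2, . . . , Y_m using a MCMC (Metropolis-Hastings) algorithm ⇒ Handcock (2002).
Approximate the MLE by $\hat{\eta} = \arg\max_\eta \{\tilde{\ell}(\eta) - \tilde{\ell}(\eta_0)\}$ (MC-MLE) ⇒ Geyer and Thompson (1992)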
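The approximation can be checked numerically on a network small enough to enumerate. The sketch below estimates ℓ(η) − ℓ(η0) = (η − η0)·g(y_obs) − log[κ(η)/κ(η0)] by replacing the ratio κ(η)/κ(η0) = E_{η0} exp{(η − η0)·g(Y)} with a sample average. It assumes undirected graphs on 4 nodes, statistics g(y) = (edges, triangles), and η0 = 0, so that exact i.i.d. draws are available by flipping a fair coin for each dyad; all names and parameter values are illustrative.

```python
import random
from itertools import combinations, product
from math import exp, log


def stats_from_bits(bits, dyads, n):
    """g(y) = (edges, triangles) for a graph given as 0/1 indicators over dyads."""
    present = {d for d, b in zip(dyads, bits) if b}
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if (a, b) in present and (a, c) in present and (b, c) in present
    )
    return (len(present), triangles)


def approx_loglik_diff(eta, eta0, g_obs, g_samples):
    """Monte Carlo estimate: -log (1/m) sum_i exp{(eta - eta0).(g(Y_i) - g_obs)}."""
    total = 0.0
    for g in g_samples:
        total += exp(
            sum((e - e0) * (s - o) for e, e0, s, o in zip(eta, eta0, g, g_obs))
        )
    return -log(total / len(g_samples))


def exact_loglik_diff(eta, eta0, g_obs, n):
    """Exact l(eta) - l(eta0) by enumerating all graphs (tiny n only)."""
    dyads = list(combinations(range(n), 2))

    def log_kappa(par):
        return log(
            sum(
                exp(sum(p * s for p, s in zip(par, stats_from_bits(bits, dyads, n))))
                for bits in product((0, 1), repeat=len(dyads))
            )
        )

    linear = sum((e - e0) * o for e, e0, o in zip(eta, eta0, g_obs))
    return linear - (log_kappa(eta) - log_kappa(eta0))


def sample_stats_uniform(n, m, rng):
    """Draw m graphs i.i.d. from P_{eta0} with eta0 = 0 (each dyad a fair coin)."""
    dyads = list(combinations(range(n), 2))
    return [
        stats_from_bits([rng.random() < 0.5 for _ in dyads], dyads, n)
        for _ in range(m)
    ]


if __name__ == "__main__":
    n, eta0, eta = 4, (0.0, 0.0), (0.2, 0.1)
    g_obs = (3, 1)  # observed graph: a single triangle
    samples = sample_stats_uniform(n, 20000, random.Random(2007))
    print(approx_loglik_diff(eta, eta0, g_obs, samples))
    print(exact_loglik_diff(eta, eta0, g_obs, n))
```

The estimate is most reliable when η is close to η0; as η moves away, a few samples dominate the average and its variance grows, which is one reason the reference value η0 is usually updated iteratively.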
![Page 77: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/77.jpg)
Modeling Network Dynamics
Suppose we wish to represent the dynamics at t = 0, 1, . . . , T time points

$$Y_{ijt} = \begin{cases} 1 & \text{relationship from actor } i \text{ to actor } j \text{ at time } t \\ 0 & \text{otherwise} \end{cases}$$

Need a model that
– has the correct cross-sectional statistics
– has the correct durations for relationships
– has realistic dissolution and formation of relationships
![Page 79: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/79.jpg)
A Naive Model for Longitudinal Network Data
Consider a dynamic variant of the cross-sectional ERGM:
$$P_\eta(Y_{t+1} = y_{t+1} \mid Y_t = y_t) = \frac{\exp\left(\eta_{t+1}\cdot g(y_{t+1}; y_t)\right)}{\sum_{s \in \mathcal{Y}} \exp\left(\eta_{t+1}\cdot g(s; y_t)\right)}, \qquad t = 2, \ldots, T$$

where g_k(y_{t+1}; y_t) are statistics formed from y_{t+1} given y_t
– Robins and Pattison (2000) Discrete temporal ERGM
– Morris and Handcock (2001) Discrete temporal ERGM
– Hanneke and Xing (2006) Discrete temporal ERGM
– Guo, Hanneke, Fu and Xing (2007) Hidden temporal ERGM
![Page 80: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/80.jpg)
Two-Phase Dynamic Model
Consider a Markovian model with transition probabilities from Yt to Yt+1 governed by simultaneous dissolution and formation phases
![Page 81: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/81.jpg)
Formation Phase
$$\Pr(Y^+ = y^+ \mid Y_0 = y_0; \beta) = \frac{e^{\beta\cdot g^+(y^+, y_0)}\, 1_{y^+ \supseteq y_0}}{c^+(\beta, y_0)}, \qquad y^+ \in \mathcal{Y}$$
Y0 −→ Y+
β completely controls the incidence
![Page 82: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/82.jpg)
Dissolution Phase
$$\Pr(Y^- = y^- \mid Y_0 = y_0; \gamma) = \frac{e^{\gamma\cdot g^-(y^-, y_0)}\, 1_{y^- \subseteq y_0}}{c^-(\gamma, y_0)}, \qquad y^- \in \mathcal{Y}$$

Y0 −→ Y−
γ completely controls the durations of partnerships
![Page 83: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/83.jpg)
Simultaneous Formation and Dissolution
![Page 84: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/84.jpg)
Simultaneous Formation and Dissolution: Transition Probability
Pr(Y1 = y1|Y0 = y0;β, γ) = p−(y1 ∩ y0|y0; γ)× p+(y1 ∪ y0|y0;β)
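The factorization says that the ties of y1 that were already in y0 come from the dissolution phase (y1 ∩ y0) and the new ties come from the formation phase (y1 ∪ y0 minus y0). Below is a minimal sketch of one such transition, assuming the simplest possible statistics: g+ and g− are both plain edge counts, so every dyad behaves independently. Under that assumption each empty dyad gains a tie with probability e^β/(1 + e^β) and each existing tie is kept with probability e^γ/(1 + e^γ). The function names are hypothetical.

```python
import random
from itertools import combinations
from math import exp


def logistic(x):
    """Inverse logit: e^x / (1 + e^x)."""
    return 1.0 / (1.0 + exp(-x))


def two_phase_step(y0, n, beta, gamma, rng):
    """One transition Y0 -> Y1 with independent formation and dissolution.

    y0 is a set of dyads (i, j) with i < j. Returns (y1, y_plus, y_minus).
    Assumes edge-count statistics only, an illustrative special case.
    """
    dyads = list(combinations(range(n), 2))
    p_form = logistic(beta)   # chance that an empty dyad gains a tie
    p_keep = logistic(gamma)  # chance that an existing tie persists

    # Formation phase: y_plus contains y0 plus any newly formed ties.
    y_plus = set(y0) | {
        d for d in dyads if d not in y0 and rng.random() < p_form
    }
    # Dissolution phase: y_minus keeps a subset of the ties in y0.
    y_minus = {d for d in dyads if d in y0 and rng.random() < p_keep}

    # Combine: surviving old ties together with newly formed ties.
    y1 = y_minus | (y_plus - set(y0))
    return y1, y_plus, y_minus


if __name__ == "__main__":
    rng = random.Random(2007)
    y = {(0, 1), (1, 2), (2, 3)}
    for _ in range(5):
        y, _, _ = two_phase_step(y, n=6, beta=-2.0, gamma=3.0, rng=rng)
    print(sorted(y))
```

Both phases condition on the same y0, which is what makes them simultaneous: a tie formed in this step cannot be dissolved until the next step.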
![Page 85: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/85.jpg)
Markov Process
Y0→ Y1→ Y2→ Y3→ . . .
![Page 86: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/86.jpg)
Equilibrium Distribution
$$Y_t \xrightarrow{D} Y \sim \Pr(Y = y; \beta, \gamma)$$
![Page 87: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/87.jpg)
Back to Prevalence
Prevalence = Incidence × Duration
(Prevalence corresponds to the equilibrium distribution, incidence to the formation phase, and duration to the dissolution phase.)
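The identity can be verified by hand in the simplest case. The sketch below assumes a dyad-independent model in which an empty dyad forms a tie with probability f = e^β/(1 + e^β) per time step and an existing tie dissolves with probability d = 1 − e^γ/(1 + e^γ) per time step; this is an illustrative special case, not the model fitted in the talk. Each dyad is then a two-state Markov chain, and at equilibrium the expected number of ties formed per step equals the expected number dissolved.

```python
from math import exp


def logistic(x):
    """Inverse logit: e^x / (1 + e^x)."""
    return 1.0 / (1.0 + exp(-x))


def equilibrium_summary(beta, gamma):
    """Return (prevalence, incidence, duration) for one dyad at equilibrium.

    prevalence: stationary probability that the dyad holds a tie
    incidence:  expected new ties per dyad per time step
    duration:   mean number of time steps a tie lasts
    """
    f = logistic(beta)         # per-step formation probability
    d = 1.0 - logistic(gamma)  # per-step dissolution probability
    prevalence = f / (f + d)   # solves (1 - p) * f = p * d
    incidence = (1.0 - prevalence) * f
    duration = 1.0 / d         # mean of a geometric lifetime
    return prevalence, incidence, duration


if __name__ == "__main__":
    prev, inc, dur = equilibrium_summary(beta=-6.0, gamma=4.0)
    print(prev, inc * dur)  # the two values agree up to rounding
```

In this special case β alone sets the formation probability and γ alone sets the mean duration, so the equilibrium density follows from the two parameters separately.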
![Page 92: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/92.jpg)
Application to the Dynamics of HIV Spread
Data: The National Longitudinal Study of Adolescent Health
– Wave III (with retrospective duration information)
– Take into account age, sex, race (white/non-white) and age-sex “mixing” patterns
Estimate the parameters of the model based on the likelihood.
Consider a (quasi)population of 10000 people (about half men and half women)
Simulate dynamics of sexual networks over 10 years
– the time step is daily (3650 steps)
Simulate disease spread based on 10 “seeds” (2 non-white)
– with a daily time step we have good control over the micro-structure of transmission
Visualize only those that become infected
![Page 98: Statistical Models for Social Networks with Application to HIV …murphyk/nips07NetworkWorkshop/... · 2008-02-18 · Statistical Models for Social Networks with Application to HIV](https://reader033.fdocuments.us/reader033/viewer/2022042008/5e7106f55b0fc96ced608056/html5/thumbnails/98.jpg)
Conclusions and Challenges
Network models are a very constructive way to represent (social) theory
Some seemingly simple models are not so.
Large and deep literatures exist but are often ignored
Simple models are being used to capture structural properties
The inclusion of attributes is very important
– actor attributes
– dyad attributes e.g. homophily, race, location
– structural terms e.g. transitive homophily
Software: A suite of R packages to implement this is available: statnetproject.org
See the papers at: statnetproject.org/users guide.shtml
To appear as a special issue of the Journal of Statistical Software, Volume 24.