University of Amsterdam
MSc Theoretical Physics
Master Thesis
Social multiplexity in a generalized Axelrod model of cultural dissemination
Author:
Arjen Aerts
Supervisor:
dr. Diego Garlaschelli
April 8, 2015
Abstract
Social multiplexity is a ubiquitous feature of human social life. This thesis investigates the
effect of social multiplexity on cultural dynamics in terms of cultural convergence,
using a generalized version of the Axelrod model that incorporates network multiplexity and
bounded confidence. This is mostly a computational study; where possible, analytical results are
established as well. First, the effect of having multiple networks on the end-state phase
transition, with the confidence threshold as control parameter, is studied for the following
scenarios: random graphs, network-culture assortativity, updating networks and initial realistic
cultures. Second, using the same scenarios, the model dynamics are explicitly analyzed for some
values of the threshold. Third, attention is paid to the effect of cultural evolution on the
underlying social networks in the presence of network updating. It is found that the effect of
multiplexity differs between treatments, but in most cases promotes cultural convergence. An
important mechanism is that local differences in connectivity between the layers lead indirectly
to more cultural convergence, while increasing the time to reach the end-state. When
assortativity is present, the effect of multiplexity becomes non-monotone. Network updating
reduces cultural convergence and induces network dynamics that strongly depend on the
confidence threshold. In the case of realistic initial cultures more diverse behavior is shown with
an additional phase between full cultural divergence and full cultural convergence. This phase
turns out to be unstable when multiplexity is present. Moreover, in the second phase transition
non-equilibrium behavior in the dynamics is shown to result from competition between the
layers. Finally, in most cases the layer networks did not resemble realistic social networks.
Contents
1 Introduction
2 Literature review
2.1 Original Axelrod model
2.2 Extensions of the Axelrod model
2.2.1 Networks
2.2.2 Other extensions
2.3 Social multiplex
3 The model
3.1 Network-culture assortativity
3.1.1 Properties of the network
3.2 Generalized Axelrod model
3.3 End-state
4 Simulation set-up
4.1 Routes of investigation
4.1.1 Clusterings of the culture: global and layer dependent
4.1.2 Random initial culture
4.1.3 Structured initial culture
4.2 End-state analysis
4.2.1 Cluster size entropy
4.3 Dynamical analysis
4.3.1 Network observables
4.3.2 Variation of information
5 Results and discussion
5.1 Treatment 0: trivial multiplex
5.1.1 End-state results
5.1.2 Discussion
5.2 Treatment 1: random culture and random, static networks
5.2.1 End-state results
5.2.2 Dynamical results
5.2.3 Discussion
5.3 Treatment 2: network-culture assortativity
5.3.1 End-state results
5.3.2 Discussion
5.4 Treatment 3: updating networks
5.4.1 End-state results
5.4.2 Dynamical results
5.4.3 Discussion
5.5 Treatment 4: structured culture
5.5.1 End-state results
5.5.2 Dynamical results
5.5.3 Discussion
6 Conclusion
1 Introduction
The presence of multiple interacting network layers in a system can have
a profound impact on its dynamics. For example, in 2003 a large-scale blackout
hit Italy, which was particularly severe because the electricity network was coupled to the internet
communication network [Buldyrev et al., 2010]. Multiple interacting networks or, more specifically,
multiplices (i.e. networks that have different types of connections in different layers) are ubiquitous
features of complex systems and social systems are no exception [Boccaletti et al., 2014].
In social science, multiplexity is an important area of study, since interaction in social systems
is often within different (evolving) social environments. However, due to their complexity, multiplex
social systems have not been studied extensively, apart from some recent examples [Quattrociocchi et al., 2014, Palchykov
et al., 2014]. On the other hand, much work has been done on studying the Axelrod model,
a stochastic cellular automaton based on principles from social science that describes cultural
dynamics (i.e. the evolution of culture), where culture is represented as a stylized object [Axelrod,
1997, Castellano et al., 2000, Klemm et al., 2003b].
This thesis will mainly investigate the effect of social multiplexity on cultural dynamics in
terms of cultural convergence, using a generalized version of the Axelrod model. In addition, some
attention will be paid to the effect of this cultural evolution on the underlying social networks. The
thesis uses an interdisciplinary statistical-physics approach to understand these aspects. Studying
cultural dynamics is important for understanding the causes of cultural diversity and collective
social behavior.
The generalized Axelrod model that is used here incorporates the presence of network multiplexity
and bounded confidence, in addition to optional features: network assortativity, updating
networks and empirically realistic initial cultures. It aims to model the interaction of social and
cultural dynamics, with an association between social networks and subcultures. Since it is difficult
to study the model analytically even in its simplest form, simulation will be used to study the
model. An advantage of this is that many different scenarios can be studied in a similar way. In
addition, analytical results will be established where possible.
Note that an example of social multiplexity has already been studied in the context of the
Axelrod model by coupling the Axelrod model with social resource sharing dynamics [Huang and
Liu, 2010]; this is a form of social multiplexity that does not imply having multiple networks. In
the current work, social multiplexity is not of this kind, because the multiplexity is part of the
Axelrod model and implies having multiple networks. When referring to social multiplexity, it is
always meant here that the multiplexity concerns multiple networks (or layers).
For different regions of the model’s parameter space, both the end-state and the dynamical
behavior are investigated in terms of the effect of having multiple layers. More specifically, the
effect of multiplexity on the phase transition (with the confidence threshold as control parameter)
is studied for the following scenarios: random graphs, network assortativity, updating networks
and initial realistic cultures. In addition, the effect of the Axelrod model on the structure of the
networks is also investigated when the networks update.
The rest of the thesis is outlined as follows. In Section 2 parts of the literature on the Axelrod
model are reviewed. Then, Section 3 will present the model, while we elaborate on the
simulation set-up in Section 4. The results and discussion are covered in Section 5. More specifically,
Subsection 5.1 finds that the generalized Axelrod model introduces a compartmentalization
that has some effect on its dynamical behavior. Subsection 5.2 shows that in the simplest case
multiplexity leads to more cultural convergence, but this originates from multiple effects, some of
which counteract cultural convergence. The most important mechanism is that local differences in
connectivity between the layers lead indirectly to more cultural convergence, while increasing the
time to reach the end-state. In addition, Subsection 5.3 finds that assortativity largely promotes
cultural convergence, but introduces non-monotonicity in the effect of multiplexity. Updating networks,
investigated in Subsection 5.4, reduce cultural convergence by decreasing opportunities for
interaction with culturally more distant individuals at later times, so that the system settles down
faster. Furthermore, for some values of the confidence threshold there is an indication of the formation
of realistic networks over time. Then, in Subsection 5.5 it is found that the extra phase
that is created when using the structured initial culture becomes less stable when multiplexity
increases. Also, the effect of multiplexity on cultural convergence depends heavily on the confidence
threshold. Moreover, for some parameter values the system sometimes shows non-monotonic
dynamical behavior, paving the way for systematic non-equilibrium dynamics in the generalized
Axelrod model. Finally, Section 6 concludes the thesis.
2 Literature review
The model that will be discussed in Section 3 builds on several modeling paradigms. In this section
their place in the literature is discussed and some notation will be introduced.
2.1 Original Axelrod model
A model of cultural dynamics should strictly be viewed as a (generalized) opinion model (i.e. a model that
describes the evolution of agents’ opinions over time). In the literature, however, a distinction is
often made between opinion dynamics and cultural dynamics, where the former concerns opinion
models that use scalar variables, while the latter treats opinion models that have a vector of such
variables [Castellano et al., 2009]. Prominent examples of (scalar) opinion models are the Voter
model [Clifford and Sudbury, 1973] and the Deffuant model [Deffuant et al., 2000], where the
former treats an opinion as a binary variable (similar to the Ising model) and the latter treats an
opinion as a continuous variable, taking values in the unit interval.
The most widely used framework for studying cultural dynamics is the Axelrod model [Axelrod,
1997]. It generalizes the Voter model in two ways: first, it has more than one opinion variable (these are called
cultural features, and the full set of cultural features forms a cultural vector) and, secondly, it allows
more than two values per cultural feature (called cultural traits). The dynamics (which will be
outlined later in this subsection) are based on the assumption that similar (in terms of the cultural
vectors) individuals interact more (homophily) and interaction leads to higher similarity (social
influence), which are fundamental principles in social science. Much like the dynamics used for the
Ising model (e.g. Glauber, Metropolis), this constitutes a relaxation process, whereby the system
converges steadily towards an equilibrium.
In the original Axelrod model the most fundamental object is the cultural configuration (or
culture), which consists of elements that live in a cultural space. Formally, a cultural space is a
finite, discrete space (and, therefore, it is compact); an element from this space is called a cultural
vector $v = \{v^1, \ldots, v^F\}$, where $F$ is the dimension of the space, $v^k$ is called a cultural feature and
$v^k \in \{1, \ldots, q\}$ for $k = 1, \ldots, F$, where $q$ is some positive integer (i.e. each feature has $q$ possible traits).
A cultural feature could be any cultural property, for example the agent’s religion or its taste for
wine. The state-space of the model (i.e. the culture) then consists of N actors (or agents), where
each agent $i$ is associated to a cultural vector $v_i$. Note that the number of possible states the
system can be in is $q^{FN}$, which is finite, but very large for even moderate values of the parameters.
Define on this space a metric (or cultural distance) between agents $i$ and $j$ by
$$d_{ij} := d(v_i, v_j) := \frac{1}{F} \sum_{k=1}^{F} d^k(v_i^k, v_j^k),$$
where $d^k \in [0, 1] =: I$ is the cultural distance between two features (i.e. it is
normalized; infinite distances are not possible). Note that by definition it also holds that $d_{ij} \in I$.
In the original Axelrod paper [Axelrod, 1997], the variables were of nominal type, which means
that $d^k(v_i^k, v_j^k) = 1$ if $v_i^k \neq v_j^k$ and zero otherwise. This is convenient for modeling purposes and
realistic if there is no order in the variables. In this case, the cultural space has no boundaries and
no center. However, if there is order it is more realistic and necessary for empirical research (i.e.
questionnaires) to use ordinal variables, so that the distance between $v_i^k$ and $v_j^k$ is a function of
$|v_i^k - v_j^k|/(q - 1)$.
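As a minimal sketch of the two distance conventions, the cultural distance could be computed as follows. The function names and the example vectors are illustrative assumptions. The nominal distance is taken to be the fraction of features on which two agents differ, so that identical vectors have distance zero. For the ordinal case the feature distance is assumed to be $|v_i^k - v_j^k|/(q - 1)$ itself, which is only one possible choice of the function mentioned in the text.

```python
def nominal_distance(v_i, v_j):
    """Cultural distance for nominal features: fraction of features that differ."""
    assert len(v_i) == len(v_j)
    F = len(v_i)
    return sum(1 for a, b in zip(v_i, v_j) if a != b) / F


def ordinal_distance(v_i, v_j, q):
    """Cultural distance for ordinal features with traits in {1, ..., q}.

    Assumption: the per-feature distance is |a - b| / (q - 1); the text only
    requires it to be some function of this quantity.
    """
    assert len(v_i) == len(v_j) and q > 1
    F = len(v_i)
    return sum(abs(a - b) / (q - 1) for a, b in zip(v_i, v_j)) / F


# Hypothetical example with F = 4 features and q = 5 traits per feature.
v_i = [1, 3, 5, 2]
v_j = [1, 4, 1, 2]
d_nom = nominal_distance(v_i, v_j)     # two of the four features differ
d_ord = ordinal_distance(v_i, v_j, 5)  # (0 + 1/4 + 1 + 0) / 4
overlap = 1 - d_nom                    # similarity o_ij = 1 - d_ij
```

In this example the ordinal distance is smaller than the nominal one, because the second feature differs by only one step on the ordinal scale.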
In the original Axelrod model [Axelrod, 1997], it is assumed that the agents are organized on
a two dimensional lattice. At every time step the following actions occur in the indicated order:
(i) An agent $i$ is selected at random; (ii) A neighbor $j$ of $i$ is selected at random; (iii) The agents
interact with probability equal to their similarity $o_{ij} = 1 - d_{ij}$ (homophily); and (iv) Interaction
consists of $i$ copying a random feature $v_j^k$ of $j$ for $k$ such that $v_i^k \neq v_j^k$ (social influence).
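The four steps above can be sketched as a single update function. This is an illustration under stated assumptions, not the implementation used in this thesis: the function name, the neighbor-list representation of the lattice and the two-agent example are hypothetical, and nominal features are assumed.

```python
import random


def axelrod_step(culture, neighbors, rng):
    """Perform one update of the original Axelrod dynamics in place.

    culture:   list of cultural vectors (lists of traits), one per agent
    neighbors: neighbors[i] is the list of agents adjacent to agent i
    Returns True if a trait was copied, False otherwise.
    """
    i = rng.randrange(len(culture))       # (i) select a random agent
    if not neighbors[i]:
        return False
    j = rng.choice(neighbors[i])          # (ii) select a random neighbor
    differing = [k for k, (a, b) in enumerate(zip(culture[i], culture[j]))
                 if a != b]
    F = len(culture[i])
    overlap = 1 - len(differing) / F      # o_ij = 1 - d_ij (nominal features)
    # (iii) interact with probability equal to the overlap
    if differing and rng.random() < overlap:
        k = rng.choice(differing)         # (iv) copy one differing feature
        culture[i][k] = culture[j][k]
        return True
    return False


# Hypothetical example: two neighboring agents that differ in one of three features.
rng = random.Random(0)                    # fixed seed, chosen arbitrarily
culture = [[1, 2, 3], [1, 2, 4]]
neighbors = [[1], [0]]
steps = 0
while culture[0] != culture[1] and steps < 10_000:
    axelrod_step(culture, neighbors, rng)
    steps += 1
```

Pairs with zero overlap never interact and identical pairs have nothing to copy, which is why both kinds of pairs are inactive in the dynamics.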
Note that the Axelrod model is a stochastic dynamical system (i.e. stochastic process) that
starts at time $t = 0$ from initial condition $(v_i(0))_{i \in \{1, \ldots, N\}} =: v(0)$ and its dynamics are specified
by the updating mechanism. More specifically, it is a time-homogeneous Markov Chain that is
absorbing, which means that the system always ends up in an absorbing state, a state in which the
system will remain for all later time. The culture will converge to an absorbing state at an end-
time t = T in which every agent has the same cultural vector (full cultural convergence), to a state
with multiple clusters of identical agents with different cluster sizes (partial cultural convergence
or cultural divergence) or to a state with N singleton clusters (full cultural divergence). In the
original Axelrod model, $v(0)$ is generated by sampling each cultural feature of each cultural
vector from a discrete uniform distribution (taking values in $\{1, \ldots, q\}$), which will be referred to as the
random culture. Although some attempts have been made to study the Axelrod model analytically,
using Markov process theory [Lanchier, 2012] or a master equation approach [Castellano et al.,
2000, Vazquez and Redner, 2007, Vilone et al., 2002], these typically rely heavily on approximations
and cannot capture the full extent of the model.
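To make the random culture and the size of the state space concrete, the following sketch generates an initial culture and counts the possible states. The function name and the parameter values are illustrative assumptions only.

```python
import random


def random_culture(N, F, q, rng):
    """Sample N cultural vectors with F features, each uniform on {1, ..., q}."""
    return [[rng.randint(1, q) for _ in range(F)] for _ in range(N)]


rng = random.Random(1)          # fixed seed, chosen arbitrarily
N, F, q = 100, 5, 10            # hypothetical parameter values
culture = random_culture(N, F, q, rng)

# Number of possible states of the system: q^(F*N), here 10^500.
n_states = q ** (F * N)
```

Even for these moderate parameter values the state space is far too large to enumerate, which illustrates why exact analytical treatment is difficult.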
When analyzing the Axelrod model, one usually studies observables that condense the state-
space information. In the original paper [Axelrod, 1997], the author distinguished between cultural
regions, which are groups of adjacent nodes with identical features, and cultural zones, which are
groups of adjacent nodes that have positive overlap in their cultural features (i.e. the possibility
to interact). When talking about a clustering of the set of agents, we say that a condition has to
hold between a pair of nodes (e.g. a cultural distance of zero), but what is really meant is that two
nodes are part of the same cluster if there is a path between the nodes such that each consecutive
pair on this path satisfies this condition. Note that in the case of having a cultural distance of
zero, each pair in the corresponding cluster has this property because it is an equivalence relation,
but in general this is not the case.
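The path-based notion of a cluster described above amounts to finding connected components of the graph that keeps only those links satisfying the condition. A minimal sketch using a union-find structure is given below; the function name and the four-agent example are hypothetical.

```python
def clusters(n_agents, links, condition):
    """Group agents into clusters.

    Two agents end up in the same cluster if a path of links connects them
    such that condition(i, j) holds for every consecutive pair on the path.
    """
    parent = list(range(n_agents))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i, j in links:
        if condition(i, j):
            parent[find(i)] = find(j)

    groups = {}
    for a in range(n_agents):
        groups.setdefault(find(a), []).append(a)
    return sorted(groups.values())


# Hypothetical example: four agents on a line with F = 2 nominal features.
culture = [[1, 1], [1, 1], [1, 2], [2, 2]]
links = [(0, 1), (1, 2), (2, 3)]


def overlap(i, j):
    F = len(culture[i])
    return sum(a == b for a, b in zip(culture[i], culture[j])) / F


regions = clusters(4, links, lambda i, j: overlap(i, j) == 1)  # identical neighbors
zones = clusters(4, links, lambda i, j: overlap(i, j) > 0)     # positive overlap
```

In this example agents 0 and 3 share no trait, yet they belong to the same cultural zone because they are connected through agents 1 and 2. This shows that positive overlap, unlike zero distance, is not an equivalence relation.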
It was Axelrod’s observation that cultural convergence/divergence depends on the initial state
of the system and whether the boundaries of the cultural zones dissolve before the process of
homophily and social influence (i.e. the dynamics) settles down. Boundaries between cultural regions
within cultural zones tend to dissolve as well, due to the randomness inherent in the updating
mechanism. The initial state (and therefore the dynamics between cultural zones) changes as a
result of parameter changes; Axelrod, therefore, used these concepts to explain cultural convergence/divergence
for different parameters. In particular he looked at q, F, number of neighbors
on the lattice and N.
Applying this reasoning to the system, it is clear that when q is large there are more cultural
zones initially and thus more cultural regions in the end-state, hence more cultural diversity.
Similarly, when F is large there are fewer cultural zones initially, so more cultural convergence.
When the number of neighbors is large, there are also fewer cultural zones in the beginning, hence
more convergence. Finally, the effect of system size is most interesting. For small N , there are
not many different cultures to begin with, so the number of domains in the end-state is small.
However, for large N there is another effect that counteracts this one. If the system is large it
takes a long time for it to settle down. Therefore, there are more opportunities for the boundaries
(both regions and zones) to dissolve, resulting in fewer regions in the end-state. The effect of
varying N is therefore non-monotone.
Later, the model was studied extensively by the statistical physics community [Castellano
et al., 2009]. [Castellano et al., 2000] studied a (nonequilibrium) phase transition with q as a
control parameter for various values of F . Clearly, the two phases of such a one-dimensional phase
transition are the ordered state (full cultural convergence) and the disordered state (full cultural
divergence). It was observed that the phase transition is continuous for F = 2, but discontinuous
for larger F . In terms of dynamics, it was seen that the number of active links (i.e. pairs of agents
that have $0 < d_{ij} < 1$) showed non-monotonic behavior, increasing rapidly and then decreasing
again; this effect is especially pronounced around the phase transition. In general, studying such a
phase transition is interesting in itself but looking at the behavior for different values of a parameter
also makes it easy to identify the interesting parameter regions (i.e. around the critical value).
Another study [Guerra et al., 2010] showed that consensus is reached much faster for most single
cultural features than for the entire culture. Presumably, there is monotonicity in most of the
cultural features but not in the culture as a whole. A separation of time-scales occurs, with a few
bottleneck features.
2.2 Extensions of the Axelrod model
There have been extensions of the Axelrod model, incorporating cultural drift [Klemm et al.,
2003a, Klemm et al., 2005], mass media [Gonzalez-Avella et al., 2005] and the role of dimensionality
[Klemm et al., 2003c]. Cultural drift implies that there is a (small) probability that agents change
the values of their cultural features without social interaction; clearly, this mechanism destabilizes
culturally divergent states. Secondly, introducing mass media means there is a global field that
influences all agents simultaneously, so that the Axelrod model does not consist of local interactions
only. Finally, Axelrod suggested a 2D lattice as an interaction structure, having a geographical
space in mind [Axelrod, 1997], but when thinking of the interaction structure more as a social
space, it becomes interesting to consider the effect of dimensionality. Some other extensions will
be discussed in detail below, as they will be important for the model discussed in Section 3.
2.2.1 Networks
Since the state space consists of finitely many agents, any graph (or network) structure can be
imposed on it. Clearly, such a network then represents a social space (i.e. a social network), so
that only agents with social ties can interact. Formally, a graph is an ordered pair (V,E), where
V is the set of nodes {1, ..., N} and E is a set of edges or links, where each link is related to two
nodes (i.e. a link is an object that links two nodes). (Note that in this thesis only unweighted,
undirected graphs with no self-links are considered.) A network can also be represented by an
adjacency matrix $A$, which is an $N$ by $N$ matrix, with elements $a_{ij} := A[i, j] = 1$ if $i \neq j$ and they
share a link, but zero otherwise. Note that $A$ is symmetrical and has a diagonal of zeros.
Before going further, some network measures are defined (most of the notation and definitions
in this paragraph and the next are from chapters 1 and 3, respectively, in [Barrat et al., 2008]). Let
$k_i$ be the degree (i.e. number of links) of node $i$. First, the link density $L$ is defined as the total
number of links $E$ divided by the total possible number of links, namely $N(N-1)/2$. Secondly, the
degree distribution $P(k)$ of a network is defined as the probability that a randomly picked node
$i$ has degree $k$. Note that by definition the average degree is $\langle k \rangle = 2E/N$. Thirdly, the clustering
coefficient is defined as
$$C = \frac{1}{N} \sum_{i=1}^{N} \frac{\sum_{j,l} a_{ij} a_{jl} a_{li}}{k_i (k_i - 1)}.$$
For each node i the number of links between the neighbors of i is divided by the maximum possible
number of such links; the clustering coefficient is the average of this quantity over all nodes.
Finally, the number of connected components is the number of maximal groups of nodes such that
any two nodes within a group are joined by a path and there is no link between different groups; a
related measure is the size of the giant
component (or largest connected component) $G$. When used in this thesis, network measures will
usually be normalized if this is not already true by definition (e.g. $G$ is divided by $N$ such that
$G' := G/N = 1$ when the network has only one connected component).
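The network measures defined above can be sketched directly from an adjacency matrix. The function names and the five-node example are illustrative assumptions. Since the clustering formula is undefined for nodes with degree below two, such nodes are assumed here to contribute zero.

```python
def link_density(A):
    """Number of links divided by the maximum possible number N(N-1)/2."""
    N = len(A)
    E = sum(A[i][j] for i in range(N) for j in range(i + 1, N))
    return E / (N * (N - 1) / 2)


def clustering_coefficient(A):
    """Average over nodes of sum_{j,l} a_ij a_jl a_li / (k_i (k_i - 1)).

    Assumption: nodes with degree below two contribute zero.
    """
    N = len(A)
    total = 0.0
    for i in range(N):
        k_i = sum(A[i])
        if k_i < 2:
            continue
        closed = sum(A[i][j] * A[j][l] * A[l][i]
                     for j in range(N) for l in range(N))
        total += closed / (k_i * (k_i - 1))
    return total / N


def giant_component_fraction(A):
    """Normalized size G/N of the largest connected component."""
    N = len(A)
    seen, best = set(), 0
    for start in range(N):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                      # depth-first search from `start`
            u = stack.pop()
            size += 1
            for v in range(N):
                if A[u][v] and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / N


# Hypothetical example: triangle 0-1-2, node 3 attached to node 2, node 4 isolated.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
L_density = link_density(A)               # 4 links out of 10 possible
C = clustering_coefficient(A)             # (1 + 1 + 1/3 + 0 + 0) / 5
G_norm = giant_component_fraction(A)      # 4 of 5 nodes
```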
One of the simplest types of network is a regular lattice, as used in the original Axelrod
model. Another network model is the Random Graph (RG), which will be used later on and is
generated as follows. Each pair of nodes $(i, j)$ shares a link with probability $p$, independently
of all other pairs. Such a graph has several properties. We have that $\langle E \rangle = N(N-1)p/2$,
so $\langle k \rangle = (N - 1)p$. In addition, when $\langle k \rangle$ is larger than one, the graph will have a giant connected
component containing a finite fraction of the nodes, but when $\langle k \rangle$ is smaller than one, it will consist of many small subgraphs. Note that this
result holds for $N \to \infty$ and is approximately true for large $N$. Furthermore, $\langle C \rangle = p$ and its
degree distribution is approximately the Poisson distribution with parameter 〈k〉. This means that
the random graph has a narrow degree distribution (i.e. most nodes have ki ≈ 〈k〉). Graphs that
have a large clustering coefficient are called small-world graphs, while graphs that have a fat-tailed
(i.e. broad) degree distribution are called scale-free graphs. A typical measure of the extent to
which a degree distribution is fat-tailed is the kurtosis (or scaled, centered fourth moment) of
this distribution. A kurtosis larger than three typically indicates fat tails. Small-world and scale-
free graphs are typically referred to as complex networks and many real networks are complex.
Specifically, most social networks are small-world, although it is not clear whether they are typically
scale-free [Klemm et al., 2003b].
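As an illustration of the random graph construction, the sketch below links each pair of nodes independently with probability $p$ and compares the observed average degree with $(N - 1)p$. The function name, the seed and the parameter values are assumptions made for this example only.

```python
import random


def random_graph(N, p, rng):
    """Adjacency matrix of a random graph: each pair is linked with probability p."""
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:
                A[i][j] = A[j][i] = 1     # undirected, no self-links
    return A


rng = random.Random(42)                   # fixed seed, chosen arbitrarily
N, p = 400, 0.02                          # hypothetical values, (N - 1) p = 7.98
A = random_graph(N, p, rng)
degrees = [sum(row) for row in A]
mean_k = sum(degrees) / N                 # expected to be close to (N - 1) * p
var_k = sum((k - mean_k) ** 2 for k in degrees) / N
```

For a degree distribution that is approximately Poisson, the variance `var_k` is expected to be of the same order as `mean_k`, which reflects the narrow degree distribution mentioned above.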
In line with the general interest in (complex) networks from the statistical physics community
[Albert and Barabasi, 2002], there has been some work studying the Axelrod model on RGs,
small-world networks and scale-free networks [Guerra et al., 2010, Klemm et al., 2003b]. Typically,
it is found that these types of network facilitate cultural convergence compared to regular lattices
(while keeping the number of links fixed), for distinct reasons. As the amount of randomness is
larger in the case of an RG compared to a lattice, there is a larger probability that traits spread. A
larger clustering coefficient means that networks are better connected locally which will facilitate
the Axelrod dynamics (which consists of local interactions); clearly the clustering coefficient should
not be too large, since otherwise different clusters will become distinct cultural regions, enhancing
cultural diversity. Finally, when a network has a fat-tailed degree distribution there are some
nodes, called hubs, with many links that are efficient in the spreading of traits. Note that the
results for the small-world and scale-free networks depend on the specific network models used.
2.2.2 Other extensions
Three other extensions will also be important. First, the principle of Bounded Confidence (BC)
has been used in the Axelrod model [Flache and Macy, 2008, De Sanctis and Galla, 2009]. In the
original Axelrod model it was assumed that agents could interact if they had positive cultural
overlap. However, in reality agents may only interact when the overlap exceeds a threshold θ.
Note that if θ = 0 the original Axelrod model is recovered. Such a parameter may very well
be specific to the individual, so this represents a simplification. However, part of an individual’s
level of trust may be caused by certain macro events, so that this is similar to other agents. In
addition, it appears that BC reduces cultural convergence and makes the model immune to cultural
drift [De Sanctis and Galla, 2009]. Finally, the threshold may be used to define a cultural graph,
as follows: for each pair of nodes $i$ and $j$, $a_{ij} = 1$ if $o_{ij} > \theta$ and $a_{ij} = 0$ otherwise [Valori et al.,
2011].
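The construction of the cultural graph from the threshold can be sketched as follows. Nominal features are assumed, and the function name and the example culture are hypothetical.

```python
def cultural_graph(culture, theta):
    """Adjacency matrix with a_ij = 1 if the overlap o_ij exceeds theta."""
    N = len(culture)
    F = len(culture[0])
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            overlap = sum(a == b for a, b in zip(culture[i], culture[j])) / F
            if overlap > theta:           # strict inequality, as in the text
                A[i][j] = A[j][i] = 1
    return A


def n_links(A):
    return sum(sum(row) for row in A) // 2


# Hypothetical culture with N = 4 agents and F = 4 features.
culture = [[1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 2, 2], [3, 3, 3, 3]]
links_by_theta = {theta: n_links(cultural_graph(culture, theta))
                  for theta in (0.0, 0.5, 0.75)}
```

Raising the threshold removes links from the cultural graph, so the number of cultural components can only stay the same or increase.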
Since the interaction probability depends on the cultural distance between two neighbors and
the confidence level θ only enters the model through this probability, the effect of BC (i.e. a
higher θ) on the Axelrod model is that there are more cultural components (that is, connected
components in the cultural graph) at t = 0 and typically there will be more cultural zones as well
(where the cultural zone is generalized to BC).
Secondly, co-evolution of network and agents (i.e. dynamical networks) has been implemented
in the context of the Axelrod model [Pfau et al., 2013, Centola et al., 2007]. It has been shown
that this mechanism stabilizes the dynamics under cultural drift under some conditions [Centola
et al., 2007], although the updating rule is different from the one that will be used in this thesis.
Thirdly, in the Axelrod model the dynamics start with an initial culture v(0) that is completely
random, but in reality cultures can be more complicated [Valori et al., 2011]. It is possible to run
the Axelrod model on these realistic cultures [Stivala et al., 2014, Valori et al., 2011] or generate
them artificially [Stivala et al., 2014], yielding so-called structured cultures. When using empirical data, BC
must be incorporated in the Axelrod model, and it then becomes interesting to study the
phase transition for varying θ (it is impossible to vary q in the case of empirical data). It is
typically observed that the phase transition in terms of θ is much less steep for realistic cultures
than for random cultural spaces, for which it is discontinuous (at least for large F, similar to the q-phase
transition when θ = 0).
Initializing the Axelrod model with a realistic culture, one might object that such a culture
should be the end-result of the Axelrod model, since a realistic culture is the result of a long term
phenomenon and the Axelrod model is long term as well. If the Axelrod model results in diversity
or convergence, then the resulting culture will most likely not be completely consistent with such
a realistic culture. However, one must realize that in reality there are additional processes such
as a growing population and some additional features (which are not incorporated in the specific
model at hand) that have other effects. Such effects would then explain why an empirically realistic
culture is different from the end-state.
2.3 Social multiplex
In most models of cultural dynamics, agents interact via a specific interaction network (i.e. they are
located on a network and can only interact with neighbors). Such an interaction network should be
a social network since the process of social influence takes place via social ties, which are then the
links. In reality, a social environment can often be subdivided into several distinct social networks
(e.g. work, sports, family), which is termed social multiplexity. Multiple self theory lends support
to the assertion that agents’ behavior is dependent on their social environment [McConnell, 2011].
In addition, using the full (or aggregate) social network when social multiplexity is present, leads
to inaccurate results for dynamic processes [Cozzo et al., 2013].
More generally, there has been an increased interest in the study of multilayer networks, which
are collections of networks where the nodes in each layer are associated to nodes in other layers
[Boccaletti et al., 2014]. An example of this in the context of opinion models is [Quattrociocchi
et al., 2014]. A multiplex network is a special case of a multilayer network, where each node is
only associated to its counterparts in the other layers (i.e. for practical purposes, the nodes are the
same in each layer). A model of cultural transmission has been studied in this setting [Palchykov
et al., 2014].
Formally, a multiplex is represented by a multigraph $G$, which is an ordered pair $(V, E)$,
where $V$ is as before and $E$ is a collection $\{G_1, \ldots, G_M\}$ of edge sets, each consisting of unordered pairs of nodes (i.e. there are
$M$ layers). Again, this can be represented using adjacency matrices, with one
adjacency matrix $A^\alpha$ for each layer number $\alpha$. Each cultural agent/actor can now be associated to
a node in this multigraph. Whether two agents have a social tie then depends on which edge set, or layer, $G_\alpha$ one looks
at.
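One possible way to represent such a multigraph in code is as a list of edge sets, one per layer. The sketch below is illustrative only: the function names, the use of `frozenset` for unordered pairs and the two-layer example are assumptions.

```python
def has_tie(layers, alpha, i, j):
    """True if agents i and j are linked in layer alpha."""
    return frozenset((i, j)) in layers[alpha]


def aggregate(layers):
    """Aggregate network: a pair is linked if it is linked in at least one layer."""
    return set().union(*layers)


# Hypothetical multiplex with N = 4 agents and M = 2 layers (e.g. work and family).
layers = [
    {frozenset((0, 1)), frozenset((1, 2))},   # layer 0
    {frozenset((0, 1)), frozenset((2, 3))},   # layer 1
]
tie_in_layer_0 = has_tie(layers, 0, 1, 2)
tie_in_layer_1 = has_tie(layers, 1, 1, 2)
n_aggregate_links = len(aggregate(layers))
```

Agents 1 and 2 have a social tie in layer 0 but not in layer 1, while the aggregate network merges both layers and loses this distinction.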
Thus far, we have not assumed any dependence between this multigraph and the cultural space
discussed earlier. With only one social network, it is clear that every cultural feature belongs to
that social space. However, this is not clear in the case of a social multigraph. Therefore, there
needs to be a correspondence between the set of features and the set of layers. That is, it should
be determined for each feature to what layer(s) it belongs. As an example consider an agent’s
preference for beer; intuitively, such a cultural feature should be associated to a social network
regarding friendship relationships. The combination of the culture and the social multigraph will
be referred to as the social multiplex.
3 The model
The model that will be used, also referred to as the generalized Axelrod model, has the property
that the social and cultural spaces are not one-to-one. Because of this, some additional notation
has to be introduced. First, let $\beta_\alpha$ be the subset of features that contributes to layer α (from
now on, we write layer α when we mean layer $G_\alpha$). Second, define a cultural subvector of an
agent i as the collection of traits $(v_i^k)_{k\in\beta}$, obtained by restricting to a specific set of features, β.
The dimension of the subspace is |β|. Then, the cultural distance in this subspace, or subcultural
distance, is defined as $d^\beta_{ij} := \sum_{k\in\beta} d^k(v_i^k, v_j^k)/|\beta|$. Finally, the subcultural overlap is
$o^\beta_{ij} = 1 - d^\beta_{ij}$.
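As a minimal sketch of these definitions, the subcultural distance and overlap can be computed as follows. The sketch assumes nominal traits, so that the per-feature distance is 0 for equal traits and 1 otherwise; the function names and the zero-based feature indices are illustrative choices and not part of the model definition.

```python
def subcultural_distance(v_i, v_j, beta):
    """Fraction of the features in beta on which two cultural vectors differ."""
    beta = list(beta)
    # Assumption: nominal traits, so d^k is 0 for equal traits and 1 otherwise.
    return sum(1 for k in beta if v_i[k] != v_j[k]) / len(beta)


def subcultural_overlap(v_i, v_j, beta):
    """Subcultural overlap o = 1 - d on the feature subset beta."""
    return 1.0 - subcultural_distance(v_i, v_j, beta)


# Hypothetical example with F = 4 features (indexed from 0).
v_i = [1, 2, 3, 4]
v_j = [1, 5, 3, 6]
full = subcultural_distance(v_i, v_j, range(4))    # distance on the full space
first_two = subcultural_overlap(v_i, v_j, [0, 1])  # overlap on beta = {0, 1}
```

Taking beta equal to the full feature set recovers the ordinary cultural distance.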
3.1 Network-culture assortativity
The social network in each layer may be determined independently of the initial culture.
Alternatively, one can let the social networks be generated by the cultural features associated to that
layer; in reality, an agent typically has social ties to agents that are culturally similar [Valori et al.,
2011]. More specifically, the probability of a link between nodes i and j is $p^\alpha_{ij} = f(o^{\beta_\alpha}_{ij})$ for some
increasing function f : I → I. Note that in the last case, the only input to the model (once the
system parameters are specified) is the culture. In the case of multiplex networks this is a very
convenient way of generating the networks; it solves the problem of having to handpick a specific
network for each layer, which would be arbitrary. Furthermore, assortativity is complementary to
the multiplex approach since each layer is associated to a subset of features, so that there is an
assumed relation between the social and cultural spaces.
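The generation of assortative layers can be sketched as below. This is an illustration under stated assumptions rather than the code used for the simulations: links are drawn independently for each unordered pair, features are indexed from 0, and the names `generate_layer` and `generate_multiplex` are hypothetical.

```python
import random


def overlap(v_i, v_j, beta):
    """Fraction of the features in beta on which two agents agree."""
    return sum(1 for k in beta if v_i[k] == v_j[k]) / len(beta)


def generate_layer(culture, beta, f, rng):
    """Symmetric 0/1 adjacency matrix with P(link i-j) = f(overlap on beta)."""
    n = len(culture)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < f(overlap(culture[i], culture[j], beta)):
                adj[i][j] = adj[j][i] = 1
    return adj


def generate_multiplex(culture, layers, f, rng):
    """One adjacency matrix per layer; layers[a] lists the features of layer a."""
    return [generate_layer(culture, beta, f, rng) for beta in layers]


# Hypothetical small example: N = 20 agents, F = 6 features, q = 3, M = 2.
rng = random.Random(1)
culture = [[rng.randint(1, 3) for _ in range(6)] for _ in range(20)]
layers = [[0, 1, 2], [3, 4, 5]]
multiplex = generate_multiplex(culture, layers, lambda o: o, rng)
```

Passing a constant function for f gives layers without assortativity, while the identity gives a link probability equal to the subcultural overlap.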
3.1.1 Properties of the network
If the network is generated using the initial culture, it is of interest to know what the structure
of this network is. Here, the focus is on one layer, so F corresponds to the dimension of the
subspace associated to that layer. First, assume that the generation process of the initial culture
is stochastic. Then, a priori, the cultural vector of agent i is a random vector $V_i$ that consists of
F random variables $V_i^k$. Then we get for the probability of having a link
$$p_{ij} = f\left(\frac{1}{F}\sum_{k=1}^{F} P(V_i^k = V_j^k)\right). \qquad (1)$$
Note that even if all Vi have the same distribution so that pij is independent of i and j, this does
not constitute a RG, as the probabilities for different links are not independent, hence the joint
probability of all the links together does not factorize. For example, if two nodes both have large
cultural overlap with a third node, they are likely to have a larger than average cultural overlap
with each other, so the clustering coefficient of the graph is expected to be higher than that of an
RG.
For collections of $V_i$ that have the same distribution, the argument of f in (1) can be
interpreted as the average cultural overlap between those agents after the culture has been generated.
Intuitively, the reason that the resulting (sub)graph is not a RG is that the cultural overlaps are
not exactly equal to this average, but fluctuate around it. Assuming certain regularity properties
of the culture generation process it holds that all cultural overlaps converge to this average as
F → ∞. Therefore, if F is large enough, the resulting graph will approximately be a RG. How large
F has to be depends on the specific generation process. Note that the condition that the random
variables $V_i^k$ are independent over k is sufficient but not necessary for the cultural overlaps to
converge.
3.2 Generalized Axelrod model
The generalized Axelrod model is similar to the original Axelrod model (which includes the fact
that the cultural features are of nominal type), but there are some modifications. Before any pair
of neighbors is selected randomly, a layer α is selected with probability 1/M. The interaction
probability is modified to $r_{ij} = o_{ij}\,\Theta(o_{ij} - \theta)$, where θ is a predetermined threshold and Θ(·) is the
Heaviside function, defined as Θ(x) = 0 if x ≤ 0 and 1 if x > 0 (Bounded Confidence). Furthermore,
interaction consists of the original agent copying one of the features in $\beta_\alpha$ of his neighbor, in which
the two differ. Finally, an optional feature of the model is that if the interaction is successful, the
original individual updates its links with respect to all other agents in that layer according to the
same rule that generated its links in that layer at t = 0 (see previous subsection). More specifically,
if agent i has a successful interaction in layer α, then agent i's links in layer α are deleted and with
probability $p^\alpha_{ij} = f(o^{\beta_\alpha}_{ij})$ it will have a link with agent j for all j ≠ i, where f is the same function
as used in the generation of the multiplex at t = 0. Note that updating of the network according to
this rule essentially implements network assortativity dynamically. Also, if updating is included,
the generalized Axelrod model operates on the entire social multiplex (i.e. the state-variable is the
social multiplex), instead of just the culture.
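A single update of the generalized Axelrod model, as described above, might be sketched as follows. Several details are assumptions made for illustration: the active agent is drawn uniformly, its neighbor is drawn uniformly among its neighbors in the selected layer, the copied feature is drawn uniformly among the differing features of that layer, and an attempt with no neighbor or no differing feature simply fails. The function name `axelrod_step` is hypothetical.

```python
import random


def overlap(v_i, v_j, features):
    """Fraction of the given features on which two agents agree."""
    return sum(1 for k in features if v_i[k] == v_j[k]) / len(features)


def axelrod_step(culture, adjs, layers, theta, rng, f=None):
    """One attempted interaction; returns True if a trait was copied.

    culture: list of trait lists; adjs[a]: adjacency matrix of layer a;
    layers[a]: features of layer a; f: link function, or None for static networks.
    """
    alpha = rng.randrange(len(layers))            # layer chosen with probability 1/M
    i = rng.randrange(len(culture))               # assumption: uniform active agent
    neighbours = [j for j, a in enumerate(adjs[alpha][i]) if a]
    if not neighbours:
        return False
    j = rng.choice(neighbours)
    o = overlap(culture[i], culture[j], range(len(culture[i])))  # full overlap
    # Bounded confidence: r_ij = o_ij if o_ij > theta, and 0 otherwise.
    if o <= theta or rng.random() >= o:
        return False
    differing = [k for k in layers[alpha] if culture[i][k] != culture[j][k]]
    if not differing:
        return False
    k = rng.choice(differing)
    culture[i][k] = culture[j][k]
    if f is not None:
        # Optional updating: redraw all links of agent i in layer alpha.
        for m in range(len(culture)):
            if m != i:
                p = f(overlap(culture[i], culture[m], layers[alpha]))
                adjs[alpha][i][m] = adjs[alpha][m][i] = 1 if rng.random() < p else 0
    return True
```

Note that the interaction probability uses the overlap on the full cultural space, whereas both the copied feature and the redrawn links are restricted to the selected layer.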
The updating of the social network makes sense, since the cultural dynamics are long term
and are therefore expected to be on the same time scale as social network formation. Also, the
specific updating rule is intuitive: if a node changes one of its features this is a significant event, so
it makes sense that an agent reevaluates its surroundings in the corresponding social environment.
Note that if a specific feature is associated to multiple layers, then it may be reasonable to update
the links of this node in all corresponding layers, since the agent has changed culturally in all
these layers. However, since the actual interaction took place in a specific layer, it seems that the
dynamic process should only apply to the social environment where the interaction took place.
The principle of homophily operates on the level of the full cultural space and social influence
only operates on the level of the social network, so that the generalized Axelrod model extends these
principles to a multiplex context. In contrast, the fact that links are generated (and updated) based
on the subcultural similarity might also be regarded as homophily, but this time it only regards
the cultural subspace. However, in the first case the object of change is the culture, while in the
second case the object of change is the social network.
From the dynamical rules it follows that one feature that causes coupling is the fact that
$r_{ij}$ depends on the full cultural distance; if it depended only on the subspace distance and
each feature mapped to only one layer, there would be M independent Axelrod models. In
addition, if one feature maps to multiple layers, this also introduces a natural coupling between
layers (strictly speaking, the object that associates features to layers is then a correspondence, since
it would not be well-defined as a mapping).
3.3 End-state
In the case of the original Axelrod model, the end-state (i.e. the absorbing state) is a state where
between adjacent nodes i and j the cultural overlap ($o_{ij}$) is either 1 or 0. In the case of BC
this translates into the same thing, but with $o_{ij} = 1$ or $o_{ij} \le \theta$. Clearly, there can be no more
exchange of cultural features in such a state. However, when there are multiple layers, the notion
of adjacency is different; two nodes can be considered adjacent if $a^\alpha_{ij} = 1$ for at least one α. If for i
and j, $d^{\beta_\alpha}_{ij} = 0$, there can be no interaction between them in layer α anymore. Therefore, an absorbing
state in the multiplex is characterized by the following: for all pairs of nodes (i, j) that have $o_{ij} > \theta$,
there should be no α such that $a^\alpha_{ij} = 1$ and $d^{\beta_\alpha}_{ij} > 0$.
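The absorbing-state condition can be checked directly from the culture and the layers. The following is a minimal sketch; the function name `is_absorbing` and the data layout (lists of traits, one adjacency matrix per layer, features indexed from 0) are assumptions for illustration.

```python
def overlap(v_i, v_j, features):
    """Fraction of the given features on which two agents agree."""
    return sum(1 for k in features if v_i[k] == v_j[k]) / len(features)


def is_absorbing(culture, adjs, layers, theta):
    """True if no pair with full overlap above theta is linked in a layer
    in which the two agents still differ."""
    n = len(culture)
    all_features = range(len(culture[0]))
    for i in range(n):
        for j in range(i + 1, n):
            if overlap(culture[i], culture[j], all_features) <= theta:
                continue  # bounded confidence forbids any interaction
            for alpha, beta in enumerate(layers):
                differs = any(culture[i][k] != culture[j][k] for k in beta)
                if adjs[alpha][i][j] and differs:
                    return False  # an interaction is still possible here
    return True
```

Such a test can serve as the stopping criterion of a run.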
4 Simulation set-up
Before delving into the specifics, some general remarks are in order regarding the simulations.
The approach to studying the Axelrod model is intended to be empirically realistic, which has several
implications. First, it makes no sense to change q as if it were a control parameter, like temperature,
because q is a property of the system. The same holds for F. Of course, it is still interesting to look at
different values of q and F, but not at a phase transition with respect to these parameters. Second,
the control parameter θ can be viewed as an inverse measure of trust or confidence in a society;
the larger the value of θ the less confidence there is. It may therefore be more convenient to look
at the control parameter ω = 1− θ, which can be viewed as the (normalized) level of confidence in
a society.
The critical threshold of a phase transition, ωc, can be defined as the ω such that the standard
error or standard deviation of the order parameter averaged over different runs is largest [Klemm
et al., 2003b]. In addition, the Cluster Size Entropy (CSE), which is defined in Subsection 4.2, is
typically largest at ωc, so that this measure defines ωc as well. Usually, the two will agree, but
when they do not, emphasis is on the CSE.
It is also more interesting to look at the order parameter as a function of confidence ω for
other reasons. Changing the value of ω from 0 to 1 has the property that when ω = 0, there will
always be full cultural divergence. Furthermore, if q is such that for the normal Axelrod model
(i.e. ω = 1) there is cultural convergence, changing ω from 0 to 1 means going from cultural
diversity to cultural convergence. Note that for small q this is usually the case and that small q
corresponds to a realistic cultural system. Also note that q is the main factor in determining the
network topology in the case of network assortativity as is explained in Subsection 3.1.1, so that
varying q would have a double effect on the model, which is undesirable.
Similarly, most realistic systems have large F . Therefore, we will generally look at systems
with small q, large F and changing ω. As the threshold is always compared to the cultural overlap
and there are only F + 1 possible values for the overlap, there are only F relevant values for ω to
study. The disjoint sets within which all values of ω are equivalent are [0, 1/F], (1/F, 2/F], ...,
((F − 2)/F, (F − 1)/F], ((F − 1)/F, 1]. Each set is referred to as an equivalence class and an element
from such a class is a representative. We will typically use the midpoint value as a representative,
rounded to two decimal places.
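The representatives described above can be listed with a few lines of code. This is only a sketch: the function name is hypothetical, and Python's built-in rounding is applied to binary floating-point numbers, so a midpoint that falls exactly halfway between two two-decimal values (for example 0.125) may round differently from a manual calculation.

```python
def threshold_representatives(F):
    """Midpoints of the F equivalence classes of omega, rounded to two decimals.

    Class k (k = 1, ..., F) has endpoints (k - 1)/F and k/F, so its midpoint
    is (2k - 1)/(2F).
    """
    return [round((2 * k - 1) / (2 * F), 2) for k in range(1, F + 1)]


representatives = threshold_representatives(36)  # F = 36 as in the simulations
```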
A small number of agents (N = 100) is used, since the complexity of the model means running
times are long. Furthermore, the correspondence between features and layers will be a symmetric
mapping, which means that if there are M layers, there are M · Z features for some positive integer
Z and Z features map to each layer. In terms of the cultural parameters, we take (F, q) = (36, 6)
for all simulations. The value of F is convenient since it is large enough to be empirically realistic,
has nice numerical properties since there are 9 pairs of (M,Z) that multiply to make 36 and it
is not too large (which would cause running times to be even longer). The value of q is chosen
to be small, so that for ω = 1, the model reaches cultural convergence and a phase transition
exists. Finally, note that some of the simulations have also been done using other, similar values of
(N,F, q) but the results were not qualitatively different from the results obtained using the original
parameter values.
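The admissible pairs (M, Z) and a symmetric mapping from features to layers can be enumerated as in the sketch below. The assignment of features to layers in contiguous blocks is an assumption made for illustration; any partition of the features into M sets of size Z would be equally symmetric. Both function names are hypothetical.

```python
def layer_feature_pairs(F):
    """All pairs (M, Z) of positive integers with M * Z = F."""
    return [(M, F // M) for M in range(1, F + 1) if F % M == 0]


def symmetric_mapping(M, Z):
    """Feature indices of each layer: Z features per layer.

    Assumption: features are assigned to layers in contiguous blocks.
    """
    return [list(range(a * Z, (a + 1) * Z)) for a in range(M)]


pairs = layer_feature_pairs(36)     # the nine (M, Z) pairs for F = 36
layers = symmetric_mapping(2, 18)   # M = 2 layers with Z = 18 features each
```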
4.1 Routes of investigation
Even when (N,F, q) is fixed, in addition to the mapping from features to layers, there are still
many degrees of freedom for which we can study the phase transition in ω. First, one can vary
the number of layers. Second, the function f that determines the connection probability has not
been determined. Third, the network can be static or dynamic. Finally, the initial culture can be
generated in multiple ways.
As increasing the number of layers increases the multiplexity of the system, this lies at the
heart of the investigation and will be done in every treatment. In terms of the remaining degrees
of freedom, we distinguish between the following treatments per degree of freedom: $f(o^{\beta_\alpha}_{ij}) = p$,
where p ∈ [0, 1], versus $f(o^{\beta_\alpha}_{ij}) = o^{\beta_\alpha}_{ij}$; random culture versus structured culture; and no updating
networks versus updating networks. This would give rise to eight treatments in total, but only
four of these are investigated. The simplest treatment is the one where the networks are generated
by the RG algorithm, the culture is random and the network does not update. In the second
treatment we do the same except we add assortativity. In the third treatment, we do the same
as in the second except that networks are allowed to update. Finally, the last treatment is the
opposite of the first. In this way, the complexity increases at every step.
With respect to the outcome of the generalized Axelrod model, several observations have to be
made. First, the system can be studied by looking only at the end-state or by explicitly looking at
the dynamics, both of which will be done and are explained in the next two subsections. Second,
the dimensionality of the system is so high that the only way to study it is to study clusterings
(e.g. cultural regions) and, more specifically, the vector of cluster sizes. Indeed, it is not important
what values the specific traits in each cultural vector in each cluster have, nor which vector is in
which cluster, since these details provide no information regarding cultural convergence/divergence.
Next, it will be discussed what type of clusterings will be used (both in the end-state and dynamical
analysis).
4.1.1 Clusterings of the culture: global and layer dependent
When analyzing the generalized Axelrod model, some clusterings give important information
regarding the structure of the culture. On the global level (i.e. the level of the social multiplex)
it is not convenient to take into account which agents are connected to each other since this
information is encoded at the layer level. Below, both global and layer dependent clusterings are
defined for an arbitrary pair (i, j). (Note again that when we say a property has to hold between
pairs of nodes, we mean that between each pair of nodes in the cluster there is a path such that
each consecutive pair on the path has the property.)
Global clusterings
• cultural domain: $d_{ij} = 0$. The cultural domain could theoretically consist of multiple
collections of nodes that are not linked within any of the social networks. However, for most
parameter values this event is extremely unlikely (especially if there is network updating).
• cultural component: $d_{ij} < \omega$. The cultural component can be seen as clusters of nodes that
have the possibility (if there were a link in the appropriate layer) to interact and converge
culturally; with network updating, such an interaction will most likely become possible over
time in any of the layers. Note that the cultural component really is a connected component
with respect to the cultural graph.
Layer dependent clusterings
• cultural region α: $d^{\beta_\alpha}_{ij} = 0$ and $a^\alpha_{ij} = 1$. This is a direct analogue of the cultural region
defined for the original Axelrod model.
• cultural zone α: $d_{ij} < \omega$ and $a^\alpha_{ij} = 1$. This is a direct analogue of the cultural zone defined
for the original Axelrod model; note that the cultural distance should depend on the full
cultural space, since interaction depends on the full space. When the number of cultural
regions equals the number of cultural zones in all layers, the system will be in its end-state,
since no more dynamics can take place.
It is important to note that each cultural component will always encapsulate at least one cultural
zone in a specific layer, since cultural zones have a stronger requirement. The extra requirement
$a^\alpha_{ij} = 1$ can only increase the number of cultural zones per cultural component (if there was no such
requirement there would trivially be one cultural zone per cultural component, since the two would
be the same). One implication that follows from this is that, in each layer, there will always be at
least as many cultural zones as cultural components in total.
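The four clusterings defined above are all connected components of a graph whose edges are the pairs satisfying the stated property, so they can be computed with one generic routine. The sketch below uses a simple union-find structure; all function names are hypothetical, and features and agents are indexed from 0.

```python
def components(n, linked):
    """Connected components of the graph on range(n) with edges where linked(i, j)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if linked(i, j):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def distance(v_i, v_j, features):
    """Fraction of the given features on which two agents differ."""
    return sum(1 for k in features if v_i[k] != v_j[k]) / len(features)


def cultural_domains(culture):
    F = range(len(culture[0]))
    return components(len(culture), lambda i, j: distance(culture[i], culture[j], F) == 0)


def cultural_components(culture, omega):
    F = range(len(culture[0]))
    return components(len(culture), lambda i, j: distance(culture[i], culture[j], F) < omega)


def cultural_regions(culture, adj, beta):
    return components(len(culture), lambda i, j: adj[i][j] == 1
                      and distance(culture[i], culture[j], beta) == 0)


def cultural_zones(culture, adj, omega):
    F = range(len(culture[0]))
    return components(len(culture), lambda i, j: adj[i][j] == 1
                      and distance(culture[i], culture[j], F) < omega)
```

Since the edges of the zone graph of a layer form a subset of the edges of the cultural graph, the number of zones in a layer returned by this sketch is never smaller than the number of cultural components.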
4.1.2 Random initial culture
In this subsubsection some aspects of the random initial culture with regards to the network
structure are discussed in more detail; the next discusses the structured culture used in the last
treatment. In the case of the random culture, a priori each cultural vector $V_i$ is a vector of F
independent random variables, each uniformly distributed on {1, ..., q}. Then, using the
results from 3.1.1 with f(o) = o, we have that the probability of having a link is $p_{ij} = f(1/q) = 1/q$.
Therefore, the resulting graph will have a relatively simple structure. However, it is not clear how
large F must be for the network to be like an RG.
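The value 1/q for the expected overlap can be checked numerically. The sketch below generates a random culture and estimates the mean pairwise overlap; the function names and the seed are arbitrary illustrative choices.

```python
import random
from itertools import combinations


def random_culture(N, F, q, rng):
    """N cultural vectors of F independent traits, uniform on {1, ..., q}."""
    return [[rng.randint(1, q) for _ in range(F)] for _ in range(N)]


def mean_pairwise_overlap(culture):
    """Average overlap over all unordered pairs of agents."""
    F = len(culture[0])
    overlaps = [sum(a == b for a, b in zip(u, v)) / F
                for u, v in combinations(culture, 2)]
    return sum(overlaps) / len(overlaps)


rng = random.Random(42)
culture = random_culture(100, 36, 6, rng)   # (N, F, q) as in the simulations
estimate = mean_pairwise_overlap(culture)   # expected to be close to 1/q
```

The individual overlaps fluctuate around this mean, and it is these fluctuations that make the assortative graph differ from an RG when the number of features per layer is small.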
We simulated the network generation process for many values of the parameters (N,F, q) (note
that here F is the dimension of the subspace associated to a layer, which means that F = Z).
In general, the resulting graph has properties (i.e. link density, degree distribution, clustering
coefficient, connected component) that closely resemble those of the RG when Z is large. When Z
is small the resulting graph has a large clustering coefficient. Both observations are in agreement
with the discussion in 3.1.1. For (N,Z, q) = (100, 36, 6), the properties already closely resemble
those of an RG. Note, however, that as M increases Z decreases, so the F corresponding to each
layer becomes smaller. When there are many layers, Z is small, so the networks associated to each
layer will have a relatively large clustering coefficient. Therefore, when assortativity is present, M
has two effects on the dynamics: it influences the Axelrod dynamics and it affects the underlying
topology of the network. Also, it was found that when Z > 1 (N = 100, q = 6) the graph is
connected (i.e. G = 1). For Z = 1, this is not the case, but this corresponds to M = 36 and as
we show later this case shows trivial dynamical behavior when there is assortativity. Finally, N
should be large relative to q to obtain a connected graph, which is similar to the RG case, where
Np should be much larger than 1 to obtain the result.
4.1.3 Structured initial culture
The type of structured culture used in this thesis is based on the prototype evolution algorithm
in [Stivala et al., 2014], which initializes the culture by generating the cultural vectors around a few
fundamental cultural vectors, called prototypes. This algorithm is inspired by theories from social
science which postulate that most individuals fall in a certain cultural category and the prototypes
are then the most typical members of such a category (note that a prototype does not need to be
an actual individual).
Specifically, in each layer two prototypes are generated by the requirement that they both
have f cultural traits in common with a third superprototype (which is just a randomly generated
cultural vector), while the remaining traits are generated randomly. Around each prototype N/2
agents are generated by letting each agent have g traits in common with the prototype while
the rest is generated randomly. Letting b = g/Z , they will on average have an overlap with the
prototype of o = b+(1−b)/q. Therefore, this algorithm will produce two spherical shells of average
radius r′ = 1− o (while the radius of the outer sphere is r = 1− b). Note that the prototypes are
not included in the initial culture.
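The generation of a structured culture in a single layer, as described above, can be sketched as follows. Two details are assumptions: the positions of the traits copied from the superprototype (or from a prototype) are chosen uniformly at random, and the remaining traits are drawn uniformly from {1, ..., q}, so they may coincide with the template by chance. The function names are hypothetical.

```python
import random


def copy_with_noise(template, n_copied, q, rng):
    """Copy n_copied randomly chosen traits from template; draw the rest at random."""
    Z = len(template)
    kept = set(rng.sample(range(Z), n_copied))  # assumption: random positions
    return [template[k] if k in kept else rng.randint(1, q) for k in range(Z)]


def structured_layer_culture(N, Z, q, f, g, rng):
    """Two prototypes around a superprototype, and N/2 agents around each prototype."""
    superprototype = [rng.randint(1, q) for _ in range(Z)]
    prototypes = [copy_with_noise(superprototype, f, q, rng) for _ in range(2)]
    agents, shell = [], []
    for s, prototype in enumerate(prototypes):
        for _ in range(N // 2):
            agents.append(copy_with_noise(prototype, g, q, rng))
            shell.append(s)  # index of the shell the agent belongs to
    return prototypes, agents, shell


rng = random.Random(7)
prototypes, agents, shell = structured_layer_culture(100, 36, 6, 0, 28, rng)
# Average overlap of the agents with their own prototype; expected b + (1 - b)/q.
mean_overlap = sum(
    sum(a == p for a, p in zip(agent, prototypes[s])) / 36
    for agent, s in zip(agents, shell)
) / len(agents)
```

The prototypes are returned separately and are not part of the list of agents, in line with the description above.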
To compute the distance between the two prototypes (i.e. between the centers of the shells),
R, let first B = f/Z. For each cultural feature three things can happen: both prototypes have
cultural traits obtained from the superprototype (so they are equal to each other); only one of the
prototypes has a trait obtained from the superprototype; and both prototypes do not have traits
obtained from the superprototype. Properly accounting for all the probabilities this means that
for each cultural trait separately the probability that the two prototypes have equal trait is
$$O = B^2 + 2B(1-B)/q + (1-B)^2/q = B^2 + (1-B^2)/q. \qquad (2)$$
Averaging over all cultural features gives that the average overlap between the prototypes is exactly
the same expression, again denoted by O. If this is used as an estimate for the actual overlap it
should be taken into account that it is the average overlap and especially for small Z there will be
large fluctuations. For large Z the difference will be small; simulations confirm this. Since in our
case Z will typically be large, this is a good approximation. The distance between the spherical
shells will then (on average) be R = 1 − O. In addition, it should be noted that if B = 1, only
one spherical shell will be created with N agents, since the prototypes are the same in that case.
On the other hand, if b = 0, then the algorithm is the same as the algorithm that generates the
random culture. However, even though it is a special case, the random culture algorithm can in
principle generate any of the structured cultures (typically, however, the probability of generating
one with realistic values of B and b is very small if F , q and N are non-trivial).
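The simplification in (2) can be verified with exact rational arithmetic. In the sketch below the three-case sum and the closed form are evaluated separately and compared; the function name is hypothetical.

```python
from fractions import Fraction


def prototype_overlap(B, q):
    """Average overlap O between the two prototypes, following equation (2)."""
    B = Fraction(B)
    # both traits copied / exactly one copied / neither copied from the superprototype
    three_cases = B**2 + 2 * B * (1 - B) / q + (1 - B) ** 2 / q
    closed_form = B**2 + (1 - B**2) / q
    assert three_cases == closed_form
    return closed_form


O = prototype_overlap(Fraction(0, 36), 6)  # B = f/Z with f = 0, Z = 36
R = 1 - O                                   # average distance between the prototypes
```

For B = 0 the two prototypes are independent random vectors, so that O = 1/q, and for B = 1 they coincide, so that O = 1.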
An expression similar to (2) can also be derived for the average distance between two subcul-
tural vectors generated around the same prototype, by just replacing B by b in that equation. This
can give important information on the network structure of the nodes around each shell. Using the
results from 3.1.1 it follows that pij = b+ (1− b)/q = 1/q + (1− 1/q)b (note that this expression
reduces to the one for a random culture if b = 0, as expected).
For the following, some notation has to be introduced. Since the above discussion was about a
single layer, the spherical shell (or shell) for one layer will from now on be referred to as a subshell.
Denote by $G^\alpha_i$ subshell i in layer α, where i ∈ {1, 2}. The prototype around which this subshell
is generated will then be referred to as the subprototype. In addition, the subcultural graph in
layer α is defined as the cultural graph with respect to the corresponding subspace (i.e. with $d_{ij}$
replaced by $d^{\beta_\alpha}_{ij}$).
Now, the question becomes how to select f and g. The most diverse behavior occurs when
for some values of ω, there are two subcultural components (that is, if there was only one layer no
interaction would occur between agents from the two subshells). In addition, the radius r should
not be too small, since otherwise there would be trivial dynamical behavior in each subshell (in
the most extreme scenario both subshells would consist of N/2 identical subprototypes). Finally,
the average distance between subcultural vectors in each subshell should be equal to the distance
between the subshells themselves. This should be true by construction, except for the fact that
there is a maximal distance in the cultural space, namely 1.
These three requirements are surely fulfilled if the following conditions are satisfied:
$$R \ge 4r + \delta; \qquad r \ge \varepsilon; \qquad R + 2r \le 1, \qquad (3)$$
where δ controls how many threshold values should exhibit the required behavior outlined in the
previous paragraph and ε determines how non-trivial the dynamics within the subshells will be.
The parameter r is set directly by g, while R is set randomly by f (since R is the average distance
between the prototypes), so some margin should be included in setting R. In practice (i.e. for
significant values of δ and ε and with Z ≤ 36), the conditions in (3) yield no feasible pair (f, g),
even if one replaces r by r′. However, these conditions are based on extremely small probabilities.
For example, the probability that two cultural subvectors are generated in the two subshells that
have the smallest distance possible (i.e. ‘lie on the line between the subprototypes’) and also
have the highest distance possible to any other agent within their respective shells, is extremely
small. Therefore some simulations were performed that computed the following quantities for each
generated structured culture:
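$$d_{\min} = \min_{i\in G^\alpha_1,\, j\in G^\alpha_2} d^{\beta_\alpha}_{ij}; \qquad d_{\max} = \max_{i\in G^\alpha_1,\, j\in G^\alpha_2} d^{\beta_\alpha}_{ij}; \qquad d^k_{\min} = \min_{i\in G^\alpha_k,\, j\in G^\alpha_k} d^{\beta_\alpha}_{ij}$$
and it should at least hold that $d_{\min} \ge d^k_{\min} + \delta$ for k = 1, 2 and $d_{\max} \le 1$. Based on the results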
of these simulations, many combinations of (f, g) were feasible most of the time (recall that the
generation process is stochastic). Some additional constraints are due to the fact that there are
multiple layers, which is discussed next.
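The three quantities above can be computed for a generated culture as in the following sketch, which follows the definitions as stated. Two points are assumptions: the within-shell minimum is taken over pairs of distinct agents only (otherwise it would trivially be zero), and the shells are labelled 0 and 1. The function name is hypothetical.

```python
from itertools import combinations


def shell_distance_stats(agents, shell):
    """Return (d_min, d_max, within), where within[k] is the minimal distance
    between two distinct agents of shell k, for one layer.

    agents: subcultural vectors of the layer; shell: shell label (0 or 1) per agent.
    """
    def dist(u, v):
        return sum(a != b for a, b in zip(u, v)) / len(u)

    groups = {k: [v for v, s in zip(agents, shell) if s == k] for k in (0, 1)}
    between = [dist(u, v) for u in groups[0] for v in groups[1]]
    # Assumption: i and j are distinct in the within-shell minimum.
    within = {k: min(dist(u, v) for u, v in combinations(groups[k], 2)) for k in (0, 1)}
    return min(between), max(between), within


def is_feasible(agents, shell, delta):
    """Check d_min >= d^k_min + delta for both shells."""
    d_min, _, within = shell_distance_stats(agents, shell)
    return all(d_min >= within[k] + delta for k in (0, 1))
```

Because the generation process is stochastic, such a check has to be repeated over many generated cultures for each candidate pair (f, g).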
In the case of a multiplex, there are cultural subspaces associated to each layer, so it becomes
necessary to match the cultural subvectors in each layer. The straightforward way to do this is to
have a mapping between subprototypes across layers. In effect one then has multiple prototypes,
each composed of different subprototypes and the collection of agents associated to each prototype
depends on this mapping. In general there will be $2^M$ such prototypes. For more than two layers,
the situation becomes increasingly complex, so to keep things simple we will only use M = 1 and
M = 2 for the treatment with realistic cultures. When M = 1, there are simply two subprototypes,
which are also the prototypes.
Now, consider the case M = 2. In one extreme case, subprototype 1 in layer 1 would be
matched to subprototype 1 in layer 2 (i.e. the subcultural vectors that are generated around
subprototype 1 in layer 1 are matched to those generated around subprototype 1 in layer 2) and
subprototype 2 in layer 1 is matched to subprototype 2 in layer 2 (or vice versa), which corresponds
to no mixing. In the other extreme case, one will have that only half of the subcultural vectors
generated around subprototype 1 in layer 1 will correspond to the subcultural vectors around
subprototype 1 in layer 2, while the other half will correspond to the subcultural vectors around
subprototype 2 in layer 2 (and similarly for the other half of the subcultural vectors associated to the
subprototypes).
Letting Πij be the prototype with subprototype i in layer 1 and j in layer 2, this scenario
would divide the cultural vectors in four shells (cultural vectors around Π11, Π12, Π21 and Π22
respectively) of size N/4, and corresponds to the case of full mixing. Intermediate cases would also
divide the cultural vectors in these classes, but then those of Π11 and Π22 (non-mixed prototypes)
have size N(2− µ)/4, while the others (mixed prototypes) have size Nµ/4, where µ ∈ [0, 1] is the
mixing coefficient; if µ = 0, there is no mixing, while µ = 1 corresponds to full mixing. Denote
by $G_{ij}$ the shell associated to $\Pi_{ij}$, while $G^\alpha_{ij}$ is used for the same shell when restricting to the
subcultural vectors associated to layer α only. In the sequel, $G_{11}$ and $G_{22}$ will be referred to as
non-mixed shells, while $G_{12}$ and $G_{21}$ will be called mixed shells, for obvious reasons. Finally, note
that even though positive µ could consistently be used with M = 1, this has no relevance, so in
the case M = 1, it always holds that µ = 0.
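The shell sizes as a function of the mixing coefficient can be tabulated as in the sketch below. The function name is hypothetical, and the sizes are returned as real numbers; how non-integer sizes are rounded to whole agents is not specified here and is left open.

```python
def shell_sizes(N, mu):
    """Sizes of the four shells for M = 2, keyed by (subprototype in layer 1,
    subprototype in layer 2)."""
    non_mixed = N * (2 - mu) / 4
    mixed = N * mu / 4
    return {(1, 1): non_mixed, (2, 2): non_mixed, (1, 2): mixed, (2, 1): mixed}


no_mixing = shell_sizes(100, 0.0)    # two shells of size N/2
full_mixing = shell_sizes(100, 1.0)  # four shells of size N/4
```

For every value of the mixing coefficient the four sizes add up to N.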
Since the values of f and g should be the same for M = 1 and M = 2 and because, if M = 2,
both layers should have subshells of equal radius, f and g have to be even numbers. In addition,
to allow for all possible behavior, it is desirable that R is such that even shells that have one
subprototype in common have some distance between them. Using the results of the simulations,
it turns out that the combination (f, g) = (0, 28) satisfies all the requirements with some margin.
These parameter values will be used for generating the structured culture in Subsection 5.5.
4.2 End-state analysis
For the end-state analysis the only quantity that will be investigated is the collection of domain-
sizes. To compress this information into one number that also has the property of being an
order parameter (i.e. one value before the phase transition, another after the phase transition),
we typically use the normalized number of domains, which is denoted by $N_D$. Another order
parameter is the normalized size of the largest cluster, denoted by $S^{\max}_D$. Moreover, a quantity that
is typically only non-zero at the phase-transition and gives more information on the distribution
of cluster sizes is the CSE. Finally, the number of time-steps needed to achieve convergence (T )
will also be studied.
As the Axelrod model is stochastic, one can reliably study its dynamics only by employing
many runs. The number of runs (K) will be 100 for each parameter setting. The resulting
quantities will then be averages (e.g. $\langle N_D \rangle$) and in this thesis the average always implies the
average over multiple runs, unless stated otherwise. In addition, the standard deviation of the
normalized number of domains, $SD_D = \sqrt{\langle (N_D - \langle N_D \rangle)^2 \rangle}$, and the corresponding standard error,
$SE_D = SD_D/\sqrt{K}$, will be computed. (For $S^{\max}_D$, the standard deviation is denoted as $SD^{\max}_D$ and
similarly for the standard error.) We look at phase transitions in terms of ω for various values of
M. Effectively, therefore, we look at a two-dimensional phase transition. However, we are mostly
interested in the difference between having 1 layer (M = 1 or singleplex) and having multiple layers
(M > 1 or multiplex).
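The run statistics defined above correspond to the population standard deviation (division by K rather than K − 1). A minimal sketch using the Python standard library is given below; the function name and the example values are hypothetical.

```python
import math
import statistics


def run_statistics(values):
    """Mean, standard deviation and standard error over K runs.

    The standard deviation divides by K (population form), as in the
    definition sqrt(<(N_D - <N_D>)^2>).
    """
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    se = sd / math.sqrt(len(values))
    return mean, sd, se


# Hypothetical normalized numbers of domains from K = 4 runs.
mean, sd, se = run_statistics([0.0, 1.0, 0.0, 1.0])
```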
4.2.1 Cluster size entropy
For the cultural domains in the end-state, D, the number of domains $N_D$ and the largest domain
$S^{\max}_D$ compress a full set of cluster sizes into one number. Outside of the phase transition, this
is usually a trivial compression, but around the phase transition, much information is lost. For a
specific clustering, the CSE is the weighted entropy over the distribution of cluster sizes, that
is
$$CSE = -\sum_s W_s \log W_s,$$
where Ws is the probability that an element (agent) belongs to a cluster of size s [Gandica et al.,
2011]. The CSE has a value of zero when only one type of cluster size is present and the more
cluster sizes are present, the higher it becomes; at the phase transition, when there usually is some
degree of scale invariance, it reaches its maximum (i.e. the distribution over cluster sizes is then
closest to being uniform). It is weighted, since the size of the cluster is integrated into the
probabilities. Otherwise, it would not be a useful measure, since e.g. a clustering of two clusters,
where one is singleton and the other comprises the rest, would give rise to an entropy of log 2,
while it is supposed to give a very low entropy.
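The weighted entropy can be computed from a list of cluster sizes as in the following sketch. The natural logarithm is assumed, since the base is not fixed above, and the function name is hypothetical.

```python
import math
from collections import Counter


def cluster_size_entropy(cluster_sizes):
    """CSE = -sum_s W_s log W_s, with W_s the fraction of agents that belong
    to a cluster of size s."""
    n_agents = sum(cluster_sizes)
    agents_per_size = Counter()
    for s in cluster_sizes:
        agents_per_size[s] += s  # each cluster of size s contributes s agents
    weights = [n / n_agents for n in agents_per_size.values()]
    return sum(-w * math.log(w) for w in weights)  # assumption: natural logarithm


one_cluster = cluster_size_entropy([100])          # a single cluster size
singleton_and_rest = cluster_size_entropy([1, 99])  # far below log 2
```

For the singleton-plus-rest clustering the weights are 0.01 and 0.99, which gives an entropy well below log 2, in line with the motivation for weighting.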
This measure can be normalized as follows. The largest possible value of the entropy would occur
when all Ws have the same value and when there are as many cluster sizes as possible. These two
requirements are constrained by the fact that the total number of agents is N and that clusters are
discrete objects (i.e. half a cluster is not possible). From this it follows that $\sqrt{N}$ is an upper bound
for the maximum entropy. For example, if N = 100 a maximal entropy is obtained by having 10
clusters of 1, 5 clusters of 2, 2 clusters of 5 and 1 cluster of 10, but the remaining 60 agents cannot
be evenly distributed over the remaining 6 cluster sizes, so there is less entropy compared to the
case where agents are evenly distributed over the cluster sizes.
4.3 Dynamical analysis
When studying the system dynamically, it makes sense to study the ‘most interesting’ case. Since
each system is investigated for different ω we therefore choose ω’s close to ωc. Also note that it
would not make sense to sample each time step, since too much data would be obtained. Therefore,
only once every Y time-steps a measurement is taken from the system. Unless otherwise stated,
Y = N ; this is equivalent to one (attempted) update per agent on average. Just a few runs are
observed for each parameter set; no averages over runs will be performed for the dynamical case.
For the purpose of studying the Axelrod dynamics many observables can be computed over
time. All the observables discussed at the beginning of this section, such as the cultural compo-
nents, will be shown (that is, the number of clusters in each case); in addition, to look at the
difference between the layer dependent measures, the variation of information (to be explained
later in this subsection) is computed between both the cultural zones and cultural regions for a
pair of layers. Denote the clustering of cultural components by D(ω), cultural regions in layer i
by $D_i$ and cultural zones in layer i by $D_i(\omega)$, where the dependence on ω is used to indicate the
explicit dependence of the measure on ω (note, however, that the other measures also depend on ω
indirectly since the Axelrod dynamics depends on it). The normalized number of clusters is then
denoted by $N_X$ for a clustering X. Finally, if X is a clustering, denote by X[n] its nth cluster.
Since the number of observables grows fast when the number of layers increases, only the cases
M = 1 and M = 2 are investigated. Presumably, many of the insights in the dynamical behavior
that are obtained by analyzing just two layers can also be applied to more than two layers.
4.3.1 Network observables
The Axelrod model could also have an effect on the social multiplex if the network updates over
time. Correlations may develop between the layers as a corollary to the Axelrod dynamics. This
will be investigated dynamically by measuring the correlation every Y time-steps. The correlation
between two unweighted, undirected networks can simply be computed as the correlation
between the corresponding adjacency lists (i.e. the extent to which a link between node i and j
is present in layer 1 is matched by the same link in layer 2). Formally, the correlation coefficient
between layer α and γ is
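\[
\rho_{\alpha,\gamma} = \frac{\langle a^{\alpha}_{ij} a^{\gamma}_{ij}\rangle - \langle a^{\alpha}_{ij}\rangle \langle a^{\gamma}_{ij}\rangle}{\sqrt{(1 - \langle a^{\alpha}_{ij}\rangle)\langle a^{\alpha}_{ij}\rangle}\,\sqrt{(1 - \langle a^{\gamma}_{ij}\rangle)\langle a^{\gamma}_{ij}\rangle}},
\]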
where it should be noted that 〈a_ij〉 = 〈a_ij^2〉, since a_ij is a binary variable. The normalized
correlation is then obtained by dividing the correlation by two and adding 0.5.
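As an illustration, the correlation coefficient can be computed along the lines of the following minimal Python sketch. It assumes that the averages 〈·〉 run over all unordered node pairs i < j and that each layer is given as a symmetric 0/1 adjacency matrix; the function names are hypothetical and are not taken from the simulation code used for this thesis. The coefficient is undefined when a layer is empty or complete, which the sketch reports as an error (an assumed design choice).

```python
from itertools import combinations
from math import sqrt


def layer_correlation(a, b):
    """Pearson correlation between two unweighted, undirected layers.

    a, b: symmetric 0/1 adjacency matrices (lists of lists) on the same nodes.
    Averages are taken over all unordered pairs i < j (assumption).
    """
    pairs = list(combinations(range(len(a)), 2))
    mean_a = sum(a[i][j] for i, j in pairs) / len(pairs)
    mean_b = sum(b[i][j] for i, j in pairs) / len(pairs)
    mean_ab = sum(a[i][j] * b[i][j] for i, j in pairs) / len(pairs)
    # For binary entries <a^2> = <a>, so the variance equals <a>(1 - <a>).
    denom = sqrt((1 - mean_a) * mean_a) * sqrt((1 - mean_b) * mean_b)
    if denom == 0:
        raise ValueError("correlation is undefined for an empty or complete layer")
    return (mean_ab - mean_a * mean_b) / denom


def normalized_correlation(a, b):
    """Map the correlation from [-1, 1] to [0, 1]: divide by two, add 0.5."""
    return layer_correlation(a, b) / 2 + 0.5


# Toy layers on three nodes: a path 0-1-2 and its complement (single link 0-2).
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
complement = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
```

Identical layers give a normalized correlation of one, while a layer and its complement give zero.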
From the beginning we have assumed that the different layers represent distinct social networks.
It is therefore interesting to investigate whether the layers actually have the properties of
social networks. It was already shown that for some configurations, the initial layer looks like an
RG, so that it does not resemble a realistic social network. A structured initial culture may result
in layers that show properties of social networks like the small world and scale-free properties.
To study this, the size of the largest connected component G, the link density L, the clustering
coefficient C and the kurtosis of the degree distribution κ (as defined in Subsubsection 2.2.1) are
computed for each layer every Y time-steps. In general, if a network measure X corresponds to
layer α, this is denoted as Xα. Clearly, when the network does not update the network properties
stay the same and since they do not vary with the threshold they are the same for all runs.
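A minimal Python sketch of three of these layer observables is given below. The definitions of Subsubsection 2.2.1 are not repeated here, so the sketch assumes the following: G is the fraction of nodes in the largest connected component, L is the fraction of possible links that are present, and κ is the ordinary (non-excess) kurtosis of the degree sequence. The clustering coefficient is left out because its exact variant is not restated in this section. All function names are hypothetical.

```python
from collections import deque


def link_density(adj):
    """Fraction of the n(n-1)/2 possible undirected links that are present."""
    n = len(adj)
    links = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return links / (n * (n - 1) / 2)


def largest_component_fraction(adj):
    """Size of the largest connected component divided by n (breadth-first search)."""
    n = len(adj)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in range(n):
                if adj[u][v] and not seen[v]:
                    seen[v] = True
                    queue.append(v)
        best = max(best, size)
    return best / n


def degree_kurtosis(adj):
    """Fourth central moment of the degrees over the squared variance (assumed definition)."""
    degrees = [sum(row) for row in adj]
    n = len(degrees)
    mean = sum(degrees) / n
    m2 = sum((k - mean) ** 2 for k in degrees) / n
    m4 = sum((k - mean) ** 4 for k in degrees) / n
    return m4 / m2 ** 2 if m2 > 0 else float("nan")


# Toy layer: a path 0-1-2 plus an isolated node 3.
toy = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
```

In this sketch the kurtosis is undefined (returned as NaN) when all degrees are equal, for instance in a fully connected layer.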
4.3.2 Variation of information
A measure of discrepancy between two clusterings is the Variation of Information (VI) [Meila,
2003]. If A is a set and X = {X1, ..., Xk} and Y = {Y1, ..., Yl} are such that Xi ∩ Xj = ∅ for
all i ≠ j and ∪_{i=1}^{k} Xi = A (similarly for Y), then X and Y are clusterings (or partitions) of A. In
addition, let N = |A| and let pi = |Xi|/N , while qj = |Yj |/N . Note that pi is the probability that
a randomly picked element of A is in Xi (similarly for qj). The VI between X and Y can then be
defined as
\[
VI(X;Y) = H(X) + H(Y) - 2I(X,Y),
\]
where H(X) is the entropy of X, defined by
\[
H(X) = -\sum_{i=1}^{k} p_i \log(p_i)
\]
(similarly for H(Y )) and I(X,Y ) is the mutual information, defined by
\[
I(X,Y) = \sum_{i=1}^{k} \sum_{j=1}^{l} r_{ij} \log\left(\frac{r_{ij}}{p_i q_j}\right),
\]
where rij = |Xi ∩ Yj|/N is the joint probability of randomly selecting an element in A that is both
in Xi and Yj. It is easily seen that if the clusterings are the same, rii = pi = qi (and rij = 0
for i ≠ j), so that I(X,Y) = H(X) = H(Y), which implies VI(X;Y) = 0. Similarly, if the two
clusterings are completely independent, then I(X,Y) = 0, so VI(X;Y) = H(X) + H(Y). The
VI can be conveniently normalized by dividing by log(N), since this is the maximum value that
H(X) or H(Y ) can have, which will be done in the sequel. Note that this measure can then only
be compared for systems that have the same N .
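The definitions above translate directly into code. The following minimal Python sketch represents a clustering as a list that assigns a cluster label to every element of A; this representation and the function names are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def entropy(labels):
    """H(X) = -sum_i p_i log(p_i), with p_i the relative cluster sizes."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """I(X,Y) = sum_ij r_ij log(r_ij / (p_i q_j)); empty intersections contribute zero."""
    n = len(x)
    p, q, r = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((rij / n) * log((rij / n) / ((p[i] / n) * (q[j] / n)))
               for (i, j), rij in r.items())


def variation_of_information(x, y, normalized=True):
    """VI(X;Y) = H(X) + H(Y) - 2 I(X,Y), optionally divided by log(N)."""
    vi = entropy(x) + entropy(y) - 2 * mutual_information(x, y)
    vi = max(vi, 0.0)  # guard against tiny negative values from rounding
    if normalized and len(x) > 1:
        vi /= log(len(x))
    return vi


same = variation_of_information([0, 0, 1, 1], [5, 5, 7, 7])
independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Here `same` compares two clusterings that differ only in their labels, and `independent` compares two clusterings of four elements whose clusters cross each other evenly, so that I(X,Y) = 0.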
It is always the case that cultural zones Di(ω) have partly the same structure since all Di(ω)
are a refinement of D(ω), as was explained in Subsubsection 4.1.1. Here, this means that when
comparing two layers, the matrix with entries rij consists of blocks on the diagonal and is zero
everywhere else. To get a consistent measure of the variation of information, one has to take this
into account. One way to do this, is to compute for each cultural component, the (normalized) VI
separately and then compute the weighted average over all cultural components, where the weight
is the relative size of the component. The resulting measure is normalized and called the Weighted
VI (WVI).
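A possible implementation of the WVI is sketched below in Python. The text does not fix how the VI within a single cultural component is normalized; the sketch assumes division by the logarithm of the component size and assigns a VI of zero to singleton components. Both choices, as well as the function names, are assumptions.

```python
from collections import Counter, defaultdict
from math import log


def _vi(x, y):
    """Unnormalized variation of information between two label lists."""
    n = len(x)

    def h(labels):
        return -sum((c / n) * log(c / n) for c in Counter(labels).values())

    p, q = Counter(x), Counter(y)
    mi = sum((r / n) * log(r * n / (p[i] * q[j]))
             for (i, j), r in Counter(zip(x, y)).items())
    return max(h(x) + h(y) - 2 * mi, 0.0)


def weighted_vi(components, zones_1, zones_2):
    """Size-weighted average over cultural components of the normalized VI
    between the cultural zones of two layers.

    components, zones_1, zones_2: one label per agent.
    """
    n = len(components)
    members = defaultdict(list)
    for agent, comp in enumerate(components):
        members[comp].append(agent)
    total = 0.0
    for agents in members.values():
        size = len(agents)
        if size < 2:
            continue  # a singleton component contributes zero
        vi = _vi([zones_1[a] for a in agents], [zones_2[a] for a in agents])
        total += (size / n) * vi / log(size)  # assumed normalization
    return total


# Six agents: one component of four (zones cross between layers) and one of two.
wvi = weighted_vi([0, 0, 0, 0, 1, 1], [0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 2, 2])
```

In this toy case the first component contributes a normalized VI of one with weight 4/6 and the second contributes zero, so the WVI equals 2/3.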
5 Results and discussion
As was outlined in Subsection 4.1, this simulation study focuses on four treatments. The results of
these will be shown and discussed in Subsections 5.2, 5.3, 5.4 and 5.5. Note that Subsection 5.2 is
fundamental, while the next two subsections build on the results presented there; Subsection 5.5
uses the results obtained in the previous subsections but also constitutes a different approach to
studying multiplexity and therefore differs somewhat from the rest. Table 1 presents an overview
of the four treatments. Before the actual treatments are discussed, Subsection 5.1 will go into a
trivial version of the model that is discussed in treatment 1 to show some differences compared
to the singleplex that already arise from the generalized model structure itself. Finally, it was
observed that all of the networks are connected at all times t, so G = 1 for any case we have
considered; it will not be shown each time in the results below.
Treatment   Assortativity   Updating   Culture
1           no              no         random
2           yes             no         random
3           yes             yes        random
4           yes             yes        structured
Table 1: Characteristics of the four treatments
5.1 Treatment 0: trivial multiplex
In this subsection a useful intermediate case between a singleplex and a multiplex is discussed,
namely a multiplex that has the same graph in each layer (i.e. a trivial multiplex). The main
result is that there is more cultural convergence for a trivial multiplex compared to a singleplex,
due to the compartmentalization of the generalized Axelrod model.
5.1.1 End-state results
In Figure 1 results are shown for a singleplex and two multiplices (M = 2 and M = 36) that have
the same RG graph in each layer (i.e. trivial multiplices). Note that it is not possible to include
assortativity in the trivial multiplex condition, since the networks would then be generated by the
cultural subspaces and will typically be different from each other (the same holds, of course, when
networks update).
Clearly, there are differences between the three cases. First, the trivial multiplex condition
shows more convergence (i.e. lower values of 〈ND〉) for all ω than the singleplex, although the
difference is small for M = 2. Second, it is clearly the case that 〈T 〉 is larger for increasing M ,
especially when M = 36.
[Figure 1. Left panel: 〈ND〉 versus ω; right panel: 〈T〉 versus ω; legend: M = 1, 2, 36.]
Figure 1: Average number of domains 〈ND〉 (left) and average number of time steps 〈T 〉 (right) in
the end-state as a function of ω for M = 1, M = 2 and M = 36 (treatment 0)
5.1.2 Discussion
A trivial multiplex implies that if two agents are connected in one layer they are connected in
every layer and vice versa. If an interaction is successful in a singleplex, the active agent randomly
chooses a feature from the set of all features in which the two differ. However, when such an
interaction occurs in a trivial multiplex, the distribution over features that differ is not the same,
since the probability to pick any such feature is 1/M times the probability of randomly selecting
that feature out of the list of features for which the two differ in the cultural subspace associated
to that layer.
In addition, if for some pair (i, j) layer α is selected, it may even occur that dβαij = 0, so that the
probability of selecting a feature for which the two differ is not just different, it is zero. Therefore,
if a pair of nodes has the same cultural overlap in the two cases, their propensity to interact is the
same, but if the pair has a layer with subcultural distance equal to zero (in the case of the trivial
multiplex), then there is a large probability that, even though the interaction would have been
successful, there can be no cultural influence. However, in a singleplex such an interaction would
always occur, unless the agents are culturally identical (i.e. dij = 0), in which case there is no need
for further interaction anyway. This feature introduces a kind of compartmentalization in the
generalized Axelrod model, which is not present in the ordinary Axelrod model on a singleplex.
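The difference between the two feature-selection rules can be made concrete with a small Python sketch. It assumes that the F features are split into M contiguous cultural subspaces of equal size and that, for a linked pair in a trivial multiplex, the layer is drawn uniformly; the function names and the toy numbers are hypothetical.

```python
from fractions import Fraction


def singleplex_probs(differing):
    """Singleplex: choose uniformly among all features in which the agents differ."""
    return {f: Fraction(1, len(differing)) for f in differing}


def trivial_multiplex_probs(differing, n_features, n_layers):
    """Trivial multiplex: pick a layer with probability 1/M, then choose uniformly
    among the differing features inside that layer's cultural subspace.

    Returns the per-feature probabilities and the probability that the selected
    layer has subcultural distance zero, so that no influence is possible.
    """
    block = n_features // n_layers  # assumed: equal-size, contiguous subspaces
    probs = {}
    no_influence = Fraction(0)
    for layer in range(n_layers):
        local = [f for f in differing if f // block == layer]
        if not local:
            no_influence += Fraction(1, n_layers)
            continue
        for f in local:
            probs[f] = Fraction(1, n_layers) / len(local)
    return probs, no_influence


# F = 6 features, M = 2 layers; the agents differ in features 0, 3 and 4.
mixed, blocked_mixed = trivial_multiplex_probs([0, 3, 4], 6, 2)
# The agents differ only inside the first subspace (features 0, 1 and 2).
one_sided, blocked_one_sided = trivial_multiplex_probs([0, 1, 2], 6, 2)
```

In the first case feature 0 is copied with probability 1/2 instead of the singleplex value 1/3; in the second case half of the otherwise successful interactions cannot lead to any cultural influence.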
Globally, this means that in the trivial multiplex case it will typically happen that cultural
zones (the cultural zones are the same in both layers since the network is the same) form with
some pairs of agents (i, j) that have dβαij = 0 for some α. Cultural convergence within such zones
will then take longer on account of these pairs, increasing the possibility that the cultural zone
merges with another cultural zone. The more culturally similar the zones become, the more likely
the existence of such pairs is. Because there is a larger probability that cultural zones merge, there
is more cultural convergence by arguments similar to those in [Axelrod, 1997] (see Subsection 2.1),
even without having different networks in different layers.
The more layers there are, the larger the probability of encountering a pair of nodes (i, j) in
layer α that has dβαij = 0 while dij > 0, since Z is smaller. It seems, therefore, that compartmentalization
causes cultural convergence and is associated with longer running times, as was observed
in the simulation results.
5.2 Treatment 1: random culture and random, static networks
In this section, the simulation results with respect to the first treatment are discussed. To be
consistent with the result in 4.1.2, for the RG we set p = 1/q. It will be shown that multiplexity
typically leads to more cultural convergence, but this originates from multiple effects, some of
which counteract cultural convergence. The most important mechanism is that the cultural zones,
by not overlapping perfectly between different layers, interact indirectly to produce more cultural
convergence, while increasing the time to reach the end-state.
5.2.1 End-state results
In Figures 2 and 3, the end-state results are shown (i.e. the observables for each M). Note that,
according to both CSED and SED the phase transition is at ωc = 0.68, although for M = 1 SED
is large at ω = 0.71 as well, while for M = 36 the same holds with respect to ω = 0.65. There
seems to be a clear hierarchy at ωc, where the different scenarios are ordered almost perfectly
according to 〈ND〉 (i.e. if M > M ′, then 〈ND〉 < 〈N ′D〉); the only exception is the pair (9, 12).
Note that these are averages and since the standard error is in the range 0.01− 0.025 the ordering
result is not statistically significant for large M , especially since the large standard errors occur at
large M . Most likely, the differences between the largest values of M are small, so that the number
of runs K should be even bigger to establish a statistically significant difference. Nonetheless, it
is clear that multiplexity increases cultural convergence at ωc. This also seems to hold to some
extent at ω = 0.71. For ω = 0.65, however, the singleplex shows more cultural convergence than
the multiplex cases. Together, these observations imply that the phase transition is steeper for the
multiplex cases.
Furthermore, it is clear that when M is larger, 〈T 〉 is larger as well; again, the cases are
almost perfectly ordered in 〈T 〉. 〈T 〉 seems to be largest at ωc for most M , but the singleplex case
is the only exception with ω = 0.71, which is consistent with the fact that SED was large for this
value as well. It is not surprising that T is largest at the critical threshold of a phase transition;
as the system could go either way, it needs a long time to ‘decide’. Furthermore, it is intuitive
that if M is larger, then 〈T 〉 is larger, since there will be many more interactions, mainly because
of compartmentalization. Also, in all cases 〈T 〉 is much larger on the right side of the critical
threshold and jumps to its peak at this threshold.
[Figure 2. Left panel: 〈ND〉 versus ω; right panel: SED versus ω; legend: M = 1, 2, 3, 4, 6, 9, 12, 18, 36.]
Figure 2: Average number of domains 〈ND〉 (left) and associated standard error SED (right) in
the end-state as a function of ω for multiple values of M (treatment 1)
[Figure 3. Left panel: 〈CSED〉 versus ω; right panel: 〈T〉 versus ω; legend: M = 1, 2, 3, 4, 6, 9, 12, 18, 36.]
Figure 3: Average cluster size entropy 〈CSED〉 (left) and average number of time steps 〈T〉
(right) in the end-state as a function of ω for multiple values of M (treatment 1)
Finally, the CSED values are also ordered according to M , where the singleplex has the largest
value. Recall from Subsubsection 4.2.1 that the CSE is largest when the distribution of cluster
sizes is as uniform as possible; the cluster sizes of the singleplex are more uniformly distributed
than those of the multiplex cases. Moreover, note that in general the range of ω for which there is
any difference between the various values of M is small, since the phase transitions are steep.
5.2.2 Dynamical results
The difference between the singleplex and multiplex is largest (in terms of 〈ND〉) at the critical
threshold. However, since for M = 2 not much cultural convergence has been achieved at this value,
the mechanisms just described manifest themselves better for ω = 0.71. Note that although different
runs typically give different results (especially around the phase transition), some qualitative
features are the same in all runs. In Figure 4, a plot is shown of a typical run when ω = 0.71.
The following observations can be made regarding the overlap between zones. ND1(ω) is
roughly equal to ND2(ω) after a long period of large VI between the two and they are close to
ND(ω). In addition, WVI(D1(ω), D2(ω)) is small at t = T, so that the cultural zones match well
in the end-state.
[Figure 4. Observables versus t (x100); legend: VI(D1;D2), WVI(D1(ω);D2(ω)), ND, ND1, ND2, ND(ω), ND1(ω), ND2(ω).]
Figure 4: variation of information between cultural regions VI(D1, D2), weighted variation of
information between cultural zones WVI(D1(ω), D2(ω)), number of cultural domains ND, cultural
regions (1) ND1, cultural regions (2) ND2, cultural components ND(ω), cultural zones (1) ND1(ω)
and cultural zones (2) ND2(ω) on a typical run for (M,ω) = (2, 0.71) as a function of t (treatment
1)
Moreover, it is observed that ND(ω) typically does not decrease that much over the run of the model,
so that there is little decrease in cultural components. ND(ω) fluctuates due to randomness;
however, fluctuations are only robust if they coincide with fluctuations in the number of cultural
zones since only then can the fluctuation be sustained by interaction.
Finally, ND, ND1 and ND2 are large for a long time, before they decrease. This is similar
to the results in [Axelrod, 1997]. In addition, VI(D1, D2) also becomes small at t = T (note
that the cultural regions and zones are the same at t = T by construction, but the variation of
information measures are different so they give different results). There does not seem to be much
non-monotonicity in the convergence of the number of cultural regions in the layers. More specifically, it
could have happened that one layer converged to its end-state (froze) while the other still
exhibited dynamics, causing the former layer to become active again (unfreeze). In the graph,
this would translate into NDi(ω) = NDi for some time and then NDi(ω) ≠ NDi, but this is clearly
not the case.
5.2.3 Discussion
Below some mechanisms will be discussed that may underlie the observed behavior in the results
discussed above. First, for simplicity, consider the case with only two layers. At t = 0, it will
typically be the case that there are multiple cultural components and within each cultural component
there are multiple cultural zones, which are different in each layer. As was already noted
in Subsection 2.1, boundaries between cultural regions within a cultural zone tend to dissolve over
time, just like boundaries between cultural zones. The cultural zones are defined such that when
the number of cultural zones is equal to the number of cultural regions, the dynamics in that
specific layer stop; this can be considered the temporary end-state for that layer (only interactions
in the other layer may change this). This end-state roughly depends on two things: the number of
cultural zones at t = 0 (the number of cultural regions in the end-state will typically be at most
equal to this) and the extent to which the boundaries between cultural zones tend to dissolve over
time, as was noted in [Axelrod, 1997].
The boundaries between zones can dissolve in two ways, namely within a cultural component
and between cultural zones in different cultural components. In both cases, there are pairs of
agents that have a link but do not yet have enough cultural overlap (recall that within a cultural
component not all pairs (i, j) have dij < ω). Multiplexity will increase the likelihood of both
mechanisms in different ways.
First, it makes it easier for boundaries to dissolve within cultural components, since cultural
zones within a cultural component do not match perfectly between different layers, so two zones
are ‘linked’ indirectly in one layer if some of the corresponding agents are linked in another. More
specifically, consider two nodes i and j that have a^1_ij = 1, a^2_ij = 0, dij < ω, i, j ∈ D1(ω)[m] and
i ∈ D2(ω)[n], but j ∈ D2(ω)[n′] with n ≠ n′. Then they can interact and grow closer only in layer
1. However, suppose that as an indirect result of this interaction, another agent k will be brought
in the position of satisfying k ∈ D1(ω)[m], k ∈ D2(ω)[n], a^2_jk = 1 and djk < ω (or, equivalently,
replace i by j and n by n′), then the boundary between the cultural zones D2(ω)[n] and D2(ω)[n′]
has dissolved. Note that the existence of such an agent is plausible, since the cultural convergence
between i and j may cause k, which was already close to i to become close to j as well, or vice
versa. Typically, there are many candidate k’s in such a cultural zone (the only real requirement
being that it has a link to either i or j in layer 2). This overlap mechanism was illustrated in the
dynamical results above: the number of cultural zones in both layers will be close to the number
of cultural components at t = T since within components overlap between layers tends to result in
the formation of just one cultural zone per cultural component.
The second way will also be easier with multiplexity: since there are now two layers, the
probability that in at least one of these a boundary is dissolved between cultural components is
larger than in the single layer case. In an extreme case, if one layer freezes, the other may still
exhibit dynamics and this in turn can cause the first layer to unfreeze again. From the dynamical
results it follows that this decrease in cultural components is not dominant. This makes sense
since pairs of agents within a cultural component typically have larger cultural overlap initially,
compared to agents in different cultural components, so less interaction has to occur before they
are similar enough to cross the threshold.
Third, due to the compartmentalization of the generalized Axelrod model, the dynamics take
longer (as was explained in Subsection 5.1), which also increases convergence.
Note, however, that this effect is small, since for M = 2 only 1/q^2 of all pairs have links in
both layers and not all of these pairs will have significant cultural overlap. Typically, these three
mechanisms will be associated with longer running times T and more cultural convergence. This
is clearly consistent with the results from the end-state which indicate that T is larger when the
number of layers is larger, T distinguishes clearly between one layer and multiple layers and there
is typically more cultural convergence when M increases.
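The fraction 1/q^2 used above follows in one line if the two layers are assumed to be independent random graphs with link probability p = 1/q, as set at the start of this subsection (independence of the layers is an assumption of this short derivation):

```latex
\[
P\left(a^{1}_{ij} = 1,\; a^{2}_{ij} = 1\right)
  = P\left(a^{1}_{ij} = 1\right) P\left(a^{2}_{ij} = 1\right)
  = p^{2} = \frac{1}{q^{2}}.
\]
```

Under the same assumption, the probability that a pair is linked in at least one of M layers is 1 − (1 − p)^M, which grows with M.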
There is also a force against convergence, namely the fact that if in the end-state two cultural
region clusterings are attained that are non-trivial (in the sense that there are multiple nontrivial
clusters), the cultural domain will roughly be the ‘intersection’ of the two (the probability of
two separate cultural regions with identical cultures is negligible), which will typically result in
more domains. This effect was observed in the end-state results since the singleplex had a higher
CSED than the multiplex with M = 2, so the latter corresponds to a system with fewer cluster
sizes. Especially at the critical point of the phase transition with respect to ω this effect will be
pronounced since there will most likely be cultural regions of many different sizes (due to some
degree of scale invariance), so if they intersect the domain sizes will be smaller. However, if there
will mostly be one cultural zone per cultural component in both layers, this effect is small. The
effect will be largest when not much dynamics has taken place, but enough so that NDi(ω) < 1
for i = 1, 2; most likely this will occur at the threshold that is one unit smaller than the critical
threshold. The results from the end-state on the phase transition confirm that the behavior is
dependent on ω in the same way as described here.
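The counteracting 'intersection' effect described above can be illustrated with a short Python sketch. It assumes, as argued in the text, that two separate cultural regions in the same layer never carry identical subcultures, so that a cultural domain is identified by the pair of region labels of an agent; network connectivity of the resulting domains is ignored. The function names are hypothetical.

```python
def domains_from_regions(regions_1, regions_2):
    """Label every agent by the pair of cultural regions it belongs to in the
    two layers; agents with the same pair share the full culture."""
    return list(zip(regions_1, regions_2))


def n_clusters(labels):
    """Number of distinct clusters in a labelling."""
    return len(set(labels))


# Six agents, two cultural regions per layer that only partly overlap.
regions_1 = [0, 0, 0, 1, 1, 1]
regions_2 = [0, 0, 1, 1, 1, 0]
domains = domains_from_regions(regions_1, regions_2)
```

In this toy example each layer has two cultural regions, but their intersection yields four cultural domains.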
For more than two layers, all effects simply become stronger. For example, there will be much
more overlap within each cultural component if one considers all the pairwise overlaps between the
cultural zones in the respective layers. This was also observed in the end-state results since most
effects were monotone for increasing M .
5.3 Treatment 2: network-culture assortativity
Note that the only difference between this treatment and the previous one is the state at t = 0.
Surely, this has an effect on the end-result, but not on the dynamical mechanisms described in
the previous subsection. Therefore, no dynamic results were analyzed for this case. The most
important finding here is that assortativity largely promotes cultural convergence, but introduces
some non-monotonicity in the effects of M .
5.3.1 End-state results
The end-state results of the second treatment are exhibited in Figures 5 and 6. Note first that the
case M = 36 is trivial now; this is because pαij is either one or zero, since the cultural subspace is
one-dimensional. If a link forms, no further dynamics can take place since the agents are already
the same in that subspace; if no link forms, there will not be dynamics either. Therefore, there
can be no dynamics and therefore no cultural convergence.
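The triviality of M = 36 can be checked with a small Python sketch. It assumes that the link probability pαij equals the fraction of features in cultural subspace α that agents i and j share, and that the subspaces are contiguous blocks of equal size; the precise definition is given earlier in the thesis and is not repeated here, so both points are assumptions, as is the function name.

```python
from fractions import Fraction


def link_probability(culture_i, culture_j, layer, n_layers):
    """Assumed link probability: shared fraction of the features in one subspace."""
    block = len(culture_i) // n_layers  # assumed: equal-size, contiguous subspaces
    lo, hi = layer * block, (layer + 1) * block
    shared = sum(culture_i[f] == culture_j[f] for f in range(lo, hi))
    return Fraction(shared, block)


# Toy cultures with F = 4 features.
ci = [1, 2, 3, 4]
cj = [1, 0, 3, 4]
two_layers = [link_probability(ci, cj, a, 2) for a in range(2)]
four_layers = [link_probability(ci, cj, a, 4) for a in range(4)]
```

With one feature per layer every entry of `four_layers` is either zero or one: a linked pair already agrees on that feature, and an unlinked pair cannot interact in that layer.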
As in treatment 1, 〈ND〉 is smaller for all M , but the ordering becomes less clear for large
M and the differences between consecutive M ’s become smaller as M increases in agreement with
prediction (the first observation most likely implies that the differences are very small). There is
now also nontrivial behaviour for ω = 0.62: the phase transition is less steep for most M , compared
to the first treatment.
Furthermore, SED is similar in size (there is not more uncertainty in the estimate 〈ND〉) and
the ordering in terms of M is the same, but it indicates that the phase transition shifts to
the left for M > 2. This is in contrast to CSED which indicates that the phase transition shifts
to the left for M > 4 and shifts back for M = 18. However, the intermediate cases do not clearly
distinguish between the two values of ω. These findings show that a lower threshold is needed for
cultural convergence, especially for intermediate M .
Note also that the CSED shows similar ordering as before, but has become larger for most
M . This indicates that fewer small domains are obtained in the end-state. Finally, 〈T 〉 is much
smaller for all M .
[Figure 5. Left panel: 〈ND〉 versus ω; right panel: SED versus ω; legend: M = 1, 2, 3, 4, 6, 9, 12, 18, 36.]
Figure 5: Average number of domains 〈ND〉 (left) and associated standard error SED (right) in
the end-state as a function of ω for multiple values of M (treatment 2)
[Figure 6. Left panel: 〈CSED〉 versus ω; right panel: 〈T〉 versus ω; legend: M = 1, 2, 3, 4, 6, 9, 12, 18, 36.]
Figure 6: Average cluster size entropy 〈CSED〉 (left) and average number of time steps 〈T〉
(right) in the end-state as a function of ω for multiple values of M (treatment 2)
5.3.2 Discussion
The effect of assortativity is that agents that are linked on the social network are likely to be
similar already, so the system starts in a state where there are fewer cultural zones per cultural
component (increasing the number of useful links each agent has). Furthermore, agents that have
links in multiple layers will typically also have dij < ω, so that the effect of compartmentalization
becomes stronger. Because of these two effects, more cultural convergence will occur, compared to
treatment 1. The result that 〈T〉 is smaller for all M is intuitive in the sense that fewer interactions in
the layers have to take place before cultural zones overlap within cultural components, so less time
is needed for the system to reach its end-state. However, compartmentalization would increase 〈T 〉;
apparently, the first mechanism is stronger in terms of the effect on 〈T 〉. Moreover, the fact that
the CSED has become larger is also due to the fact that there are fewer cultural zones per cultural
component initially.
However, these effects become smaller when M becomes larger, since pαij is dependent on a smaller
subset of features for increasing M , so the correlation between cultural and network links will
be smaller. This may then be partly compensated by the fact that the networks show more
clustering as M increases (see Subsection 3.1.1), which could lead to more cultural convergence,
as was discussed in Subsubsection 2.2.1. This is consistent with the end-state results that for
intermediate M the convergence is largest with respect to treatment 1, since such M are optimal
in the sense of this trade-off.
Finally, an interesting observation is that if there is no BC (i.e. ω = 1) and q is such that
there is one giant component on the network, there is only one cultural zone at the beginning
and cultural convergence is almost surely guaranteed. These conditions are satisfied for all M
except M = 36 (which is trivial anyway), so for ω = 1 the singleplex and multiplex are roughly
the same in terms of cultural convergence since both have one cultural zone from the start. Only
if the threshold decreases will differences appear. Therefore, BC is a necessary counterbalance to
assortativity: assortativity would always lead to cultural convergence if there were no BC.
5.4 Treatment 3: updating networks
In this subsection, the previous treatment is extended by letting each agent update its set of
neighbors after a successful interaction in the associated layer. In a sense, this is just a dynamical
extension of the initial assortativity. Mainly, it is found that the updating mechanism reduces
cultural convergence by decreasing opportunities for interaction with culturally more distant agents
at later times, so that the system settles down faster. In addition, there is some indication of
increased clustering in the layer networks (i.e. more like social networks) as well as correlation
between them over time, although this only occurs for ω such that the system converges partially,
while the layer networks converge to fully connected networks if full cultural convergence is present.
5.4.1 End-state results
The end-state results are shown in Figures 7 and 8. Note first that for M = 18 the system does
not fully converge for any value of ω, like the case M = 36, but shows some convergence, especially
for larger ω (not shown in the figure). In contrast, it has the largest value of 〈T 〉, indicating that
it needs many time-steps to reach its end-state. This will be explained in more detail below.
In addition, there is less convergence for all values of the threshold compared to treatment
2, which is especially pronounced for ω = 0.68. The ordering in 〈ND〉 is also less clear and
for small values of M the differences between 〈ND〉 at ω = 0.68 are biggest. Whereas first the
phase transition was steepest for large M , now the phase transition is steepest for the singleplex.
This implies that for smaller ω the network dynamics constrain cultural convergence, but when ω
becomes larger the number of cultural zones is already so small that the network dynamics do not
hinder cultural convergence. This is also consistent with the results for 〈T 〉, where the ordering is
very different. Before ω = 0.71 the ordering is standard, but starting at this confidence threshold,
the ordering is reversed and higher M roughly implies lower 〈T 〉. This means that for ω ≥ 0.71
network updating may facilitate the Axelrod dynamics on the multiplex. Also note that in general
〈T 〉 is somewhat smaller in the case of updating networks.
Finally, according to SED, ωc = 0.71 for the singleplex and shifts to the left more rapidly.
The values of the SED also vary more in size for different M . However, according to CSED, the
critical confidence ωc is 0.68 for all nontrivial M , except M = 1, for which it is 0.71. Note that for
ω = 0.71, the difference in CSED between ω = 0.68 and ω = 0.71 is very small. In a sense there
is no clear ωc since the phase transition is too steep. In addition, the order seems partly reversed,
with the singleplex having the lowest CSED, while M = 12 has the highest value. It seems that
the network dynamics increase the correlation between cultural zones within cultural components
even more than in the non-updating case, at least for moderate M .
[Figure 7. Left panel: 〈ND〉 versus ω; right panel: SED versus ω; legend: M = 1, 2, 3, 4, 6, 9, 12, 18, 36.]
Figure 7: Average number of domains 〈ND〉 (left) and associated standard error SED (right) in
the end-state as a function of ω for multiple values of M (treatment 3)
[Figure 8. Left panel: 〈CSED〉 versus ω; right panel: 〈T〉 versus ω; legend: M = 1, 2, 3, 4, 6, 9, 12, 18, 36.]
Figure 8: Average cluster size entropy 〈CSED〉 (left) and average number of time steps 〈T〉
(right) in the end-state as a function of ω for multiple values of M (treatment 3)
5.4.2 Dynamical results
Figures 9 and 10 show the dynamical observables (both cultural and network) for a typical run
with M = 2 for both ω = 0.68 and ω = 0.71. The kurtosis of the degree distribution, κ, is divided
by 10 to keep it in the unit interval for most t. Note that from the end-state results, it can be seen
that for M = 2 the difference in end-state is large between ω = 0.68 and ω = 0.71 (although not
as large as in the singleplex case).
[Figure 9. Both panels: observables versus t (x100); legend: VI(D1;D2), WVI(D1(ω);D2(ω)), ND, ND1, ND2, ND(ω), ND1(ω), ND2(ω).]
Figure 9: variation of information between cultural regions VI(D1, D2), weighted variation of
information between cultural zones WVI(D1(ω), D2(ω)), number of cultural domains ND, cultural
regions (1) ND1, cultural regions (2) ND2, cultural components ND(ω), cultural zones (1) ND1(ω)
and cultural zones (2) ND2(ω) on a typical run for (M,ω) = (2, 0.68) (left) and (M,ω) = (2, 0.71)
(right) as a function of t (treatment 3)
[Figure 10 panels: network observables versus t (×100), for ω = 0.68 (left) and ω = 0.71 (right).]
Figure 10: Network correlation ρ, link density (1) L1, link density (2) L2, clustering coefficient
(1) C1, clustering coefficient (2) C2, kurtosis of the degree distribution (1) κ1 and kurtosis of the
degree distribution (2) κ2 (both divided by 10) on a typical run for (M, ω) = (2, 0.68) (left) and
(M, ω) = (2, 0.71) (right) as a function of t (treatment 3)
For both ω = 0.68 and ω = 0.71 it is still true that the overlap mechanism works (since there is a
large WVI which is followed by a decrease in the number of cultural zones NDi(ω) for i = 1, 2).
Clearly, there is a difference between the two values of ω, since one corresponds to almost full
convergence, while the other is the critical value. Compared to the dynamical results in treatment 1,
there is no fluctuation at all in NDi and ND until they decline rapidly at the end. Another difference
is that for i = 1, 2, NDi(ω) first increases somewhat after t = 0 before it starts to decrease. Also note
that at t = 0, the values of the network properties are similar to those of an RG with p = 1/q, as
expected.
The network evolution shows a big qualitative difference between ω = 0.68 and ω = 0.71.
When ω = 0.68 it seems that the network becomes correlated (not only ρ indicates this but also
the remaining measures since they are roughly equal) and the link density stays small, while for
ω = 0.71 both networks become almost fully connected at the end, so it makes sense that ρ is large
(notice that ρ only increases at the end).
When ω = 0.71, Li ≈ Ci for i = 1, 2, so that the network closely resembles an RG for any
time t. Furthermore, note that κi becomes very large at t = T for both i, since the variation in
the degree distribution diminishes at that point. Finally, the fact that for ω = 0.68 the two layers
have clustering coefficients that are much larger than the corresponding link densities indicates
that the graph is far from an RG and more like an actual social network.
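The criterion used above (an RG has a clustering coefficient roughly equal to its link density) can be checked numerically. The following is a minimal sketch, not the code used for the simulations: it assumes 100 agents and q = 6 (so that p = 1/q ≈ 0.17), and it takes the clustering coefficient to be the average local clustering, which may differ from the definition used elsewhere in this thesis. All function names are illustrative.

```python
import random
from itertools import combinations

def random_graph(n, p, rng):
    """Erdos-Renyi random graph G(n, p) stored as a dict of neighbour sets."""
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def link_density(adj):
    """Fraction of all possible links that are present."""
    n = len(adj)
    links = sum(len(nb) for nb in adj.values()) / 2
    return links / (n * (n - 1) / 2)

def clustering_coefficient(adj):
    """Average local clustering over nodes with degree >= 2 (assumed definition)."""
    values = []
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        closed = sum(1 for u, v in combinations(nb, 2) if v in adj[u])
        values.append(closed / (k * (k - 1) / 2))
    return sum(values) / len(values)

rng = random.Random(0)   # fixed seed, for reproducibility only
q = 6                    # assumed number of traits per feature
adj = random_graph(100, 1 / q, rng)
L = link_density(adj)
C = clustering_coefficient(adj)
print(f"L = {L:.3f}, C = {C:.3f}")
```

For a graph of this size both numbers should lie close to p, so a layer with C much larger than L, as observed for ω = 0.68, cannot be described as an RG.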
5.4.3 Discussion
To understand the effect of network dynamics, assume first that M = 1. When network assortativity
at t = 0 was introduced, it was observed that more cultural convergence took place for every
ω. This happens because there are fewer cultural zones at t = 0 and therefore it is likely that
there are fewer cultural regions in the end. With the updating rules nothing changes at t = 0 and
one may be inclined to think that the updating mechanism would reduce the number of cultural
zones as time progresses, so that there will be more cultural convergence. However, the updating
mechanism does not just add links in desirable places; it also removes links between nodes that
could have interacted later on, reducing the potential to dissolve boundaries at a later time. In
some cases, then, the updating mechanism reduces T , thereby promoting cultural diversity.
In terms of cultural zones, the updating mechanism adds two ways of dissolving boundaries,
namely by creating links between zones within different cultural components and by doing the
same within cultural components. The first way is not robust, since no further interaction will
likely take place because of the small cultural overlap, while the second is robust. However, there
is the possibility that such boundaries would have dissolved anyway, over time. In addition, the
updating mechanism diminishes both conventional ways of dissolving boundaries by letting the
system settle down faster. Indeed, the results shown earlier indicate that for M = 1 there is less
cultural convergence for all ω, compared to the previous treatment.
Now, when there is multiplexity, the same still holds; for M = 2 it was observed in the dynamical
results that the number of cultural zones first increases before it starts to decrease. Indeed, this
supports the notion that links between agents that are only mildly similar are quickly severed, so
that cultural zones shrink. However, the overall effect presumably is somewhat smaller, since the
overlap within cultural components is relatively robust to the mechanism outlined in the previous
paragraph. For example, if a pair of agents (i, j) only has a link in one layer, hence can interact
and grow closer only in that layer, then the existence of a third node that may indirectly link i
and j is only facilitated by the updating mechanism; moreover, a link may appear between the two
nodes by chance. This is also apparent in the end-state results, which showed that the decrease in
cultural convergence at ω = 0.68 is less pronounced for most of the multiplex cases.
As long as there is sufficient correlation between the subcultural distance and the overall
cultural distance and there are multiple layers, the negative effect of the updating mechanism is
somewhat counterbalanced by the positive overlap effect. If there are too many layers, however, the
network dynamics may counteract the ordinary dynamics, since there is little correlation between
the subcultural distance and the overall cultural distance; the probability to interact will almost
be independent of the probability to form a link, so little convergence should be expected. This is
exactly what is seen in the case M = 18.
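The loss of correlation between subcultural and overall cultural distance for large M can be illustrated with random initial cultures. The sketch below is an illustration under stated assumptions, not the simulation code of this thesis: it assumes 36 nominal features with q = 6 traits each, split evenly over M layers, and it measures distance as the fraction of differing features. For independent features the expected correlation is 1/√M.

```python
import random
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def distance_correlation(M, n_agents=100, n_features=36, q=6, seed=0):
    """Correlation between the layer-1 subcultural distance and the overall distance.

    n_features, q and the distance definition are assumptions of this sketch.
    """
    rng = random.Random(seed)
    cultures = [[rng.randrange(q) for _ in range(n_features)]
                for _ in range(n_agents)]
    per_layer = n_features // M          # features assigned to each layer
    sub, total = [], []
    for a, b in combinations(cultures, 2):
        diff = [x != y for x, y in zip(a, b)]
        sub.append(sum(diff[:per_layer]) / per_layer)   # distance in layer 1 only
        total.append(sum(diff) / n_features)            # overall cultural distance
    return pearson(sub, total)

corr = {M: distance_correlation(M) for M in (1, 2, 9, 36)}
for M, c in corr.items():
    print(f"M = {M:2d}: correlation = {c:.2f}, 1/sqrt(M) = {M ** -0.5:.2f}")
```

Under these assumptions the correlation drops from 1 for the singleplex to roughly 0.17 for M = 36, which is in line with the argument that link formation and interaction become nearly independent when there are many layers.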
In terms of the layer networks, it was observed that the networks became correlated at an ω
for which the system did not converge fully. This makes sense, since in this case ND(ω) is large at
t = 0. Therefore, the layer networks will have some time to adjust to this clustered structure, so
that as the cultural zones start to become similar in both layers, the networks will become similar
too. This does not happen when there are only a few cultural components at t = 0. It also
explains the large and almost equal Ci for both layers since the links will be present mainly within
the cultural components, which means that links between agents cluster within these components.
Both explanations rely heavily on the presence of a large correlation between subcultural distance
and overall cultural distance, so they only hold for small M .
5.5 Treatment 4: structured culture
This treatment is based on the structured initial culture, explained in Subsubsection 4.1.3. It is
different from the others not only because the initial culture is such an important determinant for
the end-state behavior, but also because this treatment aims to induce competition in the multiplex
Axelrod dynamics. In the previous subsections overlap between layers was used as a mechanism
within cultural components that increased cultural convergence, by speeding up the dissolution of
boundaries between cultural zones. When mixing is present, there can be competition between
layers since in one layer an agent may be close to some other agent, while in the other layer they
are culturally dissimilar. It then becomes interesting to see whether the two agents will become
culturally identical and what consequences such a dynamic will have for the other agents in the
cluster; in randomly generated cultures such large differences are extremely unlikely.
In this treatment only the cases M = 1 and M = 2 are studied. The emphasis will be on
the value of the mixing coefficient µ defined in 4.1.3, since this parameter indicates the effects of
multiplexity given a fixed M : when µ = 0 the effect of multiplexity is small since the subcultural
distance between agents is similar across layers, while the opposite is the case for many pairs of
agents if µ = 1. First, it is found that a third phase is created which becomes less stable when
µ increases and the second phase transition shifts to the left, while becoming steeper. Second,
the effect of µ on cultural convergence depends heavily on the confidence threshold. Third, the
dynamics sometimes show non-equilibrium behavior at the second phase transition: one
layer reaches its end-state while the other layer remains active, causing the first layer to become
active again, which changes the end-state.
5.5.1 End-state results
In Figures 11 and 12 the end-state results are shown. It should be noted that in this subsection
we use both 〈SmaxD 〉 and 〈ND〉 as observables since SmaxD distinguishes better between the phases
in this case. (For the other treatments, there was no real difference between the two.)
Starting with M = 1, there are three phases depending on the value of ω. The first is full
cultural divergence and the last is full cultural convergence, which are the same as before. In
between these two is a new phase, however, which consists of two cultural domains of size 50
(〈ND〉 = 2 and 〈SmaxD 〉 = 50). This makes sense since by careful choice of parameters in 4.1.3 there
exist ω’s such that the two shells are generated by the prototypes with a large cultural distance
between them so that two distinct cultural components form for some values of ω at t = 0. Note
that the phase transition between the second and third phase is not so steep, which agrees with
earlier results on realistic initial cultures (see Subsubsection 2.2.2).
These observations are similar for the case M = 2 and µ = 0. Since the mixing coefficient
µ is such that the subshells overlap perfectly across the two layers, the global structure of the
subcultures is the same in both layers. It still holds that within the two shells the cultural
subvectors and network connections are different across layers, so that the original effects of having
multiplexity, as discussed in the other treatments, are present within each shell. However, due to
the fact that the cultural vectors are much closer to each other at t = 0 (an average overlap of
1/q + (1− 1/q)b2 = 0.67 instead of 1/q = 0.17), there is much less cultural diversity compared to
the random culture, so that dynamics will be less interesting within the shells. This also agrees
with the observation that 〈T 〉 is smaller than for similar M in treatment 3 for all ω.
This starts to change when µ increases from 0 to 1. In the figures, results are shown for
µ = 0, 0.2, 0.4, 0.6, 0.8 and 1.0 (which corresponds to 50, 45, 40, 35, 30 and 25 agents for each
non-mixed shell and 0, 5, 10, 15, 20 and 25 for each mixed shell respectively). It is clear that
increasing µ has an effect that is highly dependent on ω; for ω < 0.40 there is more
convergence in the non-mixed case, while for ω > 0.40 the opposite is true. This is a consequence of
the fact that the second phase shifts to the left relative to the µ = 0 case, becomes less pronounced
(i.e. it exists for fewer ω) and changes quantitatively since the new phase now consists of more
clusters of smaller size. As a result the phase transition from the second to the third phase
becomes somewhat steeper as µ increases. In summary, it depends highly on the confidence level
of the system whether multiplexity increases cultural convergence; for low confidence multiplexity
increases cultural convergence, while for high confidence there is less cultural convergence.
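The shell sizes quoted above follow a simple linear rule in µ. The sketch below reproduces them under the assumption, read off from the listed values rather than taken from the definition in 4.1.3, that each prototype has 50 agents and that each mixed shell contains a fraction µ/2 of them. The function name is illustrative.

```python
def shell_sizes(mu, agents_per_prototype=50):
    """Return (non_mixed, mixed) shell sizes for mixing coefficient mu.

    Assumed rule: each mixed shell holds mu/2 of the agents of a prototype.
    """
    mixed = round(mu * agents_per_prototype / 2)
    return agents_per_prototype - mixed, mixed

sizes = {mu: shell_sizes(mu) for mu in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)}
for mu, (non_mixed, mixed) in sizes.items():
    print(f"mu = {mu:.1f}: non-mixed shell = {non_mixed}, mixed shell = {mixed}")
```

For µ = 1 all four shells have size 25, which is the fully mixed case discussed in the remainder of this subsection.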
[Figure 11 panels: end-state observables (scaled to the unit interval) versus ω between 0.1 and 0.9; legend: 〈SmaxD〉, 〈ND〉, 〈CSED〉, SEmaxD.]
Figure 11: Average cluster size entropy 〈CSED〉, average number of domains 〈ND〉, average size
of the largest domain 〈SmaxD〉 and associated standard deviation SEmaxD for M = 1 (left) and
(M,µ) = (2, 1) (right) in the end-state as a function of ω (treatment 4)
[Figure 12 panels: 〈SmaxD〉 versus ω (left) and 〈T〉 versus ω (right), for ω between 0.1 and 0.9; legend: µ = 0, 0.2, 0.4, 0.6, 0.8, 1.]
Figure 12: Average size of the largest domain 〈SmaxD〉 (left) and average number of time steps 〈T 〉
(right) in the end-state as a function of ω with M = 2 for multiple values of µ (treatment 4)
When M = 2 and µ = 1, the new phase consists of four clusters of size 25. This makes sense,
since each cluster corresponds to one of four shells of size 25, each generated by one of the four
prototypes. The cultural distance between mixed shells and non-mixed shells is half the distance
of two non-mixed or mixed shells on average, so the clusters are culturally dissimilar enough for
fewer values of ω. In addition, the fact that in each layer the subshells are grouped in two pairs of
two in terms of subprototypes means that there is large connectivity in both layers between each
mixed and each non-mixed shell (although the combination of mixed and non-mixed is different
in each layer), so that if just one pair of agents between them is culturally similar enough, it is
likely that they have a link in one of the layers and can thus interact.
Comparing M = 1 with M = 2 and µ = 1 it is noteworthy that even though the values of
SEmaxD are similar during the second phase transition, SEmaxD > 0 during much of the new phase,
indicating that this phase is unstable. This can also be seen from 〈T 〉; looking at the graph it is
observed that when µ increases, a new peak forms in the middle of the second phase transition
that becomes the global maximum. Furthermore, the fact that 〈CSED〉 > 0 during the new phase
means that there is no ω such that in (almost) every run the end-state distribution is four clusters
of size 25, which lends further support to the instability of this phase.
Finally, note that 〈CSED〉 is positive at the second phase transition for M = 2 and µ = 1,
compared to a zero value for the singleplex. This makes sense, since in the latter case the end-
state will consist of either two clusters of 50 or one cluster of 100, both of which have a CSE of 0.
Importantly, such a situation does not seem to occur when mixing is present.
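The statement that two clusters of 50 and one cluster of 100 both have zero CSE can be verified directly. The sketch below assumes the unnormalized form CSE = −Σ_s W_s ln W_s, with W_s the fraction of clusters that have size s; the normalization used for the figures may differ, so only the distinction between zero and positive values should be read from it.

```python
from collections import Counter
from math import log

def cluster_size_entropy(cluster_sizes):
    """Entropy of the cluster size distribution (assumed unnormalized form).

    W_s is the fraction of clusters of size s; the sum of W_s * ln(1 / W_s)
    vanishes whenever all clusters have the same size.
    """
    counts = Counter(cluster_sizes)
    total = len(cluster_sizes)
    return sum((c / total) * log(total / c) for c in counts.values())

for state in ([50, 50], [100], [25, 25, 25, 25], [50, 25, 25], [75, 25]):
    print(state, round(cluster_size_entropy(state), 3))
```

Any end-state in which all clusters are equally large has zero entropy, so a positive 〈CSED〉 signals that at least some runs end with clusters of unequal size.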
5.5.2 Dynamical results
In Figures 13 and 14 the dynamical results are shown. The confidence threshold ω = 0.40 has no
particular significance for the dynamical behavior, besides the fact that it is part of the second
phase transition. The reason for choosing it is that by chance this run has produced one of the
best examples of non-monotonicity.
This non-monotonicity corresponds to the most important observation, namely that D1 first seems
to converge to a state where it has two clusters of size 50, since it remains at this level for quite
some time (i.e. the layer freezes). However, at some point the layer unfreezes and goes back to
an almost fully culturally divergent state, before it again starts to converge and this time it ends up
having only one cluster of size 100. (Note that the left plot in Figure 14 shows the size of the
largest cluster as complementary information to the standard cultural observables in Figure 13.)
The cultural regions in layer 2, D2, do not show such behavior: ND2 remains constant for some
time in the beginning after which it starts to decrease, although there is some non-monotonicity
before settling in the culturally convergent state. The VI(D1, D2) shows that there is an almost
constant mismatch between D1 and D2.
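To make the notion of mismatch concrete, the variation of information between two partitions of the same set of agents can be computed as VI(A, B) = 2H(A, B) − H(A) − H(B). The following is a minimal sketch; the division by ln n, which maps the result to the unit interval, is an assumption about how the plotted values are scaled, and the function names are illustrative.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a labelling of the agents."""
    n = len(labels)
    return sum((c / n) * log(n / c) for c in Counter(labels).values())

def variation_of_information(labels_a, labels_b):
    """VI(A, B) = H(A|B) + H(B|A) = 2 H(A, B) - H(A) - H(B)."""
    joint = list(zip(labels_a, labels_b))
    return 2 * entropy(joint) - entropy(labels_a) - entropy(labels_b)

def normalized_vi(labels_a, labels_b):
    """VI divided by its upper bound ln(n) (assumed normalization)."""
    return variation_of_information(labels_a, labels_b) / log(len(labels_a))

# Layer 1 frozen in two regions of 50 agents, layer 2 fully converged.
two_regions = [0] * 50 + [1] * 50
one_region = [0] * 100
vi_mismatch = variation_of_information(two_regions, one_region)
vi_identical = variation_of_information(two_regions, two_regions)
print(vi_mismatch, normalized_vi(two_regions, one_region), vi_identical)
```

In this example the mismatch equals ln 2, the entropy of the two-region partition, while identical partitions give zero.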
[Figure 13 panel: cultural observables (scaled to the unit interval) versus t (×100).]
Figure 13: Variation of information between cultural regions VI(D1, D2), weighted variation of
information between cultural zones WVI(D1(ω), D2(ω)), number of cultural domains ND, cultural
regions (1) ND1, cultural regions (2) ND2, cultural components ND(ω), cultural zones (1) ND1(ω)
and cultural zones (2) ND2(ω) on a typical run for (M,µ, ω) = (2, 1, 0.40) as a function of t
(treatment 4)
[Figure 14 panels: sizes of the largest clusters versus t (×100) (left) and network observables versus t (×100) (right).]
Figure 14: Size of the largest subcultural component (1) SmaxD1(ω), subcultural component (2) SmaxD2(ω),
cultural domain SmaxD, cultural region (1) SmaxD1 and cultural region (2) SmaxD2 (left) and network
correlation ρ, link density (1) L1, link density (2) L2, clustering coefficient (1) C1, clustering
coefficient (2) C2, kurtosis of the degree distribution (1) κ1 and kurtosis of the degree distribution
(2) κ2 (both divided by 10) (right) on a typical run for (M,µ, ω) = (2, 1, 0.40) as a function of t (treatment 4)
In addition, ND1(ω), ND2(ω) and ND(ω) are either 1 or 2, except close to t = 0. In the first half
of the run it is often the case that ND1(ω) = 1 and ND2(ω) = 2 or vice versa, judging from the
constant surges in WV I(D1(ω), D2(ω)). Furthermore, in the left part of Figure 14 sizes of the
largest subcultural components underpin the asymmetry mentioned in the previous paragraph.
Both start out with two clusters of size 50, but in the beginning SmaxD2(ω) jumps to 1, while this only
happens much later for SmaxD1(ω). Interestingly, this starts happening around the same time as layer
1 unfreezes; the fact that the vertical red line is thick means that SmaxD1(ω) jumps quickly between
0.5 and 1, foretelling a complete turnaround in the dynamics.
The network dynamics are somewhat less interesting. First off, the network becomes fully
connected at the end of the run, as was seen before in the case of full cultural convergence.
Similarly, in the second half of the run the properties indicate that the networks resemble
RGs at every time t, that there remains zero correlation until right before the end, and that κi
behaves regularly for both i.
In the first half of the run, things are different, especially for the first layer. ρ becomes
significantly larger than 0.5 and L1 ≠ C1; this agrees with the observation that during that time
layer 1 mainly consists of 2 cultural regions, which should have C > L. Finally, the layer networks
are initially similar, but further away from an RG and have L and C much larger than in the case
of a random culture. These features at t = 0 can readily be explained by the structured culture
algorithm. Within the subshells the average overlap between subvectors is 0.67, while the average
overlap between subvectors from different shells is O = 1/q = 0.17, so that the average subcultural
overlap is (0.67 + 0.17)/2 = 0.42, which roughly agrees with L at t = 0. It also makes sense that
C is higher than L, since the algorithm clearly clusters the agents in two subshells.
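The arithmetic behind the estimate of L at t = 0 can be spelled out as follows. This is a worked check rather than part of the model code: q = 6 is assumed because it reproduces 1/q = 0.17, and the value 0.6 for the factor printed as b2 in Subsection 5.5.1 is inferred here from the quoted 0.67 and is not taken from the definition in 4.1.3. The quantities 0.67 and 0.17 are read as average overlaps between subvectors, and the equal weighting reflects that the two subshells in a layer are equally large.

```python
q = 6      # assumed number of traits per feature (gives 1/q = 0.17)
b2 = 0.6   # assumed value of the factor printed as "b2" (reproduces 0.67)

overlap_between = 1 / q                     # subvectors from different subshells
overlap_within = 1 / q + (1 - 1 / q) * b2   # subvectors from the same subshell
mean_overlap = (overlap_within + overlap_between) / 2

print(f"between subshells: {overlap_between:.2f}")
print(f"within a subshell: {overlap_within:.2f}")
print(f"average (estimate of L at t = 0): {mean_overlap:.2f}")
```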
In terms of network correlation the fully mixed case is furthest from the trivial multiplex discussed
in Subsection 5.1. In the latter case there is perfect correlation between the networks, while in the
former there is no correlation in total (which is similar to the other treatments) and there are huge
differences in correlation across pairs of nodes. For example, for all pairs of nodes within non-mixed
shells there is a relatively large correlation, while there is anti-correlation in the case of mixed
shells. In addition, for links between shells of similar characteristic (i.e. mixed or non-mixed),
there is large correlation, while the opposite is true for links between mixed and non-mixed shells.
Therefore, the fully mixed case can be viewed as having the largest amount of multiplexity of all
cases in all treatments.
5.5.3 Discussion
Starting with M = 1, it was observed in the end-state results that there is a new phase and therefore
also a new phase transition. The first phase transition will be when ω becomes large enough so
that some of the agents around each prototype start to interact. In the second phase, the dynamics
will in effect be two separate Axelrod models on separate shells that both converge culturally. The
second phase transition will occur when ω becomes so large that some cultural vectors in the two
shells start to interact and the two Axelrod models effectively couple.
The broadness of the second phase transition has to do with the fact that both prototypes
are in some sense attractors for their respective shells, so that there is a tendency for agents to
converge in a particular direction. Concretely, this means that even if there is some interaction
opportunity between agents from different shells (i.e. there is only one cultural zone), there is a
lower probability that this interaction opportunity will be useful, since one of the agents is likely
to have moved in the opposite direction before any interaction can take place. In turn this means
that many such opportunities have to exist (i.e. ω has to be high) before the two will converge with
high probability; for a range of intermediate ω’s both scenarios could occur with the probability
for both changing in favor of full convergence as ω increases, as demonstrated by an increasing
〈SmaxD 〉.
Note that there will always be some links between the two shells at t = 0, although for
small ω the probability that both dij < ω and aij = 1 is small. The updating mechanism of the
network may help linking the closest agents within each shell at any time t, so that this supports
convergence of the two shells. In addition, by assortativity there is a correlation between having
a link and being culturally similar enough to interact (i.e. there typically is one cultural zone
per cultural region for most ω). This effectively ensures that as soon as agents become culturally
similar enough they can start interacting, diminishing the importance of the network as in the
previous treatment.
For M = 2 and µ = 1, the situation is more complicated as was observed in the simulations.
There are now four shells, consisting of the various combinations of subshells; the first phase
transition is the same as before, except that the 100 agents are now distributed equally among the
four shells, so that, during this transition, they start to converge to four cultural regions of size 25.
The second phase transition then starts when interaction between the subshells begins to
occur. Recall that at t = 0 the cultural distance between Gij and Gkl is larger (twice as large on
average) if i ≠ k and j ≠ l compared to the other cases. Clearly, if agents can interact over this
larger cultural distance, full cultural convergence will surely occur. Therefore, this will not happen
at the second phase transition. This leaves four pairs of shells that could start to interact in one of
the layers, namely (G11, G12) and (G22, G21) in layer 1 and (G22, G12) and (G11, G21) in layer 2.
(Although their overall cultural similarity is the same across layers, their connectivity is not, since
e.g. G^1_{11} and G^1_{12} together comprise G^1_1, so the corresponding cultural subvectors are very similar.)
Any of these pairs is a priori equally likely (i.e. before the culture is generated) and although
it might be the case that such a pair starts interacting in the other layer, this will happen only
with small probability. Suppose that such a pair begins to interact at t = 0, then after some time
the cultural vectors will have become more similar in that layer so the probability of interaction
becomes larger in the other layer (there are always some cultural subvectors with positive overlap
and therefore a relatively large probability of having a link), so that in the end the two shells will
become culturally identical.
The argument in the previous paragraph concerned two shells in isolation. Typically, if only
one of the pairs starts interacting at t = 0, then the end-state will consist of two cultural domains
of size 25 and one of size 50. If two pairs in the same layer start to interact, the system most likely
will end up with two cultural domains of size 50. By similar arguments, if two pairs in different
layers start to interact, the result will be one domain of size 75 and one of size 25. If more than
two pairs start interacting, full convergence (i.e. one domain of size 100) is almost assured. Note
that two of these four combinations imply positive CSED so this explains some of the observations
in the end-state results. (Also, when running the model a few times for several values of ω in the
second phase transition, it was seen that the system always ended up in one of these four states,
in addition to the unstable phase with four domains of size 25.) The second phase transition,
therefore, constitutes a prolonged regime (in terms of ω) where the system can end up in a wide
variety of states. In a sense, this is symmetry breaking.
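The bookkeeping in this argument can be reproduced by treating the four shells as nodes and every interacting pair as a merger of two shells. The sketch below is an idealization of the reasoning above, not of the actual dynamics: it assumes that an interacting pair always ends up culturally identical and that non-interacting shells stay separate. The shell labels follow the text; the function name is illustrative.

```python
SHELLS = ("G11", "G12", "G21", "G22")

# Pairs of shells that can start to interact, with the layer in which they do so.
LAYER_1_PAIRS = [("G11", "G12"), ("G22", "G21")]
LAYER_2_PAIRS = [("G22", "G12"), ("G11", "G21")]

def domain_sizes(active_pairs, shell_size=25):
    """Sizes of the end-state domains when the given pairs of shells merge."""
    parent = {s: s for s in SHELLS}

    def find(s):
        # Follow parent pointers to the representative shell of a domain.
        while parent[s] != s:
            s = parent[s]
        return s

    for a, b in active_pairs:
        parent[find(a)] = find(b)   # merge the two domains

    sizes = {}
    for s in SHELLS:
        root = find(s)
        sizes[root] = sizes.get(root, 0) + shell_size
    return sorted(sizes.values(), reverse=True)

one_pair = domain_sizes(LAYER_1_PAIRS[:1])
same_layer = domain_sizes(LAYER_1_PAIRS)
different_layers = domain_sizes([LAYER_1_PAIRS[0], LAYER_2_PAIRS[0]])
three_pairs = domain_sizes(LAYER_1_PAIRS + LAYER_2_PAIRS[:1])
print(one_pair, same_layer, different_layers, three_pairs)
```

Two pairs in the same layer never share a shell, whereas a pair from layer 1 and a pair from layer 2 always do, which is why the former yields two domains of size 50 and the latter one domain of size 75 and one of size 25.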
When three pairs of shells interact, interesting behavior can occur. Note that in such a
situation two pairs in one layer (layer 1, say) interact and one pair in the other (layer 2). Typically,
as a result of the dynamics in layer 1 the two subshells in layer 2 will also grow closer together
over time, since the shells that correspond to the same subshell in one layer correspond to two
different subshells in the other. If at t = 0 one of the pairs in layer 2 starts to interact, this means
that after some time layer 2 will move towards one cultural region of size 100, since there will
be interaction between all four shells in that layer (although the timing is different). However,
before this happens layer 1 may have already reached a frozen state with two cultural regions of
size 50. In addition, this convergence typically also implies that the two subshells from which the
cultural regions originate grow further apart, as measured by dmin (see Subsubsection 4.1.3), which
hampers the interaction between the original pair that started interacting in layer 2. Essentially,
this constitutes competition between the layers. In the end, layer 2 will reach full convergence,
which will cause layer 1 to unfreeze and converge fully as well.
This is exactly the dynamical behavior that was observed in the previous subsubsection. The
first layer is seen to converge to the state with two cultural regions of size 50, while increasing
cultural similarity between the shells, which leads to a single subcultural component in layer 2.
As the second layer moves towards a state with two cultural regions of size 25 and one of size
50, severe competition results in a single subcultural component in layer 1, diminished spikes in
WVI(D1(ω), D2(ω)) (spikes mostly indicate in this case that ND1(ω) = 2 which corresponds to
freezing since ND1 = 2, while ND2(ω) = 1) and finally in the unfreezing of layer 1.
Note that if all four pairs started to interact in the beginning, there would be no asymmetry
between the layers and presumably no severe non-monotonicity since one layer would not have the
time to converge to a partially convergent state before it would unfreeze again.
In terms of the behavior observed in the dynamical results, the Axelrod dynamics can no longer
really be seen as a relaxation process. Instead, the process first moves towards
equilibrium, after which it moves away again. In this case the system settles down in the end, but in more
general cases with many layers and various degrees of mixing between all layers, the system may
never really settle down, which also agrees with observations in the real world. In summary, this
shows that multiplexity induces nonequilibrium behavior.
6 Conclusion
In this thesis a model of cultural dynamics was studied that incorporates many realistic features. It
was investigated what the effect of social multiplexity is on cultural dynamics, using a generalized
version of the Axelrod model, in addition to the effect of cultural evolution on social networks.
The effect of multiplexity differed somewhat for the different treatments, but in general multiplexity
promoted cultural convergence. In a sense, then, the fact that there are more links in total
in all the layers results in more cultural convergence. Of course, multiplex links are not the same
as singleplex links, since the latter result from compressing all the layers into one. Another general
feature of multiplexity that was observed is a form of competition between layers, resulting from
their coupling. In terms of the layer networks, in most cases the end result was something that
resembles an RG, so that they did not give a good description of social networks.
In the first treatment, multiplexity had a positive effect on cultural convergence, while in the
next treatment, more complex relationships were observed due to network assortativity. When network
updating was present in the third treatment, cultural convergence was reduced and interesting
network dynamics were observed.
In the case of structured initial cultures more diverse behavior was shown with an additional
phase between full cultural divergence and full cultural convergence. This phase turned out to be
unstable when enough mixing was present. In the second phase transition nonequilibrium behavior
in the dynamics was shown to result from competition between the layers.
One interesting feature that was not discussed in this thesis is the effect of having a more
complicated relationship between features and layers. For example, a cultural feature could be
associated with multiple layers. This will be left for future work. In addition, the presence of ordinal
cultural features would be a realistic extension of the current work. Finally, one may consider a
different function f for generating networks from the initial culture and for updating them over
time; perhaps it is possible to choose f such that the resulting networks are more realistic given
an initial culture.
Multiplex, adaptive complex systems present one of the greatest challenges to the complex
systems paradigm. This thesis shows what astonishing behavior can be exhibited by such high-
dimensional objects. Looking at the non-equilibrium behavior observed for structured cultures
with mixing, one may only wonder about the range of possibilities for real cultural systems.
References
[Albert and Barabasi, 2002] Albert, R. and Barabasi, A. (2002). Statistical mechanics of complex
networks. Reviews of Modern Physics, 74:47–97.
[Axelrod, 1997] Axelrod, R. (1997). The dissemination of culture: A model with local convergence
and global polarization. Journal of Conflict Resolution, 41(2):203–226.
[Barrat et al., 2008] Barrat, A., Barthelemy, M., and Vespignani, A. (2008). Dynamical processes
on complex networks. Cambridge University Press, Cambridge.
[Boccaletti et al., 2014] Boccaletti, S., Bianconi, G., Criado, R., Del Genio, C., Gomez-Gardenes,
J., Romance, M., Sendina-Nadal, I., Wang, Z., and Zanin, M. (2014). The structure and dy-
namics of multilayer networks. Physics Reports, 544(1):1–122.
[Buldyrev et al., 2010] Buldyrev, S., Parshani, R., Paul, G., Stanley, H., and Havlin, S. (2010).
Catastrophic cascade of failures in interdependent networks. Nature, 464:1025–1028.
[Castellano et al., 2009] Castellano, C., Fortunato, S., and Loreto, V. (2009). Statistical physics
of social dynamics. Reviews of Modern Physics, 81:591.
[Castellano et al., 2000] Castellano, C., Marsili, M., and Vespignani, A. (2000). Nonequilibrium
phase transition in a model for social influence. Physical Review Letters, 85(16):3536.
[Centola et al., 2007] Centola, D., Gonzalez-Avella, J., Eguiluz, V., and San Miguel, M. (2007).
Homophily, cultural drift, and the co-evolution of cultural groups. Journal of Conflict Resolution,
51(6):905–929.
[Clifford and Sudbury, 1973] Clifford, P. and Sudbury, A. (1973). A model for spatial conflict.
Biometrika, 60(3):581–588.
[Cozzo et al., 2013] Cozzo, E., Banos, R., Meloni, S., and Moreno, Y. (2013). Contact-based
social contagion in multiplex networks. Physical Review E, 88:050801(R).
[De Sanctis and Galla, 2009] De Sanctis, L. and Galla, T. (2009). Effects of noise and confidence
thresholds in nominal and metric Axelrod dynamics of social influence. Physical Review E,
79:046108.
[Deffuant et al., 2000] Deffuant, G., Neau, D., Amblard, F., and Weisbuch, G. (2000). Mixing
beliefs among interacting agents. Advances in Complex Systems, 3(4):87.
[Flache and Macy, 2008] Flache, A. and Macy, M. (2008). Local convergence and global diversity:
The robustness of cultural homophily. arXiv:0808.2710.
[Gandica et al., 2011] Gandica, Y., Charmell, A., Villegas-Febres, J., and Bonalde, I. (2011). Cluster
size entropy in the Axelrod model of social influence: small-world networks and mass media.
Physical Review E, 84:046109.
[Gonzalez-Avella et al., 2005] Gonzalez-Avella, J., Cosenza, M., and Tucci, K. (2005). Nonequilibrium
transition induced by mass media in a model for social influence. Physical Review E,
72(6):065102.
[Guerra et al., 2010] Guerra, B., Poncela, J., Gomez-Gardenes, J., Latora, V., and Moreno, Y.
(2010). Dynamical organization towards consensus in the Axelrod model on complex networks.
Physical Review E, 81:056105.
[Huang and Liu, 2010] Huang, L. and Liu, J. (2010). Characterizing multiplex social dynamics
with autonomy oriented computing. Lecture Notes in Computer Science, 6329:277–287.
[Klemm et al., 2003a] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2003a). Global
culture: A noise induced transition in finite systems. Physical Review E, 67(4):045101(R).
[Klemm et al., 2003b] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2003b). Nonequilibrium
transitions in complex networks: A model of social interaction. Physical Review E,
67(2):026120.
[Klemm et al., 2003c] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2003c). Role of
dimensionality in Axelrod’s model for the dissemination of culture. Physica A, 327(1-2):1.
[Klemm et al., 2005] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2005). Globalization,
polarization and cultural drift. Journal of Economic Dynamics & Control, 29:321–334.
[Lanchier, 2012] Lanchier, N. (2012). The Axelrod model for the dissemination of culture revisited.
The Annals of Applied Probability, 22(2):860–880.
[McConnell, 2011] McConnell, A. (2011). The multiple self-aspects framework: self-concept
representation and its implications. Personality and Social Psychology Review, 15(1):3–27.
[Meila, 2003] Meila, M. (2003). Comparing clusterings by the variation of information. Learning
Theory and Kernel Machines, 2777:173–187.
[Palchykov et al., 2014] Palchykov, V., Kaski, K., and Kertesz, J. (2014). Transmission of cultural
traits in layered ego-centric networks. Condensed Matter Physics, 17(3):1–10.
[Pfau et al., 2013] Pfau, J., Kirley, M., and Kashima, Y. (2013). The co-evolution of cultures,
social network communities, and agent locations in an extension of Axelrod’s model of cultural
dissemination. Physica A, 392:381–391.
[Quattrociocchi et al., 2014] Quattrociocchi, W., Caldarelli, G., and Scala, A. (2014). Opinion
dynamics on interacting networks: media competition and social influence. Scientific Reports,
4(4938).
[Stivala et al., 2014] Stivala, A., Robins, G., Kashima, Y., and Kirley, M. (2014). Ultrametric
distribution of culture vectors in an extended Axelrod model of cultural dissemination. Scientific
Reports, 4(4870).
[Valori et al., 2011] Valori, L., Picciolo, F., Allansdottir, A., and Garlaschelli, D. (2011). Reconciling
long-term cultural diversity and short-term collective social behavior. Proceedings of the
National Academy of Sciences of the United States of America, 109(4):1068–1073.
[Vazquez and Redner, 2007] Vazquez, F. and Redner, S. (2007). Non-monotonicity and divergent
time scales in Axelrod model dynamics. Europhysics Letters, 78(1):18002.
[Vilone et al., 2002] Vilone, D., Vespignani, A., and Castellano, C. (2002). Ordering phase transition
in the one-dimensional Axelrod model. The European Physical Journal B, 30:399.