University of Amsterdam

MSc Theoretical Physics

Master Thesis

Social multiplexity in a generalized Axelrod model of cultural dissemination

Author:

Arjen Aerts

Supervisor:

dr. Diego Garlaschelli

April 8, 2015

Abstract

Social multiplexity is a ubiquitous feature of human social life. In this thesis it is investigated

what the effect of social multiplexity is on cultural dynamics in terms of cultural convergence,

using a generalized version of the Axelrod model which incorporates network multiplexity and

bounded confidence. This is mostly a computational study; where possible, analytical results are

established as well. First, in the end-state the effect of having multiple networks on a phase

transition, with the confidence threshold as control parameter, is studied for the following

scenarios: random graphs, network-culture assortativity, updating networks and initial realistic

cultures. Second, using the same scenarios, the model dynamics are explicitly analyzed for some

values of the threshold. Third, attention is paid to the effect of cultural evolution on the

underlying social networks in the presence of network updating. It is found that the effect of

multiplexity differs between treatments, but in most cases promotes cultural convergence. An

important mechanism is that local differences in connectivity between the layers lead indirectly

to more cultural convergence, while increasing the time to reach the end-state. When

assortativity is present, the effect of multiplexity becomes non-monotone. Network updating

reduces cultural convergence and induces network dynamics that strongly depend on the

confidence threshold. In the case of realistic initial cultures more diverse behavior is shown with

an additional phase between full cultural divergence and full cultural convergence. This phase

turns out to be unstable when multiplexity is present. Moreover, in the second phase transition

non-equilibrium behavior in the dynamics is shown to result from competition between the

layers. Finally, in most cases the layer networks did not resemble realistic social networks.

Contents

1 Introduction

2 Literature review

2.1 Original Axelrod model

2.2 Extensions of the Axelrod model

2.2.1 Networks

2.2.2 Other extensions

2.3 Social multiplex

3 The model

3.1 Network-culture assortativity

3.1.1 Properties of the network

3.2 Generalized Axelrod model

3.3 End-state

4 Simulation set-up

4.1 Routes of investigation

4.1.1 Clusterings of the culture: global and layer dependent

4.1.2 Random initial culture

4.1.3 Structured initial culture

4.2 End-state analysis

4.2.1 Cluster size entropy

4.3 Dynamical analysis

4.3.1 Network observables

4.3.2 Variation of information

5 Results and discussion

5.1 Treatment 0: trivial multiplex

5.1.1 End-state results

5.1.2 Discussion

5.2 Treatment 1: random culture and random, static networks

5.2.2 Dynamical results

5.2.3 Discussion

5.3 Treatment 2: network-culture assortativity

5.3.2 Discussion

5.4 Treatment 3: updating networks

5.4.3 Discussion

5.5 Treatment 4: structured culture

5.5.3 Discussion

6 Conclusion

1 Introduction

The existence of systems with multiple layers of networks interacting with each other can have a

profound impact on the dynamics of the system. Indeed, in 2003 a major power failure

hit Italy, which was particularly severe because the electricity network was coupled to the internet

communication network [Buldyrev et al., 2010]. Multiple interacting networks or, more specifically,

multiplices (i.e. networks that have different types of connections in different layers) are ubiquitous

features of complex systems and social systems are no exception [Boccaletti et al., 2014].

In social science, multiplexity is an important area of study, since interaction in social systems

is often within different (evolving) social environments. However, due to their complexity they have

not been studied extensively, apart from some recent examples [Quattrociocchi et al., 2014, Palchykov

et al., 2014]. On the other hand, much work has been done on studying the Axelrod model,

a stochastic cellular automaton based on principles from social science that describes cultural

dynamics (i.e. the evolution of culture), where culture is represented as a stylized object [Axelrod,

1997, Castellano et al., 2000, Klemm et al., 2003b].

This thesis will mainly investigate the effect of social multiplexity on cultural dynamics in

terms of cultural convergence, using a generalized version of the Axelrod model. In addition, some

attention will be paid to the effect of this cultural evolution on the underlying social networks. The

thesis uses an interdisciplinary statistical-physics approach to understand these aspects. Studying

cultural dynamics is important for understanding the causes of cultural diversity and collective

social behavior.

The generalized Axelrod model that is used here incorporates the presence of network

multiplexity and bounded confidence, in addition to optional features: network assortativity, updating

networks and empirically realistic initial cultures. It aims to model the interaction of social and

cultural dynamics, with an association between social networks and subcultures. Since it is

difficult to study the model analytically even in its simplest form, simulation will be used to study the

model. An advantage of this is that many different scenarios can be studied in a similar way. In

addition, analytical results will be established where possible.

Note that an example of social multiplexity has already been studied in the context of the

Axelrod model by coupling the Axelrod model with social resource sharing dynamics [Huang and

Liu, 2010]; this is a form of social multiplexity that does not imply having multiple networks. In

the current work, social multiplexity is not of this kind, because the multiplexity is part of the

Axelrod model and implies having multiple networks. When referring to social multiplexity, it is

always meant here that the multiplexity concerns multiple networks (or layers).

For different regions of the model’s parameter space, both the end-state and the dynamical

behavior are investigated in terms of the effect of having multiple layers. More specifically, the

effect of multiplexity on the phase transition (with the confidence threshold as control parameter)

is studied for the following scenarios: random graphs, network assortativity, updating networks

and initial realistic cultures. In addition, the effect of the Axelrod model on the structure of the

networks is also investigated when the networks update.

The rest of the thesis is outlined as follows. In Section 2 parts of the literature on the

Axelrod model are reviewed. Then, Section 3 will present the model, while we elaborate on the

simulation set-up in Section 4. The results and discussion are covered in Section 5. More

specifically, Subsection 5.1 finds that the generalized Axelrod model introduces a compartmentalization

that has some effect on its dynamical behavior. Subsection 5.2 shows that in the simplest case

multiplexity leads to more cultural convergence, but this originates from multiple effects, some of

which counteract cultural convergence. The most important mechanism is that local differences in

connectivity between the layers lead indirectly to more cultural convergence, while increasing the

time to reach the end-state. In addition, Subsection 5.3 finds that assortativity largely promotes

cultural convergence, but introduces non-monotonicity in the effect of multiplexity. Updating

networks, investigated in Subsection 5.4, reduce cultural convergence by decreasing opportunities for

interaction with culturally more distant individuals at later times, so that the system settles down

faster. Furthermore, for some values of the confidence threshold there is an indication of the

formation of realistic networks over time. Then, in Subsection 5.5 it is found that the extra phase

that is created when using the structured initial culture becomes less stable when multiplexity

increases. Also, the effect of multiplexity on cultural convergence depends heavily on the

confidence threshold. Moreover, for some parameter values the system sometimes shows non-monotonic

dynamical behavior, paving the way for systematic non-equilibrium dynamics in the generalized

Axelrod model. Finally, Section 6 concludes the thesis.

2 Literature review

The model that will be discussed in Section 3 builds on several modeling paradigms. In this section

their place in the literature is discussed and some notation will be introduced.

2.1 Original Axelrod model

Cultural dynamics should strictly be viewed as a (generalized) opinion model (i.e. a model that

describes the evolution of agents’ opinions over time). In the literature, however, a distinction is

often made between opinion dynamics and cultural dynamics, where the former concerns opinion

models that use scalar variables, while the latter treats opinion models that have a vector of such

variables [Castellano et al., 2009]. Prominent examples of (scalar) opinion models are the Voter

model [Clifford and Sudbury, 1973] and the Deffuant model [Deffuant et al., 2000], where the

former treats an opinion as a binary variable (similar to the Ising model) and the latter treats an

opinion as a continuous variable, taking values in the unit interval.

The most widely used framework for studying cultural dynamics is the Axelrod model [Axelrod,

1997]. It generalizes the voter model in two ways: first, it has more than one opinion variable (called

cultural features and the full set of cultural features is a cultural vector) and, secondly, it allows

more than two values per cultural feature (called cultural traits). The dynamics (which will be

outlined later in this subsection) are based on the assumption that similar (in terms of the cultural

vectors) individuals interact more (homophily) and interaction leads to higher similarity (social

influence), which are fundamental principles in social science. Much like the dynamics used for the

Ising model (e.g. Glauber, Metropolis), this constitutes a relaxation process, whereby the system

converges steadily towards an equilibrium.

In the original Axelrod model the most fundamental object is the cultural configuration (or

culture), which consists of elements that live in a cultural space. Formally, a cultural space is a

finite, discrete space (and, therefore, it is compact); an element from this space is called a cultural

vector $v = (v^1, \dots, v^F)$, where $F$ is the dimension of the space, $v^k$ is called a cultural feature and

$v^k \in \{1, \dots, q\}$ for $k = 1, \dots, F$ and $q$ is some positive integer (i.e. each feature has $q$ possible traits).

A cultural feature could be any cultural property, for example the agent’s religion or its taste for

wine. The state-space of the model (i.e. the culture) then consists of N actors (or agents), where

each agent $i$ is associated to a cultural vector $v_i$. Note that the number of possible states the

system can be in is $q^{FN}$, which is finite, but very large for even moderate values of the parameters.

Define on this space a metric (or cultural distance) between agents $i$ and $j$ by

$d_{ij} := d(v_i, v_j) := \sum_{k=1}^{F} d^k(v_i^k, v_j^k)/F$, where $d^k \in [0, 1] =: I$ is the cultural distance between two features (i.e. it is

normalized; infinite distances are not possible). Note that by definition it also holds that $d_{ij} \in I$.

In the original Axelrod paper [Axelrod, 1997], the variables were of nominal type, which means

that $d^k(v_i^k, v_j^k) = 0$ if $v_i^k = v_j^k$ and one otherwise. This is convenient for modeling purposes and

realistic if there is no order in the variables. In this case, the cultural space has no boundaries and

no center. However, if there is order it is more realistic and necessary for empirical research (i.e.

questionnaires) to use ordinal variables, so that the distance between $v_i^k$ and $v_j^k$ is a function of

$|v_i^k - v_j^k|/(q - 1)$.
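
As a minimal sketch of the distance just defined, the following Python functions compute the cultural distance for nominal and ordinal features. The names `feature_distance` and `cultural_distance` are illustrative and not taken from the thesis. Matching nominal traits are given distance zero, so that identical vectors are at distance zero, and the ordinal case assumes the simplest possible choice, namely the identity function of the normalized trait difference.

```python
def feature_distance(a, b, q, ordinal=False):
    """Distance d^k between two traits a, b in {1, ..., q}, normalized to [0, 1]."""
    if ordinal:
        # Assumption: the simplest ordinal choice, the identity of |a - b| / (q - 1).
        return abs(a - b) / (q - 1)
    # Nominal traits either match (distance 0) or differ (distance 1).
    return 0.0 if a == b else 1.0


def cultural_distance(v_i, v_j, q, ordinal=False):
    """Cultural distance d_ij: the feature distances averaged over all F features."""
    F = len(v_i)
    return sum(feature_distance(a, b, q, ordinal) for a, b in zip(v_i, v_j)) / F


v_i = [1, 3, 2, 5]
v_j = [1, 4, 2, 1]
print(cultural_distance(v_i, v_j, q=5))                # nominal: 2 of 4 features differ
print(cultural_distance(v_i, v_j, q=5, ordinal=True))  # ordinal: (0 + 0.25 + 0 + 1) / 4
```

The ordinal distance is smaller here because the second feature differs by only one step on the trait scale, whereas the nominal distance counts any difference as maximal.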

In the original Axelrod model [Axelrod, 1997], it is assumed that the agents are organized on

a two dimensional lattice. At every time step the following actions occur in the indicated order:

(i) An agent i is selected at random; (ii) A neighbor of i, j, is selected at random; (iii) The agents

interact with probability equal to their similarity $o_{ij} = 1 - d_{ij}$ (homophily); and (iv) Interaction

consists of $i$ copying a random feature $v_j^k$ of $j$ for $k$ such that $v_i^k \neq v_j^k$ (social influence).
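
The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions and not the code used in the thesis: traits are nominal, the lattice has open boundaries with up to four neighbors per site, and the names `axelrod_step` and `lattice_neighbors` are hypothetical.

```python
import random


def axelrod_step(culture, neighbors, rng=random):
    """Perform one update of the original Axelrod model; return True if a trait was copied."""
    i = rng.randrange(len(culture))                  # (i) select a random agent
    if not neighbors[i]:
        return False
    j = rng.choice(neighbors[i])                     # (ii) select a random neighbor of i
    differing = [k for k, (a, b) in enumerate(zip(culture[i], culture[j])) if a != b]
    overlap = 1 - len(differing) / len(culture[i])   # o_ij = 1 - d_ij for nominal traits
    if differing and rng.random() < overlap:         # (iii) interact with probability o_ij
        k = rng.choice(differing)                    # (iv) copy a feature on which they differ
        culture[i][k] = culture[j][k]
        return True
    return False


def lattice_neighbors(side):
    """Neighbor lists on a side x side square lattice with open boundaries (an assumption)."""
    nbrs = []
    for idx in range(side * side):
        r, c = divmod(idx, side)
        cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        nbrs.append([rr * side + cc for rr, cc in cand if 0 <= rr < side and 0 <= cc < side])
    return nbrs


rng = random.Random(0)
F, q, side = 3, 2, 4
culture = [[rng.randint(1, q) for _ in range(F)] for _ in range(side * side)]
neighbors = lattice_neighbors(side)
for _ in range(20000):
    axelrod_step(culture, neighbors, rng)
print(len({tuple(v) for v in culture}), "distinct cultural vectors remain")
```

Pairs with zero overlap never interact and identical pairs have nothing left to copy, which is why the dynamics eventually freeze.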

Note that the Axelrod model is a stochastic dynamical system (i.e. stochastic process) that

starts at time $t = 0$ from initial condition $(v_i(0))_{i \in \{1, \dots, N\}} =: v(0)$ and its dynamics are specified

by the updating mechanism. More specifically, it is a time-homogeneous Markov Chain that is

absorbing, which means that the system always ends up in an absorbing state, a state in which the

system will remain for all later time. The culture will converge to an absorbing state at an end-

time t = T in which every agent has the same cultural vector (full cultural convergence), to a state

with multiple clusters of identical agents with different cluster sizes (partial cultural convergence

or cultural divergence) or to a state with N singleton clusters (full cultural divergence). In the

original Axelrod model, $v(0)$ is generated by sampling each cultural feature of each cultural

vector from a discrete uniform distribution (taking values in $\{1, \dots, q\}$), which will be referred to as the

random culture. Although some attempts have been made to study the Axelrod model analytically,

using Markov process theory [Lanchier, 2012] or a master equation approach [Castellano et al.,

2000, Vazquez and Redner, 2007, Vilone et al., 2002], these typically rely heavily on approximations

and cannot capture the full extent of the model.

When analyzing the Axelrod model, one usually studies observables that condense the state-

space information. In the original paper [Axelrod, 1997], the author distinguished between cultural

regions, which are groups of adjacent nodes with identical features, and cultural zones, which are

groups of adjacent nodes that have positive overlap in their cultural features (i.e. the possibility

to interact). When talking about a clustering of the set of agents, we say that a condition has to

hold between a pair of nodes (e.g. a cultural distance of zero), but what is really meant is that two

nodes are part of the same cluster if there is a path between the nodes such that each consecutive

pair on this path satisfies this condition. Note that in the case of having a cultural distance of

zero, each pair in the corresponding cluster has this property because it is an equivalence relation,

but in general this is not the case.

It was Axelrod’s observation that cultural convergence/divergence depends on the initial state

of the system and whether the boundaries of the cultural zones dissolve before the process of

homophily and social influence (i.e. the dynamics) settles down. Boundaries between cultural

regions within cultural zones tend to dissolve as well, due to the randomness inherent in the updating

mechanism. The initial state (and therefore the dynamics between cultural zones) changes as a

result of parameter changes; Axelrod, therefore, used these concepts to explain cultural

convergence/divergence for different parameters. In particular he looked at q, F, the number of neighbors

on the lattice and N.

Applying this reasoning to the system, it is clear that when q is large there are more cultural

zones initially and thus more cultural regions in the end-state, hence more cultural diversity.

Similarly, when F is large there are fewer cultural zones initially, so more cultural convergence.

When the number of neighbors is large, there are also fewer cultural zones in the beginning, hence

more convergence. Finally, the effect of system size is most interesting. For small N , there are

not many different cultures to begin with, so the number of domains in the end-state is small.

However, for large N there is another effect that counteracts this one. If the system is large it

takes a long time for it to settle down. Therefore, there are more opportunities for the boundaries

(both regions and zones) to dissolve, resulting in fewer regions in the end-state. The effect of

varying N is therefore non-monotone.

Later, the model was studied extensively by the statistical physics community [Castellano

et al., 2009]. [Castellano et al., 2000] studied a (nonequilibrium) phase transition with q as a

control parameter for various values of F . Clearly, the two phases of such a one-dimensional phase

transition are the ordered state (full cultural convergence) and the disordered state (full cultural

divergence). It was observed that the phase transition is continuous for F = 2, but discontinuous

for larger F . In terms of dynamics, it was seen that the number of active links (i.e. pairs of agents

that have $0 < d_{ij} < 1$) showed non-monotonic behavior, increasing rapidly and then decreasing

again; this effect is especially pronounced around the phase transition. In general, studying such a

phase transition is interesting in itself but looking at the behavior for different values of a parameter

also makes it easy to identify the interesting parameter regions (i.e. around the critical value).

Another study [Guerra et al., 2010] showed that consensus is reached much faster for most single

cultural features than for the entire culture. Presumably, there is monotonicity in most of the

cultural features but not in the culture as a whole. A separation of time-scales occurs, with a few

bottleneck features.
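
The number of active links mentioned above can be tracked with a few lines of Python. This is a sketch assuming nominal traits and an edge list of index pairs; the name `count_active_links` is hypothetical. A link is active when the two agents are neither identical nor completely different, so the model has reached an absorbing state once the count is zero.

```python
def count_active_links(culture, edges):
    """Count links (i, j) whose endpoints satisfy 0 < d_ij < 1 (nominal traits assumed)."""
    active = 0
    for i, j in edges:
        F = len(culture[i])
        d = sum(a != b for a, b in zip(culture[i], culture[j])) / F
        if 0 < d < 1:
            active += 1
    return active


culture = [[1, 1], [1, 2], [3, 3], [3, 3]]
edges = [(0, 1), (1, 2), (2, 3)]
print(count_active_links(culture, edges))  # only the link (0, 1) is active
```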

2.2 Extensions of the Axelrod model

There have been extensions of the Axelrod model, incorporating cultural drift [Klemm et al.,

2003a, Klemm et al., 2005], mass media [Gonzalez-Avella et al., 2005] and the role of dimensionality

[Klemm et al., 2003c]. Cultural drift implies that there is a (small) probability that agents change

the values of their cultural features without social interaction; clearly, this mechanism destabilizes

culturally divergent states. Secondly, introducing mass media means there is a global field that

influences all agents simultaneously, so that the Axelrod model does not consist of local interactions

only. Finally, Axelrod suggested a 2D lattice as an interaction structure, having a geographical

space in mind [Axelrod, 1997], but when thinking of the interaction structure more as a social

space, it becomes interesting to consider the effect of dimensionality. Some other extensions will

be discussed in detail below, as they will be important for the model discussed in Section 3.

2.2.1 Networks

Since the state space consists of finitely many agents, any graph (or network) structure can be

imposed on it. Clearly, such a network then represents a social space (i.e. a social network), so

that only agents with social ties can interact. Formally, a graph is an ordered pair (V,E), where

V is the set of nodes {1, ..., N} and E is a set of edges or links, where each link is related to two

nodes (i.e. a link is an object that links two nodes). (Note that in this thesis only unweighted,

undirected graphs with no self-links are considered.) A network can also be represented by an

adjacency matrix $A$, which is an $N$ by $N$ matrix, with elements $a_{ij} := A[i, j] = 1$ if $i \neq j$ and they

share a link, but zero otherwise. Note that A is symmetrical and has a diagonal of zeros.

Before going further, some network measures are defined (most of the notation and definitions

in this paragraph and the next are from chapter 1 and 3 respectively in [Barrat et al., 2008]). Let

$k_i$ be the degree (i.e. number of links) of node $i$. First, the link density $L$ is defined as the total

number of links $E$ divided by the total possible number of links, namely $N(N-1)/2$. Secondly, the

degree distribution $P(k)$ of a network is defined as the probability that a randomly picked node

$i$ has degree $k$. Note that by definition the average degree is $\langle k \rangle = 2E/N$. Thirdly, the clustering

coefficient is defined as

$$C = \frac{1}{N} \sum_{i=1}^{N} \frac{\sum_{j,l} a_{ij} a_{jl} a_{li}}{k_i (k_i - 1)}.$$

For each node i the number of links between the neighbors of i is divided by the maximum possible

number of such links; the clustering coefficient is the average of this quantity over all nodes.

Finally, the number of connected components is the number of groups of nodes such that there is

no link between any of the groups; a measure of the connected components is the size of the giant

component (or largest connected component) G. When used in this thesis, network measures will

usually be normalized if this is not already true by definition (e.g. G is divided by N such that

$G' := G/N = 1$ when the network has only one connected component).
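
The measures defined in the two preceding paragraphs can be computed directly from an adjacency matrix. The following pure-Python sketch is illustrative only; the function names are hypothetical, and nodes with fewer than two neighbors are assumed to contribute zero to the clustering coefficient, a convention the text does not specify.

```python
def adjacency_from_edges(N, edges):
    """Build a symmetric 0/1 adjacency matrix with a zero diagonal."""
    A = [[0] * N for _ in range(N)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A


def link_density(A):
    """Number of links E divided by the N(N-1)/2 possible links."""
    N = len(A)
    E = sum(map(sum, A)) / 2
    return E / (N * (N - 1) / 2)


def clustering_coefficient(A):
    """Average over nodes of sum_{j,l} a_ij a_jl a_li / (k_i (k_i - 1))."""
    N = len(A)
    total = 0.0
    for i in range(N):
        k_i = sum(A[i])
        if k_i < 2:
            continue  # assumed convention: such nodes contribute zero
        closed = sum(A[i][j] * A[j][l] * A[l][i] for j in range(N) for l in range(N))
        total += closed / (k_i * (k_i - 1))
    return total / N


def giant_component_fraction(A):
    """Normalized size G' = G/N of the largest connected component."""
    N = len(A)
    seen, best = set(), 0
    for start in range(N):
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nb in range(N):
                if A[node][nb] and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best / N


# A triangle (0, 1, 2), a pendant node 3 attached to node 2, and an isolated node 4.
A = adjacency_from_edges(5, [(0, 1), (0, 2), (1, 2), (2, 3)])
print(link_density(A), clustering_coefficient(A), giant_component_fraction(A))
```

In this example nodes 0 and 1 have local clustering one, node 2 has one third, and the remaining nodes contribute zero, so the average is 7/15.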

One of the simplest types of networks is a regular lattice, as used in the original Axelrod

model. Another network model is the Random Graph (RG), which will be used later on and is

generated as follows. Each pair of nodes $(i, j)$ will share a link with probability $p$ and this is done

independently over all pairs. Such a graph has several properties. We have that $\langle E \rangle = N(N-1)p/2$,

so $\langle k \rangle = (N - 1)p$. In addition, when $\langle k \rangle$ is larger than one, the graph will have one giant connected

component, but when $\langle k \rangle$ is smaller than one, it will have many small subgraphs. Note that this

result holds for $N \to \infty$ and is approximately true for large $N$. Furthermore, $\langle C \rangle = p$ and its

degree distribution is approximately the Poisson distribution with parameter $\langle k \rangle$. This means that

the random graph has a narrow degree distribution (i.e. most nodes have $k_i \approx \langle k \rangle$). Graphs that

have a large clustering coefficient are called small-world graphs, while graphs that have a fat-tailed

(i.e. broad) degree distribution are called scale-free graphs. A typical measure of the extent to

which a degree distribution is fat-tailed is the kurtosis (or scaled, centered fourth moment) of

this distribution. A kurtosis larger than three typically indicates fat tails. Small-world and scale-

free graphs are typically referred to as complex networks and many real networks are complex.

Specifically, most social networks are small-world, although it is not clear whether they are typically

scale-free [Klemm et al., 2003b].
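
The random graph construction described above is simple enough to sketch in a few lines of Python. The function name `random_graph` and the parameter values are illustrative assumptions. The example compares the empirical average degree with the expected value (N - 1)p.

```python
import random


def random_graph(N, p, rng):
    """Link each pair of nodes independently with probability p."""
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:
                A[i][j] = A[j][i] = 1
    return A


rng = random.Random(1)
N, p = 400, 0.01
A = random_graph(N, p, rng)
mean_degree = sum(map(sum, A)) / N
print("empirical <k>:", mean_degree, "expected <k>:", (N - 1) * p)
```

With these values the expected average degree is 3.99, which is well above one, so a giant connected component is to be expected.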

In line with the general interest in (complex) networks from the statistical physics community

[Albert and Barabasi, 2002], there has been some work studying the Axelrod model on RGs,

small-world networks and scale-free networks [Guerra et al., 2010, Klemm et al., 2003b]. Typically,

it is found that these types of networks facilitate cultural convergence compared to regular lattices

(while keeping the number of links fixed), for distinct reasons. As the amount of randomness is

larger in the case of an RG compared to a lattice, there is a larger probability that traits spread. A

larger clustering coefficient means that networks are better connected locally which will facilitate

the Axelrod dynamics (which consists of local interactions); clearly the clustering coefficient should

not be too large, since otherwise different clusters will become distinct cultural regions, enhancing

cultural diversity. Finally, when a network has a fat-tailed degree distribution there are some

nodes, called hubs, with many links that are efficient in the spreading of traits. Note that the

results for the small-world and scale-free networks depend on the specific network models used.

2.2.2 Other extensions

Three other extensions will also be important. First, the principle of Bounded Confidence (BC)

has been used in the Axelrod model [Flache and Macy, 2008, De Sanctis and Galla, 2009]. In the

original Axelrod model it was assumed that agents could interact if they had positive cultural

overlap. However, in reality agents may only interact when the overlap exceeds a threshold θ.

Note that if θ = 0 the original Axelrod model is recovered. Such a parameter may very well

be specific to the individual, so this represents a simplification. However, part of an individual’s

level of trust may be caused by certain macro events, so that this is similar to other agents. In

addition, it appears that BC reduces cultural convergence and makes the model immune to cultural

drift [De Sanctis and Galla, 2009]. Finally, the threshold may be used to define a cultural graph,

as follows: for each pair of nodes $i$ and $j$, $a_{ij} = 1$ if $o_{ij} > \theta$ and $a_{ij} = 0$ otherwise [Valori et al.,

2011].
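
The cultural graph defined at the end of this paragraph can be sketched as follows, assuming nominal traits so that the overlap is the fraction of matching features. The names `overlap` and `cultural_graph` are hypothetical.

```python
def overlap(v_i, v_j):
    """Cultural overlap o_ij = 1 - d_ij for nominal traits."""
    return sum(a == b for a, b in zip(v_i, v_j)) / len(v_i)


def cultural_graph(culture, theta):
    """Adjacency matrix with a_ij = 1 if o_ij > theta and 0 otherwise."""
    N = len(culture)
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            if overlap(culture[i], culture[j]) > theta:
                A[i][j] = A[j][i] = 1
    return A


culture = [[1, 1, 1, 1], [1, 1, 1, 2], [2, 2, 1, 2]]  # pairwise overlaps: 0.75, 0.25, 0.5
for theta in (0.0, 0.3, 0.5):
    print(theta, cultural_graph(culture, theta))
```

Raising the threshold removes links from the cultural graph one by one, which illustrates how a higher threshold fragments the culture into more cultural components.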

Since the interaction probability depends on the cultural distance between two neighbors and

the confidence level θ only enters the model through this probability, the effect of BC (i.e. a

higher θ) on the Axelrod model is that there are more cultural components (that is, connected

components in the cultural graph) at t = 0 and typically there will be more cultural zones as well

(where the cultural zone is generalized to BC).

Secondly, co-evolution of network and agents (i.e. dynamical networks) has been implemented

in the context of the Axelrod model [Pfau et al., 2013, Centola et al., 2007]. It has been shown

that this mechanism stabilizes the dynamics under cultural drift under some conditions [Centola

et al., 2007], although the updating rule is different from the one that will be used in this thesis.

Thirdly, in the Axelrod model the dynamics start with an initial culture v(0) that is completely

random, but in reality cultures can be more complicated [Valori et al., 2011]. It is possible to run

the Axelrod model on these realistic cultures [Stivala et al., 2014, Valori et al., 2011] or generate

them artificially [Stivala et al., 2014], so-called structured cultures. Using empirical data it is

required to incorporate BC in the Axelrod model and it then becomes interesting to study the

phase transition for varying θ (it is impossible to vary q in the case of empirical data). It is

typically observed that the phase transition in terms of θ is much less steep for realistic cultures

than for random cultural spaces, which is discontinuous (at least for large F , similar to the q-phase

transition when θ = 0).

Initializing the Axelrod model with a realistic culture, one might object that such a culture

should be the end-result of the Axelrod model, since a realistic culture is the result of a long term

phenomenon and the Axelrod model is long term as well. If the Axelrod model results in diversity

or convergence, then the resulting culture will most likely not be completely consistent with such

a realistic culture. However, one must realize that in reality there are additional processes such

as a growing population and some additional features (which are not incorporated in the specific

model at hand) that have other effects. Such effects would then explain why an empirically realistic

culture is different from the end-state.

2.3 Social multiplex

In most models of cultural dynamics, agents interact via a specific interaction network (i.e. they are

located on a network and can only interact with neighbors). Such an interaction network should be

a social network since the process of social influence takes place via social ties, which are then the

links. In reality, a social environment can often be subdivided into several distinct social networks

(e.g. work, sports, family), which is termed social multiplexity. Multiple self theory lends support

to the assertion that agents’ behavior is dependent on their social environment [McConnell, 2011].

In addition, using the full (or aggregate) social network when social multiplexity is present, leads

to inaccurate results for dynamic processes [Cozzo et al., 2013].

More generally, there has been an increased interest in the study of multilayer networks, which

are collections of networks where the nodes in each layer are associated to nodes in other layers

[Boccaletti et al., 2014]. An example of this in the context of opinion models is [Quattrociocchi

et al., 2014]. A multiplex network is a special case of a multilayer network, where each node is

only associated to its counterparts in the other layers (i.e. for practical purposes, the nodes are the

same in each layer). A model of cultural transmission has been studied in this setting [Palchykov

et al., 2014].

Formally, a multiplex is represented by a multigraph G, which is an ordered pair (V,E),

where $V$ is as before and $E$ is a multiset of unordered pairs of edges $\{G_1, \dots, G_M\}$ (i.e. there are

$M$ layers). Again, this can be represented using the adjacency matrix, except that there is an

adjacency matrix $A^{\alpha}$ for each layer number $\alpha$. Each cultural agent/actor can now be associated to

a node in this multigraph. It now depends on which set of unordered edges or layer $G_\alpha$ one looks

at, whether two agents have a social tie.

Thus far, we have not assumed any dependence between this multigraph and the cultural space

discussed earlier. With only one social network, it is clear that every cultural feature belongs to

that social space. However, this is not clear in the case of a social multigraph. Therefore, there

needs to be a correspondence between the set of features and the set of layers. That is, it should

be determined for each feature to what layer(s) it belongs. As an example consider an agent’s

preference for beer; intuitively, such a cultural feature should be associated to a social network

regarding friendship relationships. The combination of the culture and the social multigraph will

be referred to as the social multiplex.

3 The model

The model that will be used, also referred to as the generalized Axelrod model, has the property

that the social and cultural spaces are not one-to-one. Because of this, some additional notation

has to be introduced. First, let $\beta_\alpha$ be the subset of features that contributes to layer $\alpha$ (from now on, we write layer $\alpha$ when we mean layer $G_\alpha$). Second, define a cultural subvector of an agent $i$ as the collection of traits $(v_i^k)_{k \in \beta}$, obtained by restricting to a specific set of features, $\beta$. The dimension of the subspace is $|\beta|$. Then, the cultural distance in this subspace, or subcultural distance, is defined as $d^{\beta}_{ij} := \sum_{k \in \beta} d^k(v_i^k, v_j^k)/|\beta|$. Finally, the subcultural overlap is $o^{\beta}_{ij} = 1 - d^{\beta}_{ij}$.
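As a minimal sketch of these definitions, the following Python functions compute the subcultural distance and overlap for nominal features, for which $d^k$ is 0 when two traits are equal and 1 otherwise. The function names, the representation of a cultural vector as a list of integers and the 0-based feature indices are assumptions made for this illustration only.

```python
def subcultural_distance(v_i, v_j, beta):
    """Fraction of the features in beta on which two nominal cultural vectors differ."""
    beta = list(beta)
    # For nominal features, d^k is 0 when the traits match and 1 otherwise.
    return sum(v_i[k] != v_j[k] for k in beta) / len(beta)


def subcultural_overlap(v_i, v_j, beta):
    """Subcultural overlap, defined as one minus the subcultural distance."""
    return 1.0 - subcultural_distance(v_i, v_j, beta)


# Two hypothetical agents with four features each.
v_i = [1, 2, 3, 4]
v_j = [1, 2, 5, 6]
print(subcultural_overlap(v_i, v_j, beta=[0, 1]))        # restricted to features 0 and 1
print(subcultural_overlap(v_i, v_j, beta=[0, 1, 2, 3]))  # full cultural space
```

Passing the full set of feature indices as `beta` recovers the ordinary cultural overlap.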

3.1 Network-culture assortativity

The social network in each layer may be determined independently of the initial culture. Alternatively one can let the social networks be generated by the cultural features associated to that layer; in reality, an agent typically has social ties to agents that are culturally similar [Valori et al., 2011]. More specifically, the probability of a link between node $i$ and $j$ is $p^\alpha_{ij} = f(o^{\beta_\alpha}_{ij})$ for some increasing function $f : I \to I$. Note that in the last case, the only input to the model (once the system parameters are specified) is the culture. In the case of multiplex networks this is a very convenient way of generating the networks; it solves the problem of having to handpick a specific network for each layer, which would be arbitrary. Furthermore, assortativity is complementary to the multiplex approach since each layer is associated to a subset of features, so that there is an assumed relation between the social and cultural spaces.
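The assortative generation rule can be sketched as follows. This is only an illustration under stated assumptions: the names `overlap` and `generate_layer`, the list-of-lists adjacency matrix and the identity function as the default choice for $f$ are hypothetical, and links are assumed to be undirected and drawn independently given the culture.

```python
import random


def overlap(v_i, v_j, beta):
    """Subcultural overlap of two nominal cultural vectors on the features in beta."""
    return sum(v_i[k] == v_j[k] for k in beta) / len(beta)


def generate_layer(culture, beta, f=lambda o: o, rng=random):
    """Return a symmetric 0/1 adjacency matrix for the layer with feature set beta.

    Each pair (i, j) is linked with probability f(overlap on beta).
    """
    n = len(culture)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = f(overlap(culture[i], culture[j], beta))
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


# Hypothetical example: 10 agents, 4 features with q = 6 traits, 2 layers of 2 features each.
rng = random.Random(0)
culture = [[rng.randint(1, 6) for _ in range(4)] for _ in range(10)]
layers = [generate_layer(culture, beta, rng=rng) for beta in ([0, 1], [2, 3])]
```

Choosing a constant function for `f` instead gives a network that is independent of the culture.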

3.1.1 Properties of the network

If the network is generated using the initial culture, it is of interest to know what the structure of this network is. Here, the focus is on one layer, so $F$ corresponds to the dimension of the subspace associated to that layer. First, assume that the generation process of the initial culture is stochastic. Then, a priori, the cultural vector of agent $i$ is a random vector $V_i$ that consists of $F$ random variables $V_i^k$. Then we get for the probability of having a link

$$p_{ij} = f\left(\frac{1}{F}\sum_{k=1}^{F} P(V_i^k = V_j^k)\right). \qquad (1)$$

Note that even if all Vi have the same distribution so that pij is independent of i and j, this does

not constitute a RG, as the probabilities for different links are not independent, hence the joint

probability of all the links together does not factorize. For example, if two nodes both have large

cultural overlap with a third node, they are likely to have a larger than average cultural overlap

with each other, so the clustering coefficient of the graph is expected to be higher than that of an

RG.

For collections of $V_i$ that have the same distribution, the argument of $f$ in (1) can be interpreted as the average cultural overlap between those agents after the culture has been generated. Intuitively, the reason that the resulting (sub)graph is not an RG is that the cultural overlaps are not exactly equal to this average, but fluctuate around it. Assuming certain regularity properties of the culture generation process, it holds that all cultural overlaps converge to this average as $F \to \infty$. Therefore, if $F$ is large enough, the resulting graph will approximately be an RG. How large $F$ has to be depends on the specific generation process. Note that the condition that the random variables $V_i^k$ are independent over $k$ is sufficient but not necessary for the cultural overlaps to converge.

3.2 Generalized Axelrod model

The generalized Axelrod model is similar to the original Axelrod model (which includes the fact that the cultural features are of nominal type), but there are some modifications. Before any pair of neighbors is selected randomly, a layer $\alpha$ is selected with probability $1/M$. The interaction probability is modified to $r_{ij} = o_{ij}\,\Theta(o_{ij} - \theta)$, where $\theta$ is a predetermined threshold and $\Theta(\cdot)$ is the Heaviside function, defined as $\Theta(x) = 0$ if $x \le 0$ and $1$ if $x > 0$ (Bounded Confidence). Furthermore, interaction consists of the original agent copying from its neighbor one of the features of the subvector $v^{\beta_\alpha}$ in which the two differ. Finally, an optional feature of the model is that if the interaction is successful, the original individual updates its links with respect to all other agents in that layer according to the same rule that generated its links in that layer at $t = 0$ (see previous subsection). More specifically, if agent $i$ has a successful interaction in layer $\alpha$, then agent $i$'s links in layer $\alpha$ are deleted and with probability $p^\alpha_{ij} = f(o^{\beta_\alpha}_{ij})$ it will have a link with agent $j$ for all $j \neq i$, where $f$ is the same function as used in the generation of the multiplex at $t = 0$. Note that updating of the network according to this rule essentially implements network assortativity dynamically. Also, if updating is included, the generalized Axelrod model operates on the entire social multiplex (i.e. the state-variable is the social multiplex), instead of just the culture.
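The update rule described above can be sketched as a single attempted interaction. This is an illustration, not the code used for the thesis: the name `axelrod_step`, the data layout, and the way the pair is drawn (a uniformly random agent, then a uniformly random neighbor in the selected layer) are assumptions.

```python
import random


def overlap(v_i, v_j, features):
    """Overlap of two nominal cultural vectors on the given feature indices."""
    return sum(v_i[k] == v_j[k] for k in features) / len(features)


def axelrod_step(culture, layers, betas, theta, f=None, rng=random):
    """Attempt one interaction in place; return True if a trait was copied.

    culture: list of trait lists; layers: list of 0/1 adjacency matrices;
    betas: one list of feature indices per layer; f: link function used for
    network updating, or None to keep the networks static.
    """
    n, n_features = len(culture), len(culture[0])
    alpha = rng.randrange(len(layers))             # layer chosen with probability 1/M
    i = rng.randrange(n)                           # assumed: uniformly random agent
    neighbours = [j for j in range(n) if layers[alpha][i][j]]
    if not neighbours:
        return False
    j = rng.choice(neighbours)
    o_full = overlap(culture[i], culture[j], range(n_features))
    r_ij = o_full if o_full > theta else 0.0       # r_ij = o_ij * Theta(o_ij - theta)
    differing = [k for k in betas[alpha] if culture[i][k] != culture[j][k]]
    if not differing or rng.random() >= r_ij:
        return False
    k = rng.choice(differing)
    culture[i][k] = culture[j][k]                  # i copies one differing trait of j
    if f is not None:                              # optional network updating for agent i
        for m in range(n):
            if m != i:
                p = f(overlap(culture[i], culture[m], betas[alpha]))
                layers[alpha][i][m] = layers[alpha][m][i] = 1 if rng.random() < p else 0
    return True
```

Note that the interaction probability uses the full cultural overlap, while the copied feature is restricted to the features of the selected layer.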

The updating of the social network makes sense, since the cultural dynamics are long term

and are therefore expected to be on the same time scale as social network formation. Also, the

specific updating rule is intuitive: if a node changes one of its features this is a significant event, so

it makes sense that an agent reevaluates its surroundings in the corresponding social environment.

Note that if a specific feature is associated to multiple layers, then it may be reasonable to update

the links of this node in all corresponding layers, since the agent has changed culturally in all

these layers. However, since the actual interaction took place in a specific layer, it seems that the

dynamic process should only apply to the social environment where the interaction took place.

The principle of homophily operates on the level of the full cultural space and social influence

only operates on the level of the social network, so that the generalized Axelrod model extends these

principles to a multiplex context. In contrast, the fact that links are generated (and updated) based

on the subcultural similarity might also be regarded as homophily, but this time it only regards

the cultural subspace. However, in the first case the object of change is the culture, while in the

second case the object of change is the social network.

From the dynamical rules it follows that one feature that causes coupling is the fact that $r_{ij}$ depends on the full cultural distance; if it depended only on the subspace distance and each feature mapped to only one layer, there would be $M$ independent Axelrod models. In addition, if one feature maps to multiple layers, this also introduces a natural coupling between layers (strictly speaking, the object that associates features to layers is then a correspondence, since it would not be well-defined as a mapping).

3.3 End-state

In the case of the original Axelrod model, the end-state (i.e. the absorbing state) is a state where between adjacent nodes $i$ and $j$ the cultural overlap $o_{ij}$ is either 1 or 0. In the case of BC this translates into the same thing, but with $o_{ij} = 1$ or $o_{ij} \le \theta$. Clearly, there can be no more exchange of cultural features in such a state. However, when there are multiple layers, the notion of adjacency is different; two nodes can be considered adjacent if $a^\alpha_{ij} = 1$ for at least one $\alpha$. If for $i$ and $j$, $d^{\beta_\alpha}_{ij} = 0$, there can be no interaction between them in layer $\alpha$ anymore. Therefore, an absorbing state in the multiplex is characterized by the following: for all pairs of nodes $(i, j)$ that have $o_{ij} > \theta$, there should be no $\alpha$ such that $a^\alpha_{ij} = 1$ and $d^{\beta_\alpha}_{ij} > 0$.
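The absorbing-state condition translates directly into a check over all pairs of agents and all layers. The sketch below assumes static networks (with network updating, links can still change after a check) and uses a hypothetical function name and data layout.

```python
def is_absorbing(culture, layers, betas, theta):
    """True if no pair with full overlap above theta can still interact in any layer."""
    n, n_features = len(culture), len(culture[0])
    for i in range(n):
        for j in range(i + 1, n):
            o_full = sum(a == b for a, b in zip(culture[i], culture[j])) / n_features
            if o_full <= theta:
                continue  # bounded confidence blocks this pair in every layer
            for adj, beta in zip(layers, betas):
                linked = adj[i][j] == 1
                differs = any(culture[i][k] != culture[j][k] for k in beta)
                if linked and differs:
                    return False  # an interaction is still possible in this layer
    return True


# Hypothetical example with two agents and two layers of two features each.
culture = [[1, 1, 2, 2], [1, 1, 2, 3]]
betas = [[0, 1], [2, 3]]
linked, unlinked = [[0, 1], [1, 0]], [[0, 0], [0, 0]]
print(is_absorbing(culture, [linked, unlinked], betas, theta=0.5))
print(is_absorbing(culture, [linked, linked], betas, theta=0.5))
```

In the first call the agents differ only in a layer where they are not linked, so nothing can change; in the second call the link in that layer makes an interaction possible.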

4 Simulation set-up

Before delving into the specifics, some general remarks are in order regarding the simulations. The approach to study the Axelrod model is empirically realistic, which has several implications. First, it makes no sense to change q as if it were a control parameter, like temperature, because q is a property of the system. The same holds for F. Of course, it is still interesting to look at different values of q and F, but not at a phase transition with respect to these parameters. Second, the control parameter θ can be viewed as an inverse measure of trust or confidence in a society; the larger the value of θ the less confidence there is. It may therefore be more convenient to look at the control parameter ω = 1 − θ, which can be viewed as the (normalized) level of confidence in a society.

The critical threshold of a phase transition, ωc, can be defined as the ω such that the standard

error or standard deviation of the order parameter averaged over different runs is largest [Klemm

et al., 2003b]. In addition, the Cluster Size Entropy (CSE), which is defined in Subsection 4.2, is

typically largest at ωc, so that this measure defines ωc as well. Usually, the two will agree, but

when they do not, emphasis is on the CSE.

It is also more interesting to look at the order parameter as a function of confidence ω for

other reasons. Changing the value of ω from 0 to 1 has the property that when ω = 0, there will

always be full cultural divergence. Furthermore, if q is such that for the normal Axelrod model

(i.e. ω = 1) there is cultural convergence, changing ω from 0 to 1 means going from cultural

diversity to cultural convergence. Note that for small q this is usually the case and that small q

corresponds to a realistic cultural system. Also note that q is the main factor in determining the

network topology in the case of network assortativity as is explained in Subsection 3.1.1, so that

varying q would have a double effect on the model, which is undesirable.

Similarly, most realistic systems have large F . Therefore, we will generally look at systems

with small q, large F and changing ω. As the threshold is always compared to the cultural overlap

and there are only F + 1 possible values for the overlap, there are only F relevant values for ω to

study. The collection of disjoint sets within which all values of ω are equivalent is $[0, 1/F], (1/F, 2/F], \ldots, ((F-2)/F, (F-1)/F], ((F-1)/F, 1]$. Each set is referred to as an equivalence class and an element

from such a class is a representative. We will typically use the midpoint value as a representative,

rounded to two decimal places.
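A short sketch of how such representatives can be listed, assuming the midpoint rule stated above; the function name is hypothetical.

```python
def omega_representatives(F):
    """Midpoints of the F equivalence classes of omega, rounded to two decimal places."""
    return [round((k + 0.5) / F, 2) for k in range(F)]


reps = omega_representatives(36)
print(len(reps), reps[:3], reps[-1])
```

Because the rounding error (at most 0.005) is smaller than half the class width for $F = 36$, each rounded value stays inside its own class.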

A small number of agents (N = 100) is used, since the complexity of the model means running

times are long. Furthermore, the correspondence between features and layers will be a symmetric

mapping, which means that if there are M layers, there are M · Z features for some positive integer

Z and Z features map to each layer. In terms of the cultural parameters, we take (F, q) = (36, 6)

for all simulations. The value of F is convenient since it is large enough to be empirically realistic,

has nice numerical properties since there are 9 pairs of (M,Z) that multiply to make 36 and it

is not too large (which would cause running times to be even longer). The value of q is chosen

to be small, so that for ω = 1, the model reaches cultural convergence and a phase transition

exists. Finally, note that some of the simulations have also been done using other, similar values of

(N,F, q) but the results were not qualitatively different from the results obtained using the original

parameter values.
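The admissible layer configurations for a symmetric mapping follow from the divisors of $F$. The snippet below simply enumerates them; the variable names are chosen for this illustration.

```python
F = 36
# Each pair (M, Z) has M layers with Z features per layer, so that M * Z = F.
pairs = [(M, F // M) for M in range(1, F + 1) if F % M == 0]
print(len(pairs), pairs)
```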

4.1 Routes of investigation

Even when (N,F, q) is fixed, in addition to the mapping from features to layers, there are still

many degrees of freedom for which we can study the phase transition in ω. First, one can vary

the number of layers. Second, the function f that determines the connection probability has not

been determined. Third, the network can be static or dynamic. Finally, the initial culture can be

generated in multiple ways.

As increasing the number of layers increases the multiplexity of the system, this lies at the

heart of the investigation and will be done in every treatment. In terms of the remaining degrees

of freedom, we distinguish between the following treatments per degree of freedom: $f(o^{\beta_\alpha}_{ij}) = p$, where $p \in [0, 1]$, versus $f(o^{\beta_\alpha}_{ij}) = o^{\beta_\alpha}_{ij}$; random culture versus structured culture; and no updating

networks versus updating networks. This would give rise to eight treatments in total, but only

four of these are investigated. The simplest treatment is the one where the networks are generated

by the RG algorithm, the culture is random and the network does not update. In the second

treatment we do the same except we add assortativity. In the third treatment, we do the same

as in the second except that networks are allowed to update. Finally, the last treatment is the

opposite of the first. In this way, the complexity increases at every step.

With respect to the outcome of the generalized Axelrod model, several observations have to be

made. First, the system can be studied by looking only at the end-state or by explicitly looking at

the dynamics, both of which will be done and are explained in the next two subsections. Second,

the dimensionality of the system is so high that the only way to study it is to study clusterings

(e.g. cultural regions) and, more specifically, the vector of cluster sizes. Indeed, it is not important

what values the specific traits in each cultural vector in each cluster have, nor which vector is in

which cluster, since these details provide no information regarding cultural convergence/divergence.

Next, it will be discussed what type of clusterings will be used (both in the end-state and dynamical

analysis).

4.1.1 Clusterings of the culture: global and layer dependent

When analyzing the generalized Axelrod model, some clusterings give important information

regarding the structure of the culture. On the global level (i.e. the level of the social multiplex)

it is not convenient to take into account which agents are connected to each other since this

information is encoded at the layer level. Below, both global and layer dependent clusterings are

defined for an arbitrary pair (i, j). (Note again that when we say a property has to hold between

pairs of nodes, we mean that between each pair of nodes in the cluster there is a path such that

each consecutive pair on the path has the property.)

Global clusterings

• cultural domain: dij = 0. The cultural domain could theoretically consist of multiple

collections of nodes that are not linked within any of the social networks. However, for most

parameter values this event is extremely unlikely (especially if there is network updating)

• cultural component: dij < ω. The cultural component can be seen as clusters of nodes that

have the possibility (if there were a link in the appropriate layer) to interact and converge

culturally; with network updating, such an interaction will most likely become possible over

time in any of the layers. Note that the cultural component really is a connected component

with respect to the cultural graph

16

Layer dependent clusterings

• cultural region α: $d^{\beta_\alpha}_{ij} = 0$ and $a^\alpha_{ij} = 1$. This is a direct analogue of the cultural region defined for the original Axelrod model

• cultural zone α: $d_{ij} < \omega$ and $a^\alpha_{ij} = 1$. This is a direct analogue of the cultural zone defined

for the original Axelrod model; note that the cultural distance should depend on the full

cultural space, since interaction depends on the full space. When the number of cultural

regions equals the number of cultural zones in all layers, the system will be in its end-state,

since no more dynamics can take place

It is important to note that each cultural component will always encapsulate at least one cultural

zone in a specific layer, since cultural zones have a stronger requirement. The extra requirement

$a^\alpha_{ij} = 1$ can only increase the number of cultural zones per cultural component (if there were no such requirement there would trivially be one cultural zone per cultural component, since the two would

be the same). One implication that follows from this is that there will always be at least as many

cultural zones as cultural components in total.
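All clusterings defined above are connected components of a graph whose edges are the pairs that satisfy the stated property. The following sketch computes cultural components and the cultural zones of one layer in this way; the function names and the depth-first traversal are choices made for this illustration, not the implementation used for the thesis.

```python
def clusters(n, related):
    """Connected components of the graph on n nodes with an edge (i, j) iff related(i, j)."""
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and related(i, j):
                    seen.add(j)
                    stack.append(j)
        result.append(sorted(component))
    return result


def distance(v_i, v_j, features):
    """Cultural distance of two nominal vectors on the given feature indices."""
    return sum(v_i[k] != v_j[k] for k in features) / len(features)


def cultural_components(culture, omega):
    full = range(len(culture[0]))
    return clusters(len(culture),
                    lambda i, j: distance(culture[i], culture[j], full) < omega)


def cultural_zones(culture, adj, omega):
    full = range(len(culture[0]))
    return clusters(len(culture),
                    lambda i, j: adj[i][j] == 1
                    and distance(culture[i], culture[j], full) < omega)
```

Cultural domains and cultural regions follow from the same helper by replacing the predicate with the corresponding condition (zero distance on the full space, or zero subcultural distance together with a link in the layer).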

4.1.2 Random initial culture

In this subsubsection some aspects of the random initial culture with regards to the network

structure are discussed in more detail; the next discusses the structured culture used in the last

treatment. In the case of the random culture, a priori each cultural vector $V_i$ is a vector of $F$ independent random variables, each uniformly distributed on $\{1, \ldots, q\}$. Then, using the results from 3.1.1, we have that the probability of having a link is $p_{ij} = f(1/q) = 1/q$, where the last equality holds for $f(o) = o$. Therefore,

the resulting graph will have a relatively simple structure. However, it is not clear what F will be

large enough for the network to be like an RG.
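For reference, the value $1/q$ can be obtained from (1) in two short steps. The derivation below is a worked check that assumes independent, uniformly distributed traits and the identity link function $f(o) = o$ of the assortative treatments.

```latex
P(V_i^k = V_j^k) = \sum_{v=1}^{q} P(V_i^k = v)\, P(V_j^k = v) = q \cdot \frac{1}{q^2} = \frac{1}{q},
\qquad
p_{ij} = f\!\left(\frac{1}{F}\sum_{k=1}^{F} \frac{1}{q}\right) = f\!\left(\frac{1}{q}\right) = \frac{1}{q}.
```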

We simulated the network generation process for many values of the parameters (N,F, q) (note

that here F is the dimension of the subspace associated to a layer, which means that F = Z).

In general, the resulting graph has properties (i.e. link density, degree distribution, clustering

coefficient, connected component) that closely resemble those of the RG when Z is large. When Z

is small the resulting graph has a large clustering coefficient. Both observations are in agreement

with the discussion in 3.1.1. For (N,Z, q) = (100, 36, 6), the properties already closely resemble those of an RG. Note, however, that as M increases, Z decreases, so the F corresponding to each layer becomes smaller. When there are many layers, Z is small, so the networks associated to each layer will have a relatively large clustering coefficient. Therefore, when assortativity is present, M has two effects on the dynamics: it influences the Axelrod dynamics and it affects the underlying

topology of the network. Also, it was found that when Z > 1 (N = 100, q = 6) the graph is

connected (i.e. G = 1). For Z = 1, this is not the case, but this corresponds to M = 36 and as

we show later this case shows trivial dynamical behavior when there is assortativity. Finally, N

should be large relative to q to obtain a connected graph, which is similar to the RG case, where

Np should be much larger than 1 to obtain the result.

4.1.3 Structured initial culture

The type of structured culture used in this thesis is based on the prototype evolution algorithm

in [Stivala et al., 2014], which initializes the culture by generating the cultural vectors around a few

fundamental cultural vectors, called prototypes. This algorithm is inspired by theories from social

science which postulate that most individuals fall in a certain cultural category and the prototypes

are then the most typical members of such a category (note that a prototype does not need to be

an actual individual).

Specifically, in each layer two prototypes are generated by the requirement that they both

have f cultural traits in common with a third superprototype (which is just a randomly generated

cultural vector), while the remaining traits are generated randomly. Around each prototype N/2

agents are generated by letting each agent have g traits in common with the prototype while

the rest is generated randomly. Letting b = g/Z , they will on average have an overlap with the

prototype of o = b+(1−b)/q. Therefore, this algorithm will produce two spherical shells of average

radius r′ = 1− o (while the radius of the outer sphere is r = 1− b). Note that the prototypes are

not included in the initial culture.
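A possible sketch of this generation procedure for a single layer is given below. It is not the algorithm of [Stivala et al., 2014] itself: the function names are hypothetical, and it is assumed that the positions of the copied traits are chosen uniformly at random and independently for every vector.

```python
import random


def near_copy(base, n_shared, q, rng):
    """Copy n_shared randomly chosen traits from base; draw the others uniformly from 1..q."""
    shared = set(rng.sample(range(len(base)), n_shared))
    return [base[k] if k in shared else rng.randint(1, q) for k in range(len(base))]


def structured_layer(n_agents, Z, q, f_shared, g_shared, rng=None):
    """Subcultural vectors for one layer: n_agents // 2 agents around each of two prototypes.

    f_shared: traits each prototype shares with the superprototype (f in the text).
    g_shared: traits each agent shares with its prototype (g in the text).
    """
    rng = rng or random.Random()
    superprototype = [rng.randint(1, q) for _ in range(Z)]
    prototypes = [near_copy(superprototype, f_shared, q, rng) for _ in range(2)]
    agents = [near_copy(p, g_shared, q, rng)
              for p in prototypes for _ in range(n_agents // 2)]
    return prototypes, agents


prototypes, agents = structured_layer(100, 36, 6, f_shared=0, g_shared=28,
                                      rng=random.Random(0))
```

The prototypes are returned separately and are not part of the list of agents, in line with the remark that they are not included in the initial culture.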

To compute the distance between the two prototypes (i.e. between the centers of the shells),

R, let first B = f/Z. For each cultural feature three things can happen: both prototypes have

cultural traits obtained from the superprototype (so they are equal to each other); only one of the

prototypes has a trait obtained from the superprototype; and both prototypes do not have traits

obtained from the superprototype. Properly accounting for all the probabilities this means that

for each cultural feature separately the probability that the two prototypes have the same trait is

$$O = B^2 + \frac{2B(1-B)}{q} + \frac{(1-B)^2}{q} = B^2 + \frac{1-B^2}{q}. \qquad (2)$$

Averaging over all cultural features gives that the average overlap between the prototypes is exactly

the same expression, again denoted by O. If this is used as an estimate for the actual overlap it

should be taken into account that it is the average overlap and especially for small Z there will be

large fluctuations. For large Z the difference will be small; simulations confirm this. Since in our

case Z will typically be large, this is a good approximation. The distance between the spherical

shells will then (on average) be R = 1 − O. In addition, it should be noted that if B = 1, only

one spherical shell will be created with N agents, since the prototypes are the same in that case.

On the other hand, if b = 0, then the algorithm is the same as the algorithm that generates the

random culture. However, even though it is a special case, the random culture algorithm can in

principle generate any of the structured cultures (typically, however, the probability of generating

one with realistic values of B and b is very small if F , q and N are non-trivial).
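For completeness, the simplification in (2) and its two limiting cases can be written out as follows. This worked step only restates the three cases listed above, treating $B$ as the per-feature probability that a prototype takes its trait from the superprototype.

```latex
O = \underbrace{B^2 \cdot 1}_{\text{both from superprototype}}
  + \underbrace{2B(1-B) \cdot \tfrac{1}{q}}_{\text{exactly one}}
  + \underbrace{(1-B)^2 \cdot \tfrac{1}{q}}_{\text{neither}}
  = B^2 + \frac{(1-B)\bigl(2B + (1-B)\bigr)}{q}
  = B^2 + \frac{(1-B)(1+B)}{q}
  = B^2 + \frac{1-B^2}{q},
\qquad
O\big|_{B=1} = 1, \quad O\big|_{B=0} = \frac{1}{q}.
```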

An expression similar to (2) can also be derived for the average distance between two subcultural vectors generated around the same prototype, by just replacing $B$ by $b$ in that equation. This can give important information on the network structure of the nodes around each shell. Using the results from 3.1.1 it follows that $p_{ij} = b + (1-b)/q = 1/q + (1 - 1/q)\,b$ (note that this expression reduces to the one for a random culture if $b = 0$, as expected).

For the following, some notation has to be introduced. Since the above discussion was about a

single layer, the spherical shell (or shell) for one layer will from now on be referred to as a subshell.

Denote by $G^\alpha_i$ subshell $i$ in layer $\alpha$, where $i \in \{1, 2\}$. The prototype around which this subshell is generated will then be referred to as the subprototype. In addition, the subcultural graph in layer $\alpha$ is defined as the cultural graph with respect to the corresponding subspace (i.e. with $d_{ij}$ replaced by $d^{\beta_\alpha}_{ij}$).

Now, the question becomes how to select f and g. The most diverse behavior occurs when

for some values of ω, there are two subcultural components (that is, if there was only one layer no

interaction would occur between agents from the two subshells). In addition, the radius r should

not be too small, since otherwise there would be trivial dynamical behavior in each subshell (in

the most extreme scenario both subshells would consist of N/2 identical subprototypes). Finally,

the average distance between subcultural vectors in each subshell should be equal to the distance

between the subshells themselves. This should be true by construction, except for the fact that

there is a maximal distance in the cultural space, namely 1.

These three requirements are surely fulfilled if the following conditions are satisfied:

$$R \ge 4r + \delta; \qquad r \ge \varepsilon; \qquad R + 2r \le 1, \qquad (3)$$

where δ controls how many threshold values should exhibit the required behavior outlined in the

previous paragraph and ε determines how non-trivial the dynamics within the subshells will be.

The parameter r is set directly by g, while R is set randomly by f (since R is the average distance

between the prototypes), so some margin should be included in setting R. In practice (i.e. for

significant values of δ and ε and with Z ≤ 36), the conditions in (3) yield no feasible pair (f, g),

even if one replaces r by r′. However, these conditions are based on extremely small probabilities.

For example, the probability that two cultural subvectors are generated in the two subshells that

have the smallest distance possible (i.e. ‘lie on the line between the subprototypes’) and also

have the highest distance possible to any other agent within their respective shells, is extremely

small. Therefore some simulations were performed that computed the following quantities for each

generated structured culture:

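$$d_{\min} = \min_{i \in G^\alpha_1,\, j \in G^\alpha_2} d^{\beta_\alpha}_{ij}; \qquad d_{\max} = \max_{i \in G^\alpha_1,\, j \in G^\alpha_2} d^{\beta_\alpha}_{ij}; \qquad d^k_{\min} = \min_{i \in G^\alpha_k,\, j \in G^\alpha_k} d^{\beta_\alpha}_{ij}$$

and it should at least hold that $d_{\min} \ge d^k_{\min} + \delta$ for $k = 1, 2$ and $d_{\max} \le 1$. Based on the results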

of these simulations, many combinations of (f, g) were feasible most of the time (recall that the

generation process is stochastic). Some additional constraints are due to the fact that there are

multiple layers, which is discussed next.

In the case of a multiplex, there are cultural subspaces associated to each layer, so it becomes

necessary to match the cultural subvectors in each layer. The straightforward way to do this is to

have a mapping between subprototypes across layers. In effect one then has multiple prototypes, each composed of different subprototypes and the collection of agents associated to each prototype depends on this mapping. In general there will be $2^M$ such prototypes. For more than two layers, the situation becomes increasingly complex, so to keep things simple we will only use M = 1 and M = 2 for the treatment with realistic cultures. When M = 1, there are simply two subprototypes,

which are also the prototypes.

Now, consider the case M = 2. In one extreme case, subprototype 1 in layer 1 would be

matched to subprototype 1 in layer 2 (i.e. the subcultural vectors that are generated around

subprototype 1 in layer 1 are matched to those generated around subprototype 1 in layer 2) and

subprototype 2 in layer 1 is matched to subprototype 2 in layer 2 (or vice versa), which corresponds

to no mixing. In the other extreme case, one will have that only half of the subcultural vectors

generated around subprototype 1 in layer 1 will correspond to the subcultural vectors around

subprototype 1 in layer 2, while the other half will correspond to the subcultural vectors around

subprototype 2 in layer 2 (and similarly for the other half of the subcultural vectors associated to the

subprototypes).

Letting Πij be the prototype with subprototype i in layer 1 and j in layer 2, this scenario

would divide the cultural vectors in four shells (cultural vectors around Π11, Π12, Π21 and Π22

respectively) of size N/4, and corresponds to the case of full mixing. Intermediate cases would also

divide the cultural vectors in these classes, but then those of Π11 and Π22 (non-mixed prototypes)

have size N(2− µ)/4, while the others (mixed prototypes) have size Nµ/4, where µ ∈ [0, 1] is the

mixing coefficient; if µ = 0, there is no mixing, while µ = 1 corresponds to full mixing. Denote

by $G_{ij}$ the shell associated to $\Pi_{ij}$, while $G^\alpha_{ij}$ is used for the same shell when restricting to the subcultural vectors associated to layer $\alpha$ only. In the sequel, $G_{11}$ and $G_{22}$ will be referred to as non-mixed shells, while $G_{12}$ and $G_{21}$ will be called mixed shells, for obvious reasons. Finally, note

that even though positive µ could consistently be used with M = 1, this has no relevance, so in

the case M = 1, it always holds that µ = 0.
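The shell sizes implied by the mixing coefficient can be tabulated with a small helper. The function name and the dictionary keys (the index pair of each prototype) are assumptions made for this sketch; the returned values are expected sizes and need not be integers for every µ.

```python
def shell_sizes(N, mu):
    """Sizes of the four shells for M = 2 and mixing coefficient mu in [0, 1]."""
    non_mixed = N * (2 - mu) / 4   # shells around Pi_11 and Pi_22
    mixed = N * mu / 4             # shells around Pi_12 and Pi_21
    return {"11": non_mixed, "22": non_mixed, "12": mixed, "21": mixed}


for mu in (0, 0.5, 1):
    sizes = shell_sizes(100, mu)
    print(mu, sizes, sum(sizes.values()))
```

For every µ the four sizes add up to N, since $2 \cdot N(2-\mu)/4 + 2 \cdot N\mu/4 = N$.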

Since the values of f and g should be the same for M = 1 and M = 2 and because, if M = 2,

both layers should have subshells of equal radius, f and g have to be even numbers. In addition,

to allow for all possible behavior, it is desirable that R is such that even shells that have one

subprototype in common have some distance between them. Using the results of the simulations,

it turns out that the combination (f, g) = (0, 28) satisfies all the requirements with some margin.

These parameter values will be used for generating the structured culture in Subsection 5.5.

4.2 End-state analysis

For the end-state analysis the only quantity that will be investigated is the collection of domain-

sizes. To compress this information into one number that also has the property of being an

order parameter (i.e. one value before the phase transition, another after the phase transition),

we typically use the normalized number of domains, which is denoted by $N_D$. Another order parameter is the normalized size of the largest cluster, denoted by $S^{\max}_D$. Moreover, a quantity that

is typically only non-zero at the phase-transition and gives more information on the distribution

of cluster sizes is the CSE. Finally, the number of time-steps needed to achieve convergence (T )

will also be studied.

As the Axelrod model is stochastic, one can reliably study its dynamics only by employing many runs. The number of runs ($K$) will be 100 for each parameter setting. The resulting quantities will then be averages (e.g. $\langle N_D \rangle$) and in this thesis the average always implies the average over multiple runs, unless stated otherwise. In addition, the standard deviation of the normalized number of domains, $SD_D = \sqrt{\langle (N_D - \langle N_D \rangle)^2 \rangle}$, and the corresponding standard error $SE_D = SD_D/\sqrt{K}$ will be computed. (For $S^{\max}_D$, the standard deviation is denoted as $SD^{\max}_D$ and similarly for the standard error.) We look at a phase transition in terms of $\omega$ for various values of $M$. Effectively, therefore, we look at a two-dimensional phase transition. However, we are mostly interested in the difference between having 1 layer ($M = 1$ or singleplex) and having multiple layers ($M > 1$ or multiplex).
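The run statistics can be computed as in the sketch below. It follows the formula as written, so the standard deviation uses the divisor $K$ rather than $K - 1$; the function name and the example values are hypothetical.

```python
import math


def run_statistics(values):
    """Mean, standard deviation and standard error of an observable over K runs."""
    K = len(values)
    mean = sum(values) / K
    # sqrt(<(x - <x>)^2>): the average of squared deviations, with divisor K.
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / K)
    return mean, sd, sd / math.sqrt(K)


# Hypothetical normalized domain counts from four runs.
mean, sd, se = run_statistics([0.02, 0.04, 0.02, 0.04])
print(mean, sd, se)
```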

4.2.1 Cluster size entropy

For the cultural domains in the end-state, $D$, the number of domains $N_D$ and the largest domain $S^{\max}_D$ compress a full set of cluster sizes into one number. Outside of the phase transition, this is usually a trivial compression, but around the phase transition, much information is lost. For a specific clustering, the CSE is the weighted entropy over the distribution of cluster sizes, that is

$$\mathrm{CSE} = -\sum_s W_s \log W_s,$$

where $W_s$ is the probability that an element (agent) belongs to a cluster of size $s$ [Gandica et al., 2011]. The CSE has a value of zero when only one cluster size is present and the more cluster sizes are present, the higher it becomes; at the phase transition, when there usually is some degree of scale invariance, it reaches its maximum (i.e. the distribution over cluster sizes is then closest to being uniform). It is weighted, since the size of the cluster is integrated into the probabilities. Otherwise, it would not be a useful measure, since e.g. a clustering of two clusters, where one is a singleton and the other comprises the rest, would give rise to an entropy of log 2, while it is supposed to give a very low entropy.
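A minimal sketch of the CSE computation from a list of cluster sizes is shown below, using the natural logarithm (the base of the logarithm is an assumption, as is the function name).

```python
import math
from collections import Counter


def cluster_size_entropy(cluster_sizes):
    """Weighted entropy of the cluster-size distribution (natural logarithm)."""
    n_agents = sum(cluster_sizes)
    agents_per_size = Counter()
    for s in cluster_sizes:
        agents_per_size[s] += s  # a cluster of size s contributes s agents to W_s
    return -sum((m / n_agents) * math.log(m / n_agents)
                for m in agents_per_size.values())


print(cluster_size_entropy([10] * 10))  # a single cluster size
print(cluster_size_entropy([1, 99]))    # one singleton and one large cluster
print(math.log(2))                      # unweighted entropy of two equally likely sizes
```

For the singleton example the weighted entropy stays well below log 2, which is the behavior the weighting is meant to provide.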

This measure can be normalized as follows. The largest possible value of the entropy would occur

when all Ws have the same value and when there are as many cluster sizes as possible. These two

requirements are constrained by the fact that the total number of agents is N and that clusters are

discrete objects (i.e. half a cluster is not possible). From this it follows that $\sqrt{N}$ is an upper bound

for the maximum entropy. For example, if N = 100 a maximal entropy is obtained by having 10

clusters of 1, 5 clusters of 2, 2 clusters of 5 and 1 cluster of 10, but the remaining 60 agents cannot

be evenly distributed over the remaining 6 cluster sizes, so there is less entropy compared to the

case where agents are evenly distributed over the cluster sizes.

4.3 Dynamical analysis

When studying the system dynamically, it makes sense to study the ‘most interesting’ case. Since

each system is investigated for different ω we therefore choose ω’s close to ωc. Also note that it

would not make sense to sample each time step, since too much data would be obtained. Therefore,

only once every Y time-steps a measurement is taken from the system. Unless otherwise stated,

Y = N ; this is equivalent to one (attempted) update per agent on average. Just a few runs are

observed for each parameter set; no averages over runs will be performed for the dynamical case.

For the purpose of studying the Axelrod dynamics many observables can be computed over

time. All the observables discussed at the beginning of this section, such as the cultural compo-

nents, will be shown (that is, the number of clusters in each case); in addition, to look at the

difference between the layer dependent measures, the variation of information (to be explained

later in this subsection) is computed between both the cultural zones and cultural regions for a

pair of layers. Denote the clustering of cultural components by D(ω), cultural regions in layer i

by Di and cultural zones in layer i by Di(ω), where the dependence on ω is used to indicate the

explicit dependence of the measure on ω (note, however, that the other measures also depend on ω

indirectly since the Axelrod dynamics depends on it). The normalized number of clusters is then

denoted by NX for a clustering X. Finally, if X is a clustering, denote by X[n] its nth cluster.

Since the number of observables grows fast when the number of layers increases, only the cases

M = 1 and M = 2 are investigated. Presumably, many of the insights in the dynamical behavior

that are obtained by analyzing just two layers can also be applied to more than two layers.
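
To make the relation between the layer-dependent clusterings concrete, the following Python sketch computes cultural zones per layer and cultural components for a small multiplex. It is only a sketch under stated assumptions: a zone in a layer is taken to be a connected component of that layer's links restricted to pairs with dij < ω, and a cultural component is taken to be a connected component of the union of these restricted links over all layers. The precise definitions are those of Subsubsection 4.1.1; the function names and the `distance` callback are hypothetical.

```python
def connected_components(n_agents, edges):
    """Label connected components with a simple union-find; returns one label per agent."""
    parent = list(range(n_agents))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    return [find(i) for i in range(n_agents)]


def zones_and_components(n_agents, layers, distance, omega):
    """Return (zones per layer, cultural components) as label lists.

    layers   : list of edge lists, one per layer
    distance : function (i, j) -> cultural distance d_ij (assumed to be given)
    omega    : confidence threshold; a link counts as active if d_ij < omega (assumption)
    """
    active = [[(i, j) for i, j in edges if distance(i, j) < omega] for edges in layers]
    zones = [connected_components(n_agents, edges) for edges in active]
    union_edges = [edge for edges in active for edge in edges]
    components = connected_components(n_agents, union_edges)
    return zones, components


def is_refinement(fine, coarse):
    """True if every cluster of `fine` lies inside a single cluster of `coarse`."""
    mapping = {}
    for f, c in zip(fine, coarse):
        if mapping.setdefault(f, c) != c:
            return False
    return True
```

With these assumed definitions every layer's zones refine the cultural components by construction, which is the property used later when the variation of information is computed per component.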

4.3.1 Network observables

The Axelrod model could also have an effect on the social multiplex if the network updates over

time. Correlations may develop between the layers as a corollary to the Axelrod dynamics. This

will be investigated dynamically by measuring the correlation every Y time-steps. The correlation

between two unweighted, undirected networks can simply be computed as the correlation

between the corresponding adjacency lists (i.e. the extent to which a link between node i and j

is present in layer 1 is matched by the same link in layer 2). Formally, the correlation coefficient


between layer α and γ is

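$$\rho_{\alpha,\gamma} = \frac{\langle a^{\alpha}_{ij} a^{\gamma}_{ij}\rangle - \langle a^{\alpha}_{ij}\rangle \langle a^{\gamma}_{ij}\rangle}{\sqrt{(1-\langle a^{\alpha}_{ij}\rangle)\langle a^{\alpha}_{ij}\rangle}\,\sqrt{(1-\langle a^{\gamma}_{ij}\rangle)\langle a^{\gamma}_{ij}\rangle}},$$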

where it should be noted that $\langle a_{ij}\rangle = \langle a_{ij}^2\rangle$, since $a_{ij}$ is a binary variable. The normalized correlation

is then obtained by dividing the correlation by two and adding 0.5.
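
The correlation coefficient can be computed directly from the link indicators. The Python sketch below is a minimal illustration, assuming that each layer is stored as a symmetric 0/1 adjacency matrix (a list of lists) and that the averages run over all unordered node pairs i < j; the function names are hypothetical.

```python
import math
from itertools import combinations


def layer_correlation(adj_a, adj_b):
    """Pearson correlation between the link indicators of two undirected, unweighted layers.

    adj_a, adj_b : square 0/1 adjacency matrices (lists of lists) on the same node set.
    Averages run over all unordered node pairs i < j.
    """
    n = len(adj_a)
    pairs = list(combinations(range(n), 2))
    mean_a = sum(adj_a[i][j] for i, j in pairs) / len(pairs)
    mean_b = sum(adj_b[i][j] for i, j in pairs) / len(pairs)
    mean_ab = sum(adj_a[i][j] * adj_b[i][j] for i, j in pairs) / len(pairs)
    # For a binary variable <a^2> = <a>, so the variance is <a>(1 - <a>).
    var_a = mean_a * (1 - mean_a)
    var_b = mean_b * (1 - mean_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation undefined for an empty or complete layer")
    return (mean_ab - mean_a * mean_b) / math.sqrt(var_a * var_b)


def normalized_correlation(rho):
    """Map rho from [-1, 1] to [0, 1] by halving and adding 0.5."""
    return rho / 2 + 0.5
```

Note that the coefficient is undefined when one of the layers is empty or fully connected, since the corresponding variance vanishes; the sketch raises an error in that case, which is a design choice of the example rather than something prescribed here.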

From the beginning we have assumed that the different layers represent distinct social net-

works. It is therefore interesting to investigate whether the layers actually have the properties of

social networks. It was already shown that for some configurations, the initial layer looks like a

RG, so that this does not resemble a realistic social network. A structured initial culture may re-

sult in layers that show properties of social networks like the small world and scale-free properties.

To study this, the size of the largest connected component G, the link density L, the clustering

coefficient C and the kurtosis of the degree distribution κ (as defined in Subsubsection 2.2.1) are

computed for each layer every Y time-steps. In general, if a network measure X corresponds to

layer α, this is denoted as Xα. Clearly, when the network does not update the network properties

stay the same and since they do not vary with the threshold they are the same for all runs.
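
Three of these network observables can be sketched in a few lines of Python. The sketch assumes adjacency matrices as input, takes G as the fraction of nodes in the largest connected component (consistent with G = 1 for a connected network), and uses the average local clustering coefficient for C; the exact definitions are those of Subsubsection 2.2.1 and may differ in detail. The kurtosis κ is left out because its normalization is not restated here.

```python
from collections import deque


def link_density(adj):
    """Fraction of possible undirected links that are present."""
    n = len(adj)
    links = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return links / (n * (n - 1) / 2)


def largest_component_fraction(adj):
    """Size of the largest connected component divided by the number of nodes."""
    n = len(adj)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue, size = deque([start]), 0
        while queue:  # breadth-first search from `start`
            node = queue.popleft()
            size += 1
            for other in range(n):
                if adj[node][other] and not seen[other]:
                    seen[other] = True
                    queue.append(other)
        best = max(best, size)
    return best / n


def average_clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are themselves linked."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute zero
        closed = sum(adj[u][v] for idx, u in enumerate(nbrs) for v in nbrs[idx + 1:])
        total += closed / (k * (k - 1) / 2)
    return total / n
```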

4.3.2 Variation of information

A measure of discrepancy between two clusterings is the Variation of Information (VI) [Meila,

2003]. If A is a set and X = {X1, ..., Xk} and Y = {Y1, ..., Yl} are such that Xi ∩ Xj = ∅ for

all i ≠ j and ∪_{i=1}^{k} Xi = A (similarly for Y ), then X and Y are clusterings (or partitions) of A. In

addition, let N = |A| and let pi = |Xi|/N , while qj = |Yj|/N . Note that pi is the probability that

a randomly picked element of A is in Xi (similarly for qj). The VI between X and Y can then be

defined as

$$VI(X;Y) = H(X) + H(Y) - 2I(X,Y),$$

where H(X) is the entropy of X, defined by

$$H(X) = -\sum_{i=1}^{k} p_i \log(p_i)$$

(similarly for H(Y )) and I(X,Y ) is the mutual information, defined by

$$I(X,Y) = \sum_{i=1}^{k}\sum_{j=1}^{l} r_{ij} \log\left(\frac{r_{ij}}{p_i q_j}\right),$$

where rij = |Xi ∩ Yj|/N is the joint probability of randomly selecting an element in A that is both

in Xi and Yj . It is easily seen that if the clusterings are the same, rii = pi = qi (and rij = 0

for i ≠ j), so that I(X,Y ) = H(X) = H(Y ), which implies VI(X;Y ) = 0. Similarly, if the two

clusterings are completely independent, then I(X,Y ) = 0, so VI(X;Y ) = H(X) + H(Y ). The


VI can be conveniently normalized by dividing by log(N), since this is the maximum value that

H(X) or H(Y ) can have, which will be done in the sequel. Note that this measure can then only

be compared for systems that have the same N .
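
The definitions above translate directly into code. The following Python sketch is a minimal illustration, assuming that each clustering is given as one cluster label per element of A; the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a clustering given as one cluster label per element."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(labels_x, labels_y):
    """Mutual information I(X, Y) between two clusterings of the same elements."""
    n = len(labels_x)
    count_x = Counter(labels_x)
    count_y = Counter(labels_y)
    joint = Counter(zip(labels_x, labels_y))
    # r_ij / (p_i q_j) = (c / n) / ((cx / n)(cy / n)) = c * n / (cx * cy)
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def variation_of_information(labels_x, labels_y, normalized=True):
    """VI(X; Y) = H(X) + H(Y) - 2 I(X, Y), optionally divided by log N."""
    vi = entropy(labels_x) + entropy(labels_y) - 2 * mutual_information(labels_x, labels_y)
    vi = max(vi, 0.0)  # clip tiny negative rounding errors
    n = len(labels_x)
    return vi / math.log(n) if normalized and n > 1 else vi
```

For two identical clusterings the function returns zero regardless of how the clusters are labelled, while for the independent clusterings (0, 0, 1, 1) and (0, 1, 0, 1) of four elements the unnormalized value is H(X) + H(Y) = 2 log 2.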

It is always the case that cultural zones Di(ω) have partly the same structure since all Di(ω)

are a refinement of D(ω), as was explained in Subsubsection 4.1.1. Here, this means that when

comparing two layers, the matrix with entries rij consists of blocks on the diagonal and is zero

everywhere else. To get a consistent measure of the variation of information, one has to take this

into account. One way to do this, is to compute for each cultural component, the (normalized) VI

separately and then compute the weighted average over all cultural components, where the weight

is the relative size of the component. The resulting measure is normalized and called the Weighted

VI (WVI).
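
A possible implementation of the WVI is sketched below in Python. It assumes that the normalized VI within a cultural component is obtained by dividing by the logarithm of the size of that component and that a component consisting of a single agent contributes zero; both choices, as well as the function names, are assumptions of this example. The VI is computed through the joint entropy, which is equivalent to the definition given above since I(X, Y) = H(X) + H(Y) − H(X, Y).

```python
import math
from collections import Counter, defaultdict


def _entropy(labels):
    """Shannon entropy of a list of (hashable) labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _normalized_vi(labels_x, labels_y):
    """VI(X; Y) / log(n) on n elements; defined as 0 when n == 1 (assumption)."""
    n = len(labels_x)
    if n < 2:
        return 0.0
    # H(X, Y) is the entropy of the joint labels; VI = 2 H(X, Y) - H(X) - H(Y).
    joint_entropy = _entropy(list(zip(labels_x, labels_y)))
    vi = 2 * joint_entropy - _entropy(labels_x) - _entropy(labels_y)
    return max(vi, 0.0) / math.log(n)


def weighted_vi(component_labels, zones_a, zones_b):
    """Average of the per-component normalized VI, weighted by relative component size.

    component_labels : cultural component of every agent
    zones_a, zones_b : cultural zone of every agent in the two layers being compared
    """
    members = defaultdict(list)
    for agent, comp in enumerate(component_labels):
        members[comp].append(agent)
    n_total = len(component_labels)
    wvi = 0.0
    for agents in members.values():
        vi = _normalized_vi([zones_a[a] for a in agents], [zones_b[a] for a in agents])
        wvi += len(agents) / n_total * vi
    return wvi
```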

5 Results and discussion

As was outlined in Subsection 4.1, this simulation study focuses on four treatments. The results of

these will be shown and discussed in Subsections 5.2, 5.3, 5.4 and 5.5. Note that Subsection 5.2 is

fundamental, while the next two subsections build on the results presented there; Subsection 5.5

uses the results obtained in the previous subsections but also constitutes a different approach to

studying multiplexity and therefore differs somewhat from the rest. Table 1 presents an overview

of the four treatments. Before the actual treatments are discussed, Subsection 5.1 will go into a

trivial version of the model that is discussed in treatment 1 to show some differences compared

to the singleplex that already arise from the generalized model structure itself. Finally, it was

observed that all of the networks are connected at all times t, so G = 1 for any case we have

considered; it will not be shown each time in the results below.

Treatment   Assortativity   Updating   Culture
1           no              no         random
2           yes             no         random
3           yes             yes        random
4           yes             yes        structured

Table 1: Characteristics of the four treatments

5.1 Treatment 0: trivial multiplex

In this subsection a useful intermediate case between a singleplex and a multiplex is discussed,

namely a multiplex that has the same graph in each layer (i.e. a trivial multiplex). The main


result is that there is more cultural convergence for a trivial multiplex compared to a singleplex,

due to the compartmentalization of the generalized Axelrod model.

5.1.1 End-state results

In Figure 1 results are shown for a singleplex and two multiplices (M = 2 and M = 36) that have

the same RG graph in each layer (i.e. trivial multiplices). Note that it is not possible to include

assortativity in the trivial multiplex condition, since the networks would then be generated by the

cultural subspaces and will typically be different from each other (the same holds, of course, when

networks update).

Clearly, there are differences between the three cases. First, the trivial multiplex condition

shows more convergence (i.e. lower values of 〈ND〉) for all ω than the singleplex, although the

difference is small for M = 2. Second, it is clearly the case that 〈T 〉 is larger for increasing M ,

especially when M = 36.

[Figure 1 plots: 〈ND〉 against ω (left panel) and 〈T 〉 against ω (right panel); legend entries 1, 2 and 36]

Figure 1: Average number of domains 〈ND〉 (left) and average number of time steps 〈T 〉 (right) in

the end-state as a function of ω for M = 1, M = 2 and M = 36 (treatment 0)

5.1.2 Discussion

A trivial multiplex implies that if two agents are connected in one layer they are connected in

every layer and vice versa. If an interaction is successful in a singleplex, the active agent randomly

chooses a feature from the set of all features in which the two differ. However, when such an

interaction occurs in a trivial multiplex, the distribution over features that differ is not the same,

since the probability to pick any such feature is 1/M times the probability of randomly selecting

that feature out of the list of features for which the two differ in the cultural subspace associated

to that layer.

In addition, if for some pair (i, j) layer α is selected, it may even occur that $d^{\beta_\alpha}_{ij} = 0$, so that the

probability of selecting a feature for which the two differ is not just different, it is zero. Therefore,

if a pair of nodes has the same cultural overlap in the two cases, their propensity to interact is the


same, but if the pair has a layer with subcultural distance equal to zero (in the case of the trivial

multiplex), then there is a large probability that, even though the interaction would have been

successful, there can be no cultural influence. However, in a singleplex such an interaction would

always occur, unless the agents are culturally identical (i.e. dij = 0), in which case there is no need

for further interaction anyway. This feature introduces a kind of compartmentalization in the

generalized Axelrod model, which is not present in the ordinary Axelrod model on a singleplex.
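
The difference between the two feature-selection rules can be made explicit with a small Python sketch. It conditions on a successful interaction and assumes that the F features are split into M consecutive blocks of Z = F/M features, one cultural subspace per layer, and that the layer is drawn uniformly; the function names and the block layout are hypothetical choices for the illustration.

```python
from fractions import Fraction


def singleplex_feature_probs(culture_i, culture_j):
    """Uniform choice among all features on which the two agents differ."""
    differing = [f for f, (a, b) in enumerate(zip(culture_i, culture_j)) if a != b]
    return {f: Fraction(1, len(differing)) for f in differing}


def trivial_multiplex_feature_probs(culture_i, culture_j, n_layers):
    """Pick a layer uniformly (probability 1/M), then a differing feature in its subspace.

    Assumption: the F features are split into M consecutive blocks of Z = F / M features,
    one block (cultural subspace) per layer.
    Returns (probability per feature, probability that no feature can be copied).
    """
    n_features = len(culture_i)
    block = n_features // n_layers
    probs, no_influence = {}, Fraction(0)
    for layer in range(n_layers):
        features = range(layer * block, (layer + 1) * block)
        differing = [f for f in features if culture_i[f] != culture_j[f]]
        if not differing:
            no_influence += Fraction(1, n_layers)  # sub-distance is zero in this layer
            continue
        for f in differing:
            probs[f] = Fraction(1, n_layers) * Fraction(1, len(differing))
    return probs, no_influence
```

For cultures (0, 0, 0, 0) and (1, 1, 1, 0) with M = 2, the singleplex rule gives each differing feature probability 1/3, whereas the multiplex rule gives 1/4, 1/4 and 1/2. For (0, 0, 0, 0) and (1, 1, 0, 0) half of the successful interactions in the multiplex lead to no cultural influence at all, which is the compartmentalization effect described above.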

Globally, this means that in the trivial multiplex case it will typically happen that cultural

zones (the cultural zones are the same in both layers since the network is the same) form with

some pairs of agents (i, j) that have $d^{\beta_\alpha}_{ij} = 0$ for some α. Cultural convergence within such zones

will then take longer on account of these pairs, increasing the possibility that the cultural zone

merges with another cultural zone. The more culturally similar the zones become, the more likely

the existence of such pairs is. Because there is a larger probability that cultural zones merge, there

is more cultural convergence by arguments similar to those in [Axelrod, 1997] (see Subsection 2.1),

even without having different networks in different layers.

The more layers there are, the larger the probability of encountering a pair of nodes (i, j) in

layer α that has $d^{\beta_\alpha}_{ij} = 0$ while dij > 0, since Z is smaller. It seems, therefore, that compartmentalization

causes cultural convergence and is associated with longer running times, as was observed

in the simulation results.

5.2 Treatment 1: random culture and random, static networks

In this section, the simulation results with respect to the first treatment are discussed. To be

consistent with the result in 4.1.2, for the RG we set p = 1/q. It will be shown that multiplexity

typically leads to more cultural convergence, but this originates from multiple effects, some of

which counteract cultural convergence. The most important mechanism is that the cultural zones,

by not overlapping perfectly between different layers, interact indirectly to produce more cultural

convergence, while increasing the time to reach the end-state.

5.2.1 End-state results


In Figures 2 and 3, the end-state results are shown (i.e. the observables for each M). Note that,

according to both CSED and SED the phase transition is at ωc = 0.68, although for M = 1 SED

is large at ω = 0.71 as well, while for M = 36 the same holds with respect to ω = 0.65. There

seems to be a clear hierarchy at ωc, where the different scenarios are ordered almost perfectly

according to 〈ND〉 (i.e. if M > M ′, then 〈ND〉 < 〈N ′D〉); the only exception is the pair (9, 12).

Note that these are averages and since the standard error is in the range 0.01− 0.025 the ordering

result is not statistically significant for large M , especially since the large standard errors occur at

large M . Most likely, the differences between the largest values of M are small, so that the number

of runs K should be even bigger to establish a statistically significant difference. Nonetheless, it


is clear that multiplexity increases cultural convergence at ωc. This also seems to hold to some

extent at ω = 0.71. For ω = 0.65, however, the singleplex shows more cultural convergence than

the multiplex cases. Together, these observations imply that the phase transition is steeper for the

multiplex cases.
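
The caveat about statistical significance can be illustrated with a short Python sketch. It assumes that SED is the usual standard error of the mean over K runs (sample standard deviation divided by √K). The separation criterion of two combined standard errors is a rough rule of thumb introduced only for this example, not a test used in this study, and the function names are hypothetical.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Sample mean and standard error (sample standard deviation / sqrt(K)) over K runs."""
    k = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(k)
    return mean, se


def clearly_separated(values_a, values_b, n_se=2.0):
    """Rough check: do the two means differ by more than n_se combined standard errors?"""
    mean_a, se_a = mean_and_standard_error(values_a)
    mean_b, se_b = mean_and_standard_error(values_b)
    return abs(mean_a - mean_b) > n_se * math.sqrt(se_a ** 2 + se_b ** 2)
```

With standard errors between 0.01 and 0.025, differences in 〈ND〉 of only a few hundredths between neighbouring values of M would not pass such a check, which is why more runs would be needed to resolve the ordering for large M.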

Furthermore, it is clear that when M is larger, 〈T 〉 is larger as well; again, the cases are

almost perfectly ordered in 〈T 〉. 〈T 〉 seems to be largest at ωc for most M , but the singleplex case

is the only exception with ω = 0.71, which is consistent with the fact that SED was large for this

value as well. It is not surprising that T is largest at the critical threshold of a phase transition;

as the system could go either way, it needs a long time to ‘decide’. Furthermore, it is intuitive

that if M is larger, then 〈T 〉 is larger, since there will be many more interactions, mainly because

of compartmentalization. Also, in all cases 〈T 〉 is much larger on the right side of the critical

threshold and jumps to its peak at this threshold.

[Figure 2 plots: 〈ND〉 against ω (left panel) and SED against ω (right panel); legend entries M = 1, 2, 3, 4, 6, 9, 12, 18 and 36]

Figure 2: Average number of domains 〈ND〉 (left) and associated standard error SED (right) in

the end-state as a function of ω for multiple values of M (treatment 1)

[Figure 3 plots: 〈CSED〉 against ω (left panel) and 〈T 〉 against ω (right panel); legend entries M = 1, 2, 3, 4, 6, 9, 12, 18 and 36]

Figure 3: Average cluster size entropy 〈CSED〉 (left) and average number of time steps 〈T 〉

(right) in the end-state as a function of ω for multiple values of M (treatment 1)


Finally, the CSED values are also ordered according to M , where the singleplex has the largest

value. Recall from Subsubsection 4.2.1 that the CSE is largest when the distribution of cluster

sizes is as uniform as possible; the cluster sizes of the singleplex are more uniformly distributed

than those of the multiplex cases. Moreover, note that in general the range of ω for which there is

any difference between the various values of M is small, since the phase transitions are steep.

5.2.2 Dynamical results

The difference between the singleplex and multiplex is largest (in terms of 〈ND〉) at the critical

threshold. However, since for M = 2 not much cultural convergence has been achieved at this value,

the mechanisms just described manifest themselves better for ω = 0.71. Note that although differ-

ent runs typically give different results (especially around the phase transition), some qualitative

features are the same in all runs. In Figure 4, a plot is shown of a typical run when ω = 0.71.

The following observations can be made regarding the overlap between zones. ND1(ω) is

roughly equal to ND2(ω) after a long period of large VI between the two and they are close to

ND(ω). In addition, WVI(D1(ω), D2(ω)) is small at t = T , so that the cultural zones match well

in the end-state.

[Figure 4 plot: observables against t (x100); legend entries VI(D1;D2), WVI(D1(ω);D2(ω)), ND, ND1, ND2, ND(ω), ND1(ω) and ND2(ω)]

Figure 4: Variation of information between cultural regions VI(D1, D2), weighted variation of

information between cultural zones WVI(D1(ω), D2(ω)), number of cultural domains ND, cultural

regions (1) ND1 , cultural regions (2) ND2 , cultural components ND(ω), cultural zones (1) ND1(ω)

and cultural zones (2) ND2(ω) on a typical run for (M,ω) = (2, 0.71) as a function of t (treatment

1)


Moreover, it is observed that ND(ω) typically does not decrease that much over the run of the model,

so that there is little decrease in cultural components. ND(ω) fluctuates due to randomness;

however, fluctuations are only robust if they coincide with fluctuations in the number of cultural

zones since only then can the fluctuation be sustained by interaction.

Finally, ND, ND1 and ND2 are large for a long time, before they decrease. This is similar

to the results in [Axelrod, 1997]. In addition, VI(D1, D2) also becomes small at t = T (note

that the cultural regions and zones are the same at t = T by construction, but the variation of

information measures are different so they give different results). There does not seem to be much

non-monotonicity in the convergence of the number of cultural regions in the layers. More specifically, it

could have happened that one layer converged to its end-state (freezes), while the other would still

exhibit dynamics, which causes the former layer to become active again (unfreezes). In the graph,

this would translate into NDi(ω) = NDi for some time and then NDi(ω) ≠ NDi , but this is clearly

not the case.

5.2.3 Discussion

Below some mechanisms will be discussed that may underlie the observed behavior in the results

discussed above. First, for simplicity, consider the case with only two layers. At t = 0, it will

typically be the case that there are multiple cultural components and within each cultural com-

ponent there are multiple cultural zones, which are different in each layer. As was already noted

in Subsection 2.1, boundaries between cultural regions within a cultural zone tend to dissolve over

time, just like boundaries between cultural zones. The cultural zones are defined such that when

the number of cultural zones is equal to the number of cultural regions, the dynamics in that

specific layer stop; this can be considered the temporary end-state for that layer (only interactions

in the other layer may change this). This end-state roughly depends on two things: the number of

cultural zones at t = 0 (the number of cultural regions in the end-state will typically be at most

equal to this) and the extent to which the boundaries between cultural zones tend to dissolve over

time, as was noted in [Axelrod, 1997].

The boundaries between zones can dissolve in two ways, namely within a cultural component

and between cultural zones in different cultural components. In both cases, there are pairs of

agents that have a link but do not yet have enough cultural overlap (recall that within a cultural

component not all pairs (i, j) have dij < ω). Multiplexity will increase the likelihood of both

mechanisms in different ways.

First, it makes it easier for boundaries to dissolve within cultural components, since cultural

zones within a cultural component do not match perfectly between different layers, so two zones

are ‘linked’ indirectly in one layer if some of the corresponding agents are linked in another. More

specifically, consider two nodes i and j that have $a^1_{ij} = 1$, $a^2_{ij} = 0$, dij < ω, i, j ∈ D1(ω)[m] and

i ∈ D2(ω)[n], but j ∈ D2(ω)[n′] with n ≠ n′. Then they can interact and grow closer only in layer


1. However, suppose that as an indirect result of this interaction, another agent k will be brought

in the position of satisfying k ∈ D1(ω)[m], k ∈ D2(ω)[n], $a^2_{jk} = 1$ and djk < ω (or, equivalently,

replace i by j and n by n′), then the boundary between the cultural zones D2(ω)[n] and D2(ω)[n′]

has dissolved. Note that the existence of such an agent is plausible, since the cultural convergence

between i and j may cause k, which was already close to i to become close to j as well, or vice

versa. Typically, there are many candidate k’s in such a cultural zone (the only real requirement

being that it has a link to either i or j in layer 2). This overlap mechanism was illustrated in the

dynamical results above: the number of cultural zones in both layers will be close to the number

of cultural components at t = T since within components overlap between layers tends to result in

the formation of just one cultural zone per cultural component.

The second way will also be easier with multiplexity: since there are now two layers, the

probability that in at least one of these a boundary is dissolved between cultural components is

larger than in the single layer case. In an extreme case, if one layer freezes, the other may still

exhibit dynamics and this in turn can cause the first layer to unfreeze again. From the dynamical

results it follows that this decrease in cultural components is not dominant. This makes sense

since pairs of agents within a cultural component typically have larger cultural overlap initially,

compared to agents in different cultural components, so less interaction has to occur before they

are similar enough to cross the threshold.

Due to the compartmentalization of the generalized Axelrod model, the dynamics take

longer (as was explained in Subsection 5.1). Therefore, convergence is higher because of this as

well. Note, however, that this effect is small, since for M = 2 only 1/q^2 of all pairs have links in

both layers and not all of these pairs will have significant cultural overlap. Typically, these three

mechanisms will be associated with longer running times T and more cultural convergence. This

is clearly consistent with the results from the end-state which indicate that T is larger when the

number of layers is larger, that T distinguishes clearly between one layer and multiple layers and that there

is typically more cultural convergence when M increases.

There is also a force against convergence, namely the fact that if in the end-state two cultural

region clusterings are attained that are non-trivial (in the sense that there are multiple nontrivial

clusters), the cultural domain will roughly be the ‘intersection’ of the two (the probability of

two separate cultural regions with identical cultures is negligible), which will typically result in

more domains. This effect was observed in the end-state results since the singleplex had a higher

CSED than the multiplex with M = 2, so the latter corresponds to a system with fewer cluster

sizes. Especially at the critical point of the phase transition with respect to ω this effect will be

pronounced since there will most likely be cultural regions of many different sizes (due to some

degree of scale invariance), so if they intersect the domain sizes will be smaller. However, if there

will mostly be one cultural zone per cultural component in both layers, this effect is small. The

effect will be largest when not much dynamics has taken place, but enough so that NDi(ω) < 1


for i = 1, 2; most likely this will occur at the threshold that is one unit smaller than the critical

threshold. The results from the end-state on the phase transition confirm that the behavior is

dependent on ω in the same way as described here.

For more than two layers, all effects simply become stronger. For example, there will be much

more overlap within each cultural component if one considers all the pairwise overlaps between the

cultural zones in the respective layers. This was also observed in the end-state results since most

effects were monotone for increasing M .

5.3 Treatment 2: network-culture assortativity

Note that the only difference between this treatment and the previous one is the state at t = 0.

Surely, this has an effect on the end-result, but not on the dynamical mechanisms described in

the previous subsection. Therefore, no dynamic results were analyzed for this case. The most

important finding here is that assortativity largely promotes cultural convergence, but introduces

some non-monotonicity in the effects of M .

5.3.1 End-state results


The end-state results of the second treatment are exhibited in Figures 5 and 6. Note first that the

case M = 36 is trivial now; this is because $p^{\alpha}_{ij}$ is either one or zero, since the cultural subspace is

one-dimensional. If a link forms, no further dynamics can take place since the agents are already

the same in that subspace; if no link forms, there will not be dynamics either. Therefore, there

can be no dynamics and therefore no cultural convergence.

As in treatment 1, 〈ND〉 is smaller for all M , but the ordering becomes less clear for large

M and the differences between consecutive M ’s become smaller as M increases in agreement with

prediction (the first observation most likely implies that the differences are very small). There is

now also nontrivial behaviour for ω = 0.62: the phase transition is less steep for most M , compared

to the first treatment.

Furthermore, SED is similar in size (there is not more uncertainty in the estimate 〈ND〉) and

the ordering in terms of M is the same, but it indicates that the phase transition shifts to

the left for M > 2. This is in contrast to CSED which indicates that the phase transition shifts

to the left for M > 4 and shifts back for M = 18. However, the intermediate cases do not clearly

distinguish between the two values of ω. These findings show that a lower threshold is needed for

cultural convergence, especially for intermediate M .

Note also that the CSED shows similar ordering as before, but has become larger for most

M . This indicates that fewer small domains are obtained in the end-state. Finally, 〈T 〉 is much

smaller for all M .


[Figure 5 plots (treatment 2): 〈ND〉 against ω (left panel) and SED against ω (right panel); legend entries M = 1, 2, 3, 4, 6, 9, 12, 18 and 36]



[Figure 6 plots (treatment 2): 〈CSED〉 against ω (left panel) and 〈T 〉 against ω (right panel); legend entries M = 1, 2, 3, 4, 6, 9, 12, 18 and 36]



5.3.2 Discussion

The effect of assortativity is that agents that are linked on the social network are likely to be

similar already, so the system starts in a state where there are fewer cultural zones per cultural

component (increasing the number of useful links each agent has). Furthermore, agents that have

links in multiple layers will typically also have dij < ω, so that the effect of compartmentalization

becomes stronger. Because of these two effects, more cultural convergence will occur, compared to

treatment 1. The result that 〈T 〉 is smaller for all M is intuitive in the sense that fewer interactions in

the layers have to take place before cultural zones overlap within cultural components, so less time

is needed for the system to reach its end-state. However, compartmentalization would increase 〈T 〉;

apparently, the first mechanism is stronger in terms of the effect on 〈T 〉. Moreover, the fact that

the CSED has become larger is also due to the fact that there are fewer cultural zones per cultural

component initially.


However, these effects become smaller when M becomes larger, since $p^{\alpha}_{ij}$ is dependent on a smaller

subset of features for increasing M , so the correlation between cultural and network links will

be smaller. This may then be partly compensated by the fact that the networks show more

clustering as M increases (see Subsection 3.1.1), which could lead to more cultural convergence,

as was discussed in Subsubsection 2.2.1. This is consistent with the end-state results that for

intermediate M the convergence is largest with respect to treatment 1, since such M are optimal

in the sense of this trade-off.

Finally, an interesting observation is that if there is no BC (i.e. ω = 1) and q is such that

there is one giant component on the network, there is only one cultural zone at the beginning

and cultural convergence is almost surely guaranteed. These conditions are satisfied for all M

except M = 36 (which is trivial anyway), so for ω = 1 the singleplex and multiplex are roughly

the same in terms of cultural convergence since both have one cultural zone from the start. Only

if the threshold decreases will differences appear. Therefore, BC is a necessary counterbalance to

assortativity: assortativity would always lead to cultural convergence if there were no BC.

5.4 Treatment 3: updating networks

In this subsection, the previous treatment is extended by letting each agent update its set of

neighbors after a successful interaction in the associated layer. In a sense, this is just a dynamical

extension of the initial assortativity. Mainly, it is found that the updating mechanism reduces

cultural convergence by decreasing opportunities for interaction with culturally more distant agents

at later times, so that the system settles down faster. In addition, there is some indication of

increased clustering in the layer networks (i.e. more like social networks) as well as correlation

between them over time, although this only occurs for ω such that the system converges partially,

while the layer networks converge to fully connected networks if full cultural convergence is present.

5.4.1 End-state results


The end-state results are shown in Figures 7 and 8. Note first that for M = 18 the system does

not fully converge for any value of ω, like the case M = 36, but shows some convergence, especially

for larger ω (not shown in the figure). In contrast, it has the largest value of 〈T 〉, indicating that

it needs many time-steps to reach its end-state. This will be explained in more detail below.

In addition, there is less convergence for all values of the threshold compared to treatment

2, which is especially pronounced for ω = 0.68. The ordering in 〈ND〉 is also less clear and

for small values of M the differences between 〈ND〉 at ω = 0.68 are biggest. Whereas first the

phase transition was steepest for large M , now the phase transition is steepest for the singleplex.

This implies that for smaller ω the network dynamics constrain cultural convergence, but when ω

becomes larger the number of cultural zones is already so small that the network dynamics do not

hinder cultural convergence. This is also consistent with the results for 〈T 〉, where the ordering is

33

very different. Before ω = 0.71 the ordering is standard, but starting at this confidence threshold,

the ordering is reversed and higher M roughly implies lower 〈T 〉. This means that for ω ≥ 0.71

network updating may facilitate the Axelrod dynamics on the multiplex. Also note that in general

〈T 〉 is somewhat smaller in the case of updating networks.

Finally, according to SED, ωc = 0.71 for the singleplex and shifts to the left more rapidly.

The values of the SED also vary more in size for different M . However, according to CSED, the

critical confidence ωc is 0.68 for all nontrivial M , except M = 1, for which it is 0.71. Note that for

ω = 0.71, the difference in CSED between ω = 0.68 and ω = 0.71 is very small. In a sense there

is no clear ωc since the phase transition is too steep. In addition, the order seems partly reversed,

with the singleplex having the lowest CSED, while M = 12 has the highest value. It seems that

the network dynamics increase the correlation between cultural zones within cultural components

even more than in the non-updating case, at least for moderate M .

[Figure 7 plots (treatment 3): 〈ND〉 against ω (left panel) and SED against ω (right panel); legend entries M = 1, 2, 3, 4, 6, 9, 12, 18 and 36]



[Figure 8 plots (treatment 3): 〈CSED〉 against ω (left panel) and 〈T 〉 against ω (right panel); legend entries M = 1, 2, 3, 4, 6, 9, 12, 18 and 36]



5.4.2 Dynamical results


Figures 9 and 10 show the dynamical observables (both cultural and network) for a typical run

with M = 2 for both ω = 0.68 and ω = 0.71. The kurtosis of the degree distribution, κ, is divided

by 10 to keep it in the unit interval for most t. Note that from the end-state results, it can be seen

that for M = 2 the difference in end-state is large between ω = 0.68 and ω = 0.71 (although not

as large as in the singleplex case).

[Figure 9 plots: observables against t (x100) in two panels; surviving legend entries VI(D1;D2) and ND1]

Figure 9: Variation of information between cultural regions VI(D1, D2), weighted variation of

information between cultural zones WVI(D1(ω), D2(ω)), number of cultural domains ND, cultural

regions (1) ND1, cultural regions (2) ND2, cultural components ND(ω), cultural zones (1) ND1(ω)

and cultural zones (2) ND2(ω) on a typical run for (M,ω) = (2, 0.68) (left) and (M,ω) = (2, 0.71)

(right) as a function of t (treatment 3)

Figure 10: Network correlation ρ, link density (1) L1, link density (2) L2, clustering coefficient (1) C1, clustering coefficient (2) C2, kurtosis of the degree distribution (1) κ1 and kurtosis of the degree distribution (2) κ2 on a typical run for (M,ω) = (2, 0.68) (left) and (M,ω) = (2, 0.71) (right) as a function of t (treatment 3)

For both ω = 0.68 and ω = 0.71 it is still true that the overlap mechanism works (since a large WVI is followed by a decrease in Di(ω) for i = 1, 2). Clearly, there is a difference between the two values of ω, since one corresponds to almost full convergence, while the other is the critical value. Compared to the dynamical results in treatment 1, there is no fluctuation at all in the Di and D until they decline rapidly at the end. Another difference is that for i = 1, 2, Di(ω) first increases somewhat after t = 0 before it starts to decrease. Also note that at t = 0, the values of the network properties are similar to those of an RG with p = 1/q, as expected.

The network evolution shows a big qualitative difference between ω = 0.68 and ω = 0.71.

When ω = 0.68 it seems that the network becomes correlated (not only ρ indicates this but also

the remaining measures since they are roughly equal) and the link density stays small, while for

ω = 0.71 both networks become almost fully connected at the end, so it makes sense that ρ is large

(notice that ρ only increases at the end).

When ω = 0.71, Li ≈ Ci for i = 1, 2, so that the network closely resembles an RG for any time t. Furthermore, note that κi becomes very large at t = T for both i, since the variation in the degree distribution diminishes at that point. Finally, the fact that for ω = 0.68 the two layers have clustering coefficients that are much larger than the corresponding link densities indicates that the graph is far from an RG and more like an actual social network.
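The comparison between link density and clustering coefficient used above can be illustrated with a minimal sketch. It assumes an Erdős–Rényi graph with N = 100 nodes and p = 1/q with q = 6 (inferred from 1/q ≈ 0.17), the average local clustering coefficient, and the kurtosis taken as the fourth standardized moment of the degree distribution; the exact definitions used in the thesis may differ, and all function names are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def random_graph(n, p, rng):
    """Erdős–Rényi graph G(n, p) stored as a dict of neighbour sets."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def link_density(adj):
    """Fraction of the n(n-1)/2 possible links that are present."""
    n = len(adj)
    links = sum(len(nb) for nb in adj.values()) / 2
    return links / (n * (n - 1) / 2)


def clustering_coefficient(adj):
    """Average local clustering coefficient (nodes with degree < 2 count as 0)."""
    local = []
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            local.append(0.0)
            continue
        closed = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        local.append(closed / (k * (k - 1) / 2))
    return mean(local)


def degree_kurtosis(adj):
    """Fourth standardized moment of the degree distribution."""
    degrees = [len(nb) for nb in adj.values()]
    mu = mean(degrees)
    var = mean([(k - mu) ** 2 for k in degrees])
    if var == 0:
        return float("inf")  # degenerate case, e.g. a fully connected graph
    return mean([(k - mu) ** 4 for k in degrees]) / var ** 2


rng = random.Random(0)
q = 6  # assumption: number of traits per feature, so that p = 1/q ≈ 0.17
g = random_graph(100, 1 / q, rng)
L, C, kappa = link_density(g), clustering_coefficient(g), degree_kurtosis(g)
print(L, C, kappa)
```

For a random graph the two quantities L and C are expected to be close, whereas a clustered network gives C well above L. The degenerate branch in `degree_kurtosis` mirrors the observation that κi grows without bound once the variation in the degree distribution vanishes.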

5.4.3 Discussion

To understand the effect of network dynamics, assume first that M = 1. When network assorta-

tivity at t = 0 was introduced, it was observed that more cultural convergence took place for every

ω. This happens because there are fewer cultural zones at t = 0 and therefore it is likely that

there are fewer cultural regions in the end. With the updating rules nothing changes at t = 0 and

one may be inclined to think that the updating mechanism would reduce the number of cultural

zones as time progresses, so that there will be more cultural convergence. However, the updating

mechanism does not just add links in desirable places; it also removes links between nodes that

could have interacted later on, reducing the potential to dissolve boundaries at a later time. In

some cases, then, the updating mechanism reduces T, thereby promoting cultural diversity.

In terms of cultural zones, the updating mechanism adds two ways of dissolving boundaries,

namely by creating links between zones within different cultural components and by doing the

same within cultural components. The first way is not robust, since no further interaction will

likely take place because of the small cultural overlap, while the second is robust. However, there

is the possibility that such boundaries would have dissolved anyway, over time. In addition, the

updating mechanism diminishes both conventional ways of dissolving boundaries by letting the

system settle down faster. Indeed, the results shown earlier indicate that for M = 1 there is less

cultural convergence for all ω, compared to the previous treatment.


Now, when there is multiplexity, the same still holds; for M = 2 it was observed in the dynamical

results that the number of cultural zones first increases before it starts to decrease. Indeed, this

supports the notion that links between agents that are only mildly similar are quickly severed, so

that cultural zones shrink. However, the overall effect presumably is somewhat smaller, since the

overlap within cultural components is relatively robust to the mechanism outlined in the previous

paragraph. For example, if a pair of agents (i, j) only has a link in one layer, hence can interact

and grow closer only in that layer, then the existence of a third node that may indirectly link i

and j is only facilitated by the updating mechanism; moreover, a link may appear between the two

nodes by chance. This is also apparent in the end-state results, which showed that the decrease in

cultural convergence at ω = 0.68 is less pronounced for most of the multiplex cases.

As long as there is sufficient correlation between the subcultural distance and the overall

cultural distance and there are multiple layers, the negative effect of the updating mechanism is

somewhat counterbalanced by the positive overlap effect. If there are too many layers, however, the

network dynamics may counteract the ordinary dynamics, since there is little correlation between

the subcultural distance and the overall cultural distance; the probability to interact will almost

be independent of the probability to form a link, so little convergence should be expected. This is

exactly what is seen in the case M = 18.

In terms of the layer networks, it was observed that the networks became correlated at an ω

for which the system did not converge fully. This makes sense, since in this case ND(ω) is large at

t = 0. Therefore, the layer networks will have some time to adjust to this clustered structure, so

that as the cultural zones start to become similar in both layers, the network will become similar

too. This does not happen when there are only a few cultural components at t = 0. It also

explains the large and almost equal Ci for both layers since the links will be present mainly within

the cultural components, which means that links between agents cluster within these components.

Both explanations rely heavily on the presence of a large correlation between subcultural distance and overall cultural distance, so they only hold for small M.

5.5 Treatment 4: structured culture

This treatment is based on the structured initial culture, explained in Subsubsection 4.1.3. It is

different from the others not only because the initial culture is such an important determinant for

the end-state behavior, but also because this treatment aims to induce competition in the multiplex

Axelrod dynamics. In the previous subsections overlap between layers was used as a mechanism

within cultural components that increased cultural convergence, by speeding up the dissolution of

boundaries between cultural zones. When mixing is present, there can be competition between

layers since in one layer an agent may be close to some other agent, while in the other layer they

are culturally dissimilar. It then becomes interesting to see whether the two agents will become

culturally identical and what consequences such a dynamic will have for the other agents in the


cluster; in randomly generated cultures such large differences are extremely unlikely.

In this treatment only the cases M = 1 and M = 2 are studied. The emphasis will be on

the value of the mixing coefficient µ defined in 4.1.3, since this parameter indicates the effects of

multiplexity given a fixed M : when µ = 0 the effect of multiplexity is small since the subcultural

distance between agents is similar across layers, while the opposite is the case for many pairs of

agents if µ = 1. First, it is found that a third phase is created which becomes less stable when

µ increases and the second phase transition shifts to the left, while becoming steeper. Second,

the effect of µ on cultural convergence depends heavily on the confidence threshold. Third, the

dynamics sometimes show non-equilibrium behavior at the second phase transition, letting one

layer reach its end-state, while the other layer remains active, causing the first layer to become active again, which changes the end-state.


In Figures 11 and 12 the end-state results are shown. It should be noted that in this subsection

we use both 〈SmaxD 〉 and 〈ND〉 as observables since SmaxD distinguishes better between the phases

in this case. (For the other treatments, there was no real difference between the two.)

Starting with M = 1, there are three phases depending on the value of ω. The first is full

cultural divergence and the last is full cultural convergence, which are the same as before. In

between these two is a new phase, however, which consists of two cultural domains of size 50

(〈ND〉 = 2 and 〈SmaxD 〉 = 50). This makes sense since by careful choice of parameters in 4.1.3 there

exist ω’s such that the two shells are generated by the prototypes with a large cultural distance

between them so that two distinct cultural components form for some values of ω at t = 0. Note

that the phase transition between the second and third phase is not so steep, which agrees with

earlier results on realistic initial cultures (see Subsubsection 2.2.2).

These observations are similar for the case M = 2 and µ = 0. Since the mixing coefficient

µ is such that the subshells overlap perfectly across the two layers, the global structure of the

subcultures is the same in both layers. It still holds that within the two shells the cultural sub-

vectors and network connections are different across layers, so that the original effects of having

multiplexity, as discussed in the other treatments, are present within each shell. However, due to

the fact that the cultural vectors are much closer to each other at t = 0 (1/q + (1− 1/q)b2 = 0.67

instead of 1/q = 0.17), there is much less cultural diversity compared to the random culture, so

that dynamics will be less interesting within the shells. This also agrees with the observation that

〈T 〉 is smaller than for similar M in treatment 3 for all ω.

This starts to change when µ increases from 0 to 1. In the figures, results are shown for

µ = 0, 0.2, 0.4, 0.6, 0.8 and 1.0 (which correspond to 50, 45, 40, 35, 30 and 25 agents for each non-mixed shell and 0, 5, 10, 15, 20 and 25 for each mixed shell, respectively). Increasing µ has an effect that is highly dependent on ω; for ω < 0.40 there is more


convergence in the non-mixed case, while for ω > 0.40 the opposite is true. This is a consequence of

the fact that the second phase shifts to the left relative to the µ = 0 case, becomes less pronounced

(i.e. it exists for fewer ω) and changes quantitatively since the new phase now consists of more

clusters of smaller size. As a result the phase transition from the second to the third phase

becomes somewhat steeper as µ increases. In summary, it depends highly on the confidence threshold of the system whether multiplexity increases cultural convergence; for low thresholds multiplexity reduces cultural convergence, while for high thresholds it increases cultural convergence.
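The shell sizes quoted for each value of µ can be reproduced with a short sketch. It assumes N = 100 agents, two non-mixed and two mixed shells, and a linear dependence of the shell sizes on µ, which is inferred from the listed values rather than taken from the algorithm in 4.1.3; the function name is hypothetical.

```python
def shell_sizes(mu, n_agents=100):
    """Return (agents per non-mixed shell, agents per mixed shell) for mixing coefficient mu."""
    half = n_agents // 2          # size of a non-mixed shell when mu = 0
    mixed = round(mu * half / 2)  # mu = 1 spreads the agents evenly over four shells
    return half - mixed, mixed


for mu in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(mu, shell_sizes(mu))
```

Under this assumption the total number of agents, 2 × (non-mixed size) + 2 × (mixed size), equals 100 for every µ.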

Figure 11: Average cluster size entropy 〈CSED〉, average number of domains 〈ND〉, average size of the largest domain 〈SmaxD〉 and associated standard deviation SEmaxD for M = 1 (left) and (M,µ) = (2, 1) (right) in the end-state as a function of ω (treatment 4)

Figure 12: Average size of the largest domain 〈SmaxD〉 (left) and average number of time steps 〈T〉 (right) in the end-state as a function of ω with M = 2 for µ = 0, 0.2, 0.4, 0.6, 0.8 and 1 (treatment 4)

When M = 2 and µ = 1, the new phase consists of four clusters of size 25. This makes sense,

since each cluster corresponds to one of four shells of size 25, each generated by one of the four

prototypes. The cultural distance between mixed shells and non-mixed shells is half the distance

of two non-mixed or mixed shells on average, so the clusters are culturally dissimilar enough for


fewer values of ω. In addition, the fact that in each layer the subshells are grouped in two pairs of

two in terms of subprototypes means that there is large connectivity in both layers between each

mixed and each non-mixed shell (although the combination of mixed and non-mixed is different

in each layer), so that if just one pair of agents between them is culturally similar enough, it is

likely that they have a link in one of the layers and can thus interact.

Comparing M = 1 with M = 2 and µ = 1 it is noteworthy that even though the values of

SEmaxD are similar during the second phase transition, SEmaxD > 0 during much of the new phase,

indicating that this phase is unstable. This can also be seen from 〈T 〉; looking at the graph it is

observed that when µ increases, a new peak forms in the middle of the second phase transition

that becomes the global maximum. Furthermore, the fact that 〈CSED〉 > 0 during the new phase

means that there is no ω such that in (almost) every run the end-state distribution is four clusters

of size 25, which lends further support for the instability of this phase.

Finally, note that 〈CSED〉 is positive at the second phase transition for M = 2 and µ = 1,

compared to a zero value for the singleplex. This makes sense, since in the latter case the end-

state will consist of either two clusters of 50 or one cluster of 100, both of which have a CSE of 0.

Importantly, such a situation does not seem to occur when mixing is present.


In Figures 13 and 14 the dynamical results are shown. The confidence threshold ω = 0.40 has no

particular significance for the dynamical behavior, besides the fact that it is part of the second

phase transition. The reason for choosing it is that by chance this run has produced one of the

best examples of non-monotonicity.

This non-monotonicity corresponds to the most important observation: D1 first seems to converge to a state with two clusters of size 50, since it remains at this level for quite some time (i.e. the layer freezes). However, at some point the layer unfreezes and goes back to an almost fully culturally divergent state, before it starts to converge again, this time ending up with only one cluster of size 100. (Note that the left plot in Figure 14 shows the size of the largest cluster as complementary information to the standard cultural observables in Figure 13.) The cultural regions in layer 2, D2, do not show such behavior: ND2 remains constant for some time in the beginning, after which it starts to decrease, although there is some non-monotonicity before it settles in the culturally convergent state. VI(D1, D2) shows that there is an almost constant mismatch between D1 and D2.
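To make the mismatch measure concrete, the following sketch computes the variation of information between two partitions of the same set of agents, using the standard definition VI(A, B) = H(A) + H(B) − 2I(A; B). The normalization by ln N, which keeps the value in the unit interval, is an assumption here; the thesis may normalize differently, and the function name is hypothetical.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b, normalized=True):
    """VI(A, B) = H(A|B) + H(B|A) for two partitions of the same agents."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same agents")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        # -p_ab * [ln(p_ab / p_a) + ln(p_ab / p_b)], written with raw counts
        vi -= (n_ab / n) * (log(n_ab / count_a[a]) + log(n_ab / count_b[b]))
    if normalized and n > 1:
        vi /= log(n)  # VI is bounded above by ln n
    return vi


# Layer 1 frozen in two regions of size 50, layer 2 fully converged.
layer1 = [0] * 50 + [1] * 50
layer2 = [0] * 100
print(variation_of_information(layer1, layer2))
```

Identical partitions give zero, so a persistently positive value indicates that the cultural regions of the two layers do not coincide.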

Figure 13: Variation of information between cultural regions VI(D1, D2), weighted variation of information between cultural zones WVI(D1(ω), D2(ω)), cultural regions (1) ND1, cultural regions (2) ND2, cultural components ND(ω), cultural zones (1) ND1(ω) and cultural zones (2) ND2(ω) on a typical run for (M,µ, ω) = (2, 1, 0.40) as a function of t (treatment 4)

Figure 14: Size of the largest subcultural component (1) SmaxD1(ω), subcultural component (2) SmaxD2(ω), cultural domain SmaxD, cultural region (1) SmaxD1 and cultural region (2) SmaxD2 (left) and network correlation ρ, link density (1) L1, link density (2) L2, clustering coefficient (1) C1, clustering coefficient (2) C2, kurtosis of the degree distribution (1) κ1 and kurtosis of the degree distribution (2) κ2 (right) on a typical run for (M,µ, ω) = (2, 1, 0.40) as a function of t (treatment 4)

In addition, ND1(ω), ND2(ω) and ND(ω) are either 1 or 2, except close to t = 0. In the first half

of the run it is often the case that ND1(ω) = 1 and ND2(ω) = 2 or vice versa, judging from the

constant surges in WVI(D1(ω), D2(ω)). Furthermore, in the left part of Figure 14 the sizes of the

largest subcultural components underpin the asymmetry mentioned in the previous paragraph.

Both start out with two clusters of size 50, but in the beginning SmaxD2(ω) jumps to 1, while this only

happens much later for SmaxD1(ω). Interestingly, this starts happening around the same time as layer

1 unfreezes; the fact that the vertical red line is thick means that SmaxD1(ω) jumps quickly between

0.5 and 1, foretelling a complete turnaround in the dynamics.

The network dynamics are somewhat less interesting. First off, the network becomes fully

connected at the end of the model, as was seen before in the case of full cultural convergence.

Similarly, in the second half of the run the properties indicate that the networks resemble RGs at every time t, that there is zero correlation until right before the end, and that κi behaves regularly for both i.

In the first half of the run, things are different, especially for the first layer. ρ becomes significantly larger than 0.5 and L1 ≠ C1; this agrees with the observation that during that time layer 1 mainly consists of 2 cultural regions, which should have C > L. Finally, the layer networks are initially similar, but further away from an RG, and have L and C much larger than in the case of a random culture. These features at t = 0 can readily be explained by the structured culture algorithm. Within the subshells the average overlap between subvectors is 0.67, while the average overlap between subvectors from different shells is O = 1/q = 0.17, so that the average subcultural overlap is (0.67 + 0.17)/2 = 0.42, which roughly agrees with L at t = 0. It also makes sense that C is higher than L, since the algorithm clearly clusters the agents in two subshells.
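The estimate of L at t = 0 can be checked with a small sketch that weights the two quoted values (0.67 within a subshell, 1/q ≈ 0.17 between subshells) by the actual fraction of agent pairs of each type, instead of averaging them equally. It assumes two subshells of 50 agents per layer, q = 6 (inferred from 1/q ≈ 0.17), and that the link probability equals the expected value for the pair; the function names are hypothetical.

```python
from math import comb


def same_subshell_fraction(sizes):
    """Fraction of agent pairs whose members lie in the same subshell."""
    n = sum(sizes)
    return sum(comb(s, 2) for s in sizes) / comb(n, 2)


def expected_initial_density(within, between, sizes):
    """Pair-weighted average of the within- and between-subshell values."""
    f = same_subshell_fraction(sizes)
    return f * within + (1 - f) * between


q = 6            # assumption: inferred from 1/q ≈ 0.17
between = 1 / q  # value quoted for subvectors from different subshells
within = 0.67    # value quoted for subvectors within the same subshell
density = expected_initial_density(within, between, sizes=(50, 50))
print(round(density, 3))
```

Slightly fewer than half of all pairs lie within the same subshell, so the weighted estimate is marginally below the simple average but still rounds to 0.42.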

In terms of network correlation the fully mixed case is furthest from the trivial multiplex discussed

in Subsection 5.1. In the latter case there is perfect correlation between the networks, while in the

former there is no correlation in total (which is similar to the other treatments) and there are huge

differences in correlation across pairs of nodes. For example, for all pairs of nodes within non-mixed

shells there is a relatively large correlation, while there is anti-correlation in the case of mixed

shells. In addition, for links between shells of similar characteristic (i.e. mixed or non-mixed),

there is large correlation, while the opposite is true for links between mixed and non-mixed shells.

Therefore, the fully mixed case can be viewed as having the largest amount of multiplexity of all

cases in all treatments.

5.5.3 Discussion

Starting with M = 1, it was observed in the end-results that there is a new phase and therefore

also a new phase transition. The first phase transition will be when ω becomes large enough so

that some of the agents around each prototype start to interact. In the second phase, the dynamics

will in effect be two separate Axelrod models on separate shells that both converge culturally. The


second phase transition will occur when ω becomes so large that some cultural vectors in the two

shells start to interact and the two Axelrod models effectively couple.

The broadness of the second phase transition has to do with the fact that both prototypes

are in some sense attractors for their respective shells, so that there is a tendency for agents to

converge in a particular direction. Concretely, this means that even if there is some interaction

opportunity between agents from different shells (i.e. there is only one cultural zone), there is a

lower probability that this interaction opportunity will be useful, since one of the agents is likely

to have moved in the opposite direction before any interaction can take place. In turn this means

that many such opportunities have to exist (i.e. ω has to be high) before the two will converge with

high probability; for a range of intermediate ω’s both scenarios could occur with the probability

for both changing in favor of full convergence as ω increases, as demonstrated by an increasing

〈SmaxD 〉.

Note that there will always be some links between the two shells at t = 0, although for

small ω the probability that both dij < ω and aij = 1 is small. The updating mechanism of the

network may help linking the closest agents within each shell at any time t, so that this supports

convergence of the two shells. In addition, by assortativity there is a correlation between having

a link and being culturally similar enough to interact (i.e. there typically is one cultural zone

per cultural region for most ω). This effectively ensures that as soon as agents become culturally

similar enough they can start interacting, diminishing the importance of the network as in the

previous treatment.

For M = 2 and µ = 1, the situation is more complicated as was observed in the simulations.

There are now four shells, consisting of the various combinations of subshells; the first phase

transition is the same as before, except that the 100 agents are now distributed equally among the

four shells, so that, during this transition, they start to converge to four cultural regions of size 25.

The second phase transition then starts when interaction between the subshells begins to

occur. Recall that at t = 0 the cultural distance between Gij and Gkl is larger (twice as large on average) if i ≠ k and j ≠ l compared to the other cases. Clearly, if agents can interact over this larger cultural distance, full cultural convergence will surely occur. Therefore, this will not happen at the second phase transition. This leaves four pairs of shells that could start to interact in one of the layers, namely (G11, G12) and (G22, G21) in layer 1 and (G22, G12) and (G11, G21) in layer 2. (Although their overall cultural similarity is the same across layers, their connectivity is not, since e.g. G^1_{11} and G^1_{12} together comprise G^1_1, so the corresponding cultural subvectors are very similar.)

Any of these pairs is a priori equally likely (i.e. before the culture is generated) and although

it might be the case that such a pair starts interacting in the other layer, this will happen only

with small probability. Suppose that such a pair begins to interact at t = 0, then after some time

the cultural vectors will have become more similar in that layer so the probability of interaction

becomes larger in the other layer (there are always some cultural subvectors with positive overlap


and therefore a relatively large probability of having a link), so that in the end the two shells will

become culturally identical.

The argument in the previous paragraph concerned two shells in isolation. Typically, if only

one of the pairs starts interacting at t = 0, then the end-state will consist of two cultural domains

of size 25 and one of size 50. If two pairs in the same layer start to interact, the system most likely

will end up with two cultural domains of size 50. By similar arguments, if two pairs in different

layers start to interact, the result will be one domain of size 75 and one of size 25. If more than

two pairs start interacting, full convergence (i.e. one domain of size 100) is almost assured. Note

that two of these four combinations imply positive CSED so this explains some of the observations

in the end-state results. (Also, when running the model a few times for several values of ω in the

second phase transition, it was seen that the system always ended up in one of these four states,

in addition to the unstable phase with four domains of size 25.) The second phase transition,

therefore, constitutes a prolonged regime (in terms of ω) where the system can end up in a wide

variety of states. In a sense, this is symmetry breaking.
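The case distinction above can be enumerated mechanically. The sketch below treats the four shells as nodes, merges shells joined by an interacting pair, and evaluates a cluster size entropy for each resulting end-state. It assumes that interacting shells always merge completely, and that the entropy is −Σ W_s ln W_s with W_s the fraction of clusters of size s (the form used by Gandica et al.); the thesis may use a differently normalized CSE, and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log

SHELLS = ("G11", "G12", "G21", "G22")
SHELL_SIZE = 25
# Pairs of shells that can start to interact at the second phase transition.
PAIRS = {
    ("G11", "G12"): 1, ("G22", "G21"): 1,  # layer 1
    ("G22", "G12"): 2, ("G11", "G21"): 2,  # layer 2
}


def domain_sizes(active_pairs):
    """Merge shells joined by an interacting pair and return the domain sizes."""
    parent = {s: s for s in SHELLS}

    def find(s):
        while parent[s] != s:
            s = parent[s]
        return s

    for a, b in active_pairs:
        parent[find(a)] = find(b)
    counts = Counter(find(s) for s in SHELLS)
    return tuple(sorted((c * SHELL_SIZE for c in counts.values()), reverse=True))


def cluster_size_entropy(sizes):
    """Sum of W_s ln(1 / W_s), with W_s the fraction of clusters that have size s."""
    n_clusters = len(sizes)
    weights = [c / n_clusters for c in Counter(sizes).values()]
    return sum(w * log(1 / w) for w in weights)


end_states = {}
for k in range(len(PAIRS) + 1):
    for active in combinations(PAIRS, k):
        sizes = domain_sizes(active)
        end_states[sizes] = cluster_size_entropy(sizes)

for sizes, cse in sorted(end_states.items()):
    print(sizes, round(cse, 3))
```

Under these assumptions only the states (50, 25, 25) and (75, 25) have positive entropy, in line with the remark that two of the four combinations imply positive CSED, while four domains of size 25, two of size 50 and one of size 100 all give zero.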

When three pairs of shells interact, interesting behavior can occur. Note that in such a

situation two pairs in one layer (layer 1, say) interact and one pair in the other (layer 2). Typically,

as a result of the dynamics in layer 1 the two subshells in layer 2 will also grow closer together

over time, since the shells that correspond to the same subshell in one layer correspond to two

different subshells in the other. If at t = 0 one of the pairs in layer 2 starts to interact, this means

that after some time layer 2 will move towards one cultural region of size 100, since there will

be interaction between all four shells in that layer (although the timing is different). However,

before this happens layer 1 may have already reached a frozen state with two cultural regions of

size 50. In addition, this convergence typically also implies that the two subshells from which the

cultural regions originate grow further apart, as measured by dmin (see Subsubsection 4.1.3), which

hampers the interaction between the original pair that started interacting in layer 2. Essentially,

this constitutes competition between the layers. In the end, layer 2 will reach full convergence,

which will cause layer 1 to unfreeze and converge fully as well.

This is exactly the dynamical behavior that was observed in the previous subsubsection. The first layer is seen to converge to the state with two cultural regions of size 50, while increasing cultural similarity between the shells, which leads to a single subcultural component in layer 2. As the second layer moves towards a state with two cultural regions of size 25 and one of size 50, severe competition results in a single subcultural component in layer 1, diminished spikes in WVI(D1(ω), D2(ω)) (spikes mostly indicate in this case that ND1(ω) = 2, which corresponds to freezing since ND1 = 2, while ND2(ω) = 1) and finally in the unfreezing of layer 1.

Note that if all four pairs started to interact in the beginning, there would be no asymmetry

between the layers and presumably no severe non-monotonicity since one layer would not have the

time to converge to a partially convergent state before it would unfreeze again.


In terms of the behavior observed in the dynamical results, the Axelrod dynamics can no longer really be seen as a relaxation process. Instead, the process first moves towards equilibrium, after which it moves away again. In this case the system settles down in the end, but in more general cases with many layers and various degrees of mixing between all layers, the system may never really settle down, which also agrees with observations in the real world. In summary, this shows that multiplexity induces nonequilibrium behavior.

6 Conclusion

In this thesis a model of cultural dynamics was studied that incorporates many realistic features. It

was investigated what the effect of social multiplexity is on cultural dynamics, using a generalized

version of the Axelrod model, in addition to the effect of cultural evolution on social networks.

The effect of multiplexity differed somewhat for the different treatments, but in general multi-

plexity promoted cultural convergence. In a sense, then, the fact that there are more links in total

in all the layers results in more cultural convergence. Of course, multiplex links are not the same

as singleplex links, since the latter consists of compressing all the layers into one. Another general

feature of multiplexity that was observed is a form of competition between layers, resulting from

their coupling. In terms of the layer networks, in most cases the end result was something that

resembles an RG, so that they did not give a good description of social networks.

In the first treatment, multiplexity had a positive effect on cultural convergence, while in the

next treatment, more complex relationships were observed due to network assortativity. When network updating was present in the third treatment, cultural convergence was reduced and interesting network dynamics were observed.

In the case of structured initial cultures more diverse behavior was shown with an additional

phase between full cultural divergence and full cultural convergence. This phase turned out to be

unstable when enough mixing was present. In the second phase transition nonequilibrium behavior

in the dynamics was shown to be the result of competition between the layers.

One interesting feature that was not discussed in this thesis, is the effect of having a more

complicated relationship between features and layers. For example, a cultural feature could be

associated to multiple layers. This will be left for future work. In addition, the presence of ordinal

cultural features would be a realistic extension of the current work. Finally, one may consider a

different function f for generating networks from the initial culture and for updating them over

time; perhaps it is possible to choose f such that the resulting networks are more realistic given

an initial culture.

Multiplex, adaptive complex systems present one of the greatest challenges to the complex

systems paradigm. This thesis shows what astonishing behavior can be exhibited by such high-

dimensional objects. Looking at the non-equilibrium behavior observed for structured cultures

with mixing, one may only wonder about the range of possibilities for real cultural systems.


References

[Albert and Barabasi, 2002] Albert, R. and Barabasi, A. (2002). Statistical mechanics of complex

networks. Reviews of Modern Physics, 74:47–97.

[Axelrod, 1997] Axelrod, R. (1997). The dissemination of culture: A model with local convergence

and global polarization. Journal of Conflict Resolution, 41(2):203–226.

[Barrat et al., 2008] Barrat, A., Barthelemy, M., and Vespignani, A. (2008). Dynamical processes

on complex networks. Cambridge University Press, Cambridge.

[Boccaletti et al., 2014] Boccaletti, S., Bianconi, G., Criado, R., Del Genio, C., Gomez-Gardenes,

J., Romance, M., Sendina-Nadal, I., Wang, Z., and Zanin, M. (2014). The structure and dy-

namics of multilayer networks. Physics Reports, 544(1):1–122.

[Buldyrev et al., 2010] Buldyrev, S., Parshani, R., Paul, G., Stanley, H., and Havlin, S. (2010).

Catastrophic cascade of failures in interdependent networks. Nature, 464:1025–1028.

[Castellano et al., 2009] Castellano, C., Fortunato, S., and Loreto, V. (2009). Statistical physics

of social dynamics. Reviews of Modern Physics, 81:591.

[Castellano et al., 2000] Castellano, C., Marsili, M., and Vespignani, A. (2000). Nonequilibrium

phase transition in a model for social influence. Physical Review Letters, 85(16):3536.

[Centola et al., 2007] Centola, D., Gonzalez-Avella, J., Eguiluz, V., and San Miguel, M. (2007).

Homophily, cultural drift, and the co-evolution of cultural groups. Journal of Conflict Resolution,

51(6):905–929.

[Clifford and Sudbury, 1973] Clifford, P. and Sudbury, A. (1973). A model for spatial conflict.

Biometrika, 60(3):581–588.

[Cozzo et al., 2013] Cozzo, E., Banos, R., Meloni, S., and Moreno, Y. (2013). Contact-based

social contagion in multiplex networks. Physical Review E, 88:050801 (R).

[De Sanctis and Galla, 2009] De Sanctis, L. and Galla, T. (2009). Effects of noise and confidence thresholds in nominal and metric Axelrod dynamics of social influence. Physical Review E, 79:046108.

[Deffuant et al., 2000] Deffuant, G., Neau, D., Amblard, F., and Weisbuch, G. (2000). Mixing

beliefs among interacting agents. Advances in Complex Systems, 3(4):87.

[Flache and Macy, 2008] Flache, A. and Macy, M. (2008). Local convergence and global diversity:

The robustness of cultural homophily. arXiv:0808.2710.

[Gandica et al., 2011] Gandica, Y., Charmell, A., Villegas-Febres, J., and Bonalde, I. (2011). Cluster

size entropy in the Axelrod model of social influence: small-world networks and mass media.

Physical Review E, 84:046109.

[Gonzalez-Avella et al., 2005] Gonzalez-Avella, J., Cosenza, M., and Tucci, K. (2005). Nonequilibrium

transition induced by mass media in a model for social influence. Physical Review E,

72(6):065102.

[Guerra et al., 2010] Guerra, B., Poncela, J., Gomez-Gardenes, J., Latora, V., and Moreno, Y.

(2010). Dynamical organization towards consensus in the Axelrod model on complex networks.

Physical Review E, 81:056105.

[Huang and Liu, 2010] Huang, L. and Liu, J. (2010). Characterizing multiplex social dynamics

with autonomy oriented computing. Lecture Notes in Computer Science, 6329:277–287.

[Klemm et al., 2003a] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2003a). Global

culture: A noise induced transition in finite systems. Physical Review E, 67(4):045101(R).

[Klemm et al., 2003b] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2003b). Nonequilibrium

transitions in complex networks: A model of social interaction. Physical Review E,

67(2):026120.

[Klemm et al., 2003c] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2003c). Role of

dimensionality in Axelrod’s model for the dissemination of culture. Physica A, 327(1-2):1.

[Klemm et al., 2005] Klemm, K., Eguiluz, V., Toral, R., and San Miguel, M. (2005). Globalization,

polarization and cultural drift. Journal of Economic Dynamics & Control, 29:321–334.

[Lanchier, 2012] Lanchier, N. (2012). The Axelrod model for the dissemination of culture revisited.

The Annals of Applied Probability, 22(2):860–880.

[McConnell, 2011] McConnell, A. (2011). The multiple self-aspects framework: self-concept

representation and its implications. Personality and Social Psychology Review, 15(1):3–27.

[Meila, 2003] Meila, M. (2003). Comparing clusterings by the variation of information. Learning

Theory and Kernel Machines, 2777:173–187.

[Palchykov et al., 2014] Palchykov, V., Kaski, K., and Kertesz, J. (2014). Transmission of cultural

traits in layered ego-centric networks. Condensed Matter Physics, 17(3):1–10.

[Pfau et al., 2013] Pfau, J., Kirley, M., and Kashima, Y. (2013). The co-evolution of cultures,

social network communities, and agent locations in an extension of Axelrod’s model of cultural

dissemination. Physica A, 392:381–391.

[Quattrociocchi et al., 2014] Quattrociocchi, W., Caldarelli, G., and Scala, A. (2014). Opinion

dynamics on interacting networks: media competition and social influence. Scientific Reports,

4(4938).

[Stivala et al., 2014] Stivala, A., Robins, G., Kashima, Y., and Kirley, M. (2014). Ultrametric

distribution of culture vectors in an extended Axelrod model of cultural dissemination. Scientific

Reports, 4(4870).

[Valori et al., 2011] Valori, L., Picciolo, F., Allansdottir, A., and Garlaschelli, D. (2011). Reconciling

long-term cultural diversity and short-term collective social behavior. Proceedings of the

National Academy of Sciences of the United States of America, 109(4):1068–1073.

[Vazquez and Redner, 2007] Vazquez, F. and Redner, S. (2007). Non-monotonicity and divergent

time scales in Axelrod model dynamics. Europhysics Letters, 78(1):18002.

[Vilone et al., 2002] Vilone, D., Vespignani, A., and Castellano, C. (2002). Ordering phase transition

in the one-dimensional Axelrod model. The European Physical Journal B, 30:399.
