
Social and Economic Networks: Lecture 2,

Representing Networks

Alper Duman, Izmir University of Economics

March 5, 2013


Measuring Networks

I A graph $(N, g)$ consists of a set of nodes $N = \{1, 2, \dots, n\}$ and a real-valued $n \times n$ matrix $g$, where $g_{ij}$ represents the relation between $i$ and $j$.

I The matrix $g$ is called the adjacency matrix.

I If all entries $g_{ij}$ are either 0 or 1, the graph is called unweighted; otherwise it is a weighted graph/network.

I If $g_{ij} = g_{ji}$ for all $i$ and $j$, the graph is undirected; otherwise it is a directed graph/network.


For $N = \{1, 2, 3\}$ the adjacency matrix is

$$g = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$

Draw the network given the above adjacency matrix $g$.

The same network can be represented by an edge list: $g = \{\{1, 2\}, \{2, 3\}\}$.

How many nodes and edges does $g$ have?
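The two representations of the three-node network can be converted into each other mechanically. The following is a minimal Python sketch, assuming an unweighted, undirected network stored as a list of lists; the function names `adjacency_to_edgelist` and `edgelist_to_adjacency` are illustrative and not part of the lecture.

```python
def adjacency_to_edgelist(g):
    """Return the links of a symmetric 0/1 adjacency matrix.

    Nodes are labelled 1..n as on the slide, so row index r is node r + 1.
    Each undirected link is reported once, with the smaller label first.
    """
    n = len(g)
    return [(i + 1, j + 1)
            for i in range(n)
            for j in range(i + 1, n)
            if g[i][j] == 1]


def edgelist_to_adjacency(edges, n):
    """Build the n x n adjacency matrix of an undirected edge list."""
    g = [[0] * n for _ in range(n)]
    for i, j in edges:
        g[i - 1][j - 1] = 1
        g[j - 1][i - 1] = 1  # undirected: keep the matrix symmetric
    return g


g = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
edges = adjacency_to_edgelist(g)
print(edges)
print(edgelist_to_adjacency(edges, 3) == g)
```

Converting in both directions and comparing with the original matrix is a quick way to check a hand-drawn network.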


Paths and Cycles

I In order to capture indirect interactions in a network, it is essential to model paths through the network.

I A path in a network $g \in G(N)$ between nodes $i$ and $j$ is a sequence of links $i_1 i_2, i_2 i_3, \dots, i_{K-1} i_K$ such that $i_k i_{k+1} \in g$ for each $k \in \{1, 2, \dots, K-1\}$, with $i_1 = i$ and $i_K = j$, and such that each node in the sequence $i_1, \dots, i_K$ is distinct.

I A walk in a network $g \in G(N)$ between nodes $i$ and $j$ is a sequence of links $i_1 i_2, i_2 i_3, \dots, i_{K-1} i_K$ such that $i_k i_{k+1} \in g$ for each $k \in \{1, 2, \dots, K-1\}$, with $i_1 = i$ and $i_K = j$. Unlike a path, a walk may visit a node more than once.


I A cycle is a walk that starts and ends at the same node, so that the only node that appears twice is the starting/ending node.

I A cycle can be constructed from any path by adding a link from the end to the starting node.

I A geodesic between nodes $i$ and $j$ is a shortest path between these nodes.


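Let $g_{ii} = 0$. Then the $(i, j)$ entry of the matrix power $g^L$ gives the number of walks of length $L$ between nodes $i$ and $j$.

For example, find how many walks of length 2 there are between nodes 1 and 3, given the adjacency matrix $g$ below.

$$g = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix}$$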
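Counting walks by matrix powers can be checked with a few lines of code. This is a sketch in plain Python, assuming the matrix is stored as a list of lists; `mat_mul` and `mat_power` are helper names introduced here for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(g, L):
    """Return g raised to the power L by repeated multiplication (L >= 0)."""
    n = len(g)
    # Start from the identity matrix, which is g to the power 0.
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(L):
        result = mat_mul(result, g)
    return result


g = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, 1, 0]]

g2 = mat_power(g, 2)
# Rows and columns are 0-based, so nodes 1 and 3 are indices 0 and 2.
walks_1_3 = g2[0][2]
walks_1_4 = g2[0][3]
print(walks_1_3, walks_1_4)
```

Running it answers the exercise and also shows how the count changes for another pair of nodes.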


Components and Connected Subgraphs

I A network $(N, g)$ is connected if every two nodes in the network are connected by some path in the network.

I A component of a network $(N, g)$ is a non-empty subnetwork $(N', g')$ such that $\emptyset \neq N' \subseteq N$, $g' \subseteq g$, $(N', g')$ is connected, and if $i \in N'$ and $ij \in g$, then $j \in N'$ and $ij \in g'$.

I Thus the components of a network are the distinct maximal connected subgraphs of the network.

I The set of components of a network $(N, g)$ is denoted $C(N, g)$.


I A network is connected if and only if it consists of a single component, that is, $\Pi(N, g) = \{N\}$, where $\Pi(N, g)$ denotes the partition of $N$ induced by the components.

I Components of a network partition the nodes into groups within which nodes are path-connected.

I A link $ij$ is a bridge in the network $g$ if $g - ij$ has more components than $g$.

I For a directed network, directions are important. If every node of a subgraph can reach every other node of that subgraph through directed links, and the subgraph is maximal with this property, then it is a strongly connected component.
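Components and bridges of an undirected network can be found with a breadth-first search. The sketch below assumes nodes are given as a list and links as pairs; the five-node network and the names `components` and `is_bridge` are made up for illustration.

```python
from collections import deque


def components(nodes, links):
    """Return the components of an undirected network as a list of node sets."""
    neighbours = {i: set() for i in nodes}
    for i, j in links:
        neighbours[i].add(j)
        neighbours[j].add(i)

    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        comp, queue = {start}, deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbours[i]:
                if j not in comp:
                    comp.add(j)
                    queue.append(j)
        seen |= comp
        result.append(comp)
    return result


def is_bridge(nodes, links, link):
    """A link is a bridge if deleting it increases the number of components."""
    remaining = [l for l in links if set(l) != set(link)]
    return len(components(nodes, remaining)) > len(components(nodes, links))


# Hypothetical example: a triangle 1-2-3 with a tail 3-4, and isolated node 5.
nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (2, 3), (1, 3), (3, 4)]

print(components(nodes, links))
print(is_bridge(nodes, links, (3, 4)), is_bridge(nodes, links, (1, 2)))
```

The network is connected exactly when `components` returns a single set containing all nodes.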


Trees, Stars, Circles and Complete Networks

I A tree is a connected network that has no cycles. (Family trees are an example.)

I A forest is a network such that each component is a tree.

I A star is a network in which there exists some node $i$ such that every link in the network involves node $i$.

I There is only one center node i in a star.


I A complete network is one in which all possible links are present, so that $g_{ij} = 1$ for all $i \neq j$.

I A circle is a network that has a single cycle and is such that each node in the network has exactly two neighbours.

I The neighbourhood of a node $i$ is the set of nodes that $i$ is linked to: $N_i(g) = \{j : g_{ij} = 1\}$.

I The degree of a node is the number of links that involve that node, which is the cardinality of the node’s neighbourhood: $d_i(g) = |N_i(g)|$.


I For directed networks there are separate in-degree and out-degree measures; if $g_{ij} = 1$ denotes a link from $j$ to $i$, the measure above is the in-degree.

I The density of a network measures the fraction of all possible links that are present; that is, the average degree divided by $n - 1$.

I What is the density of a complete network?

I What is the density of a star network with 5 nodes?
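Neighbourhood, degree and density follow directly from the adjacency matrix. The sketch below applies the definitions to the three-node network from the first slides (not to the two exercise networks above); the function names are illustrative, and nodes are indexed from 0 in the code.

```python
from fractions import Fraction


def neighbourhood(g, i):
    """Set of nodes j with g[i][j] == 1 (0-based indices)."""
    return {j for j in range(len(g)) if g[i][j] == 1}


def degree(g, i):
    """Degree of node i: the size of its neighbourhood."""
    return len(neighbourhood(g, i))


def density(g):
    """Average degree divided by n - 1, as an exact fraction (needs n >= 2)."""
    n = len(g)
    average_degree = Fraction(sum(degree(g, i) for i in range(n)), n)
    return average_degree / (n - 1)


g = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

degrees = [degree(g, i) for i in range(len(g))]
print(degrees, density(g))
```

The same functions can be pointed at a complete network or a five-node star to check the answers to the two questions.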


Summary Statistics of Networks

I A degree distribution of a network is a description of the relative frequencies of nodes that have different degrees.

I $P(d)$ is the fraction of nodes that have degree $d$ under a degree distribution $P$.

I What will be the degree distribution of a complete network with $k$ links per node?


I A scale-free distribution satisfies $P(d) = c d^{-\gamma}$.

I Why is it called scale-free?

I The best way to capture a scale-free distribution is to take logs.

I $\log P(d) = \log c - \gamma \log d$
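One way to approach the naming question is a short derivation under the power-law form above. The scaling factor $a > 0$ is a symbol introduced here for illustration only.

```latex
\[
\frac{P(a d)}{P(d)} \;=\; \frac{c\,(a d)^{-\gamma}}{c\, d^{-\gamma}} \;=\; a^{-\gamma}
\]
```

The ratio depends on the factor $a$ but not on the degree $d$ itself, so, for instance, degree 2 relative to degree 1 is as frequent as degree 20 relative to degree 10. In logs this is the constant slope $-\gamma$ of the straight line in the last bullet.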


I The diameter of a network is the largest distance between any two nodes in the network.

I What is the diameter of a circle graph with n nodes?

I What is the diameter of a binary tree of n nodes?

I $2 \log_2(n + 1) - 2$ ... how does the answer relate to the guessing game?

I Average path length is the average length of the geodesics (shortest paths), taken over all pairs of nodes.
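Diameter and average path length can both be read off the list of geodesic lengths. The sketch below uses breadth-first search on the three-node network from the first slides; it assumes the network is connected, and the function names are illustrative.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def distances_from(neighbours, source):
    """Geodesic distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for j in neighbours[i]:
            if j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    return dist


def geodesic_lengths(neighbours):
    """Geodesic length for every unordered pair of nodes.

    Assumes the network is connected, so every pair has a finite distance.
    """
    return [distances_from(neighbours, i)[j]
            for i, j in combinations(sorted(neighbours), 2)]


# The line network 1 - 2 - 3, stored as a neighbourhood dictionary.
neighbours = {1: {2}, 2: {1, 3}, 3: {2}}

lengths = geodesic_lengths(neighbours)
diameter = max(lengths)
average_path_length = Fraction(sum(lengths), len(lengths))
print(lengths, diameter, average_path_length)
```

Replacing `neighbours` with a circle or a binary tree gives a numerical check on the two diameter questions.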


I Clustering captures the idea that two friends of mine are more likely to be friends with each other.

I A clique is a maximal completely connected subgraph of a given network.

I A node can belong to more than one clique; give an example.

I Clique structure is very sensitive to slight changes in a network.


I The overall clustering coefficient $Cl(g)$ measures the ratio of completed triads to all possible triads in a network.

I The individual clustering coefficient $Cl_i(g)$ does this for each node $i$.

I The average clustering coefficient $Cl^{Avg}(g)$ then takes the average of all individual clustering coefficients: $Cl^{Avg}(g) = \sum_i Cl_i(g) / n$.

I The overall clustering coefficient and the average clustering coefficient can differ considerably.

I For directed networks, the percentage of transitive triples, $Cl^{TT}(g)$, measures the ratio of actual to potential transitive triples: among the cases where $i$ links to $j$ and $j$ links to $k$, the fraction in which $i$ also links to $k$.
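The difference between overall and average clustering is easiest to see on a small example. In this sketch a triad centred at node i is a pair of neighbours of i, and it is completed if the two neighbours are linked. Two assumptions are made here and are not stated on the slides: nodes with fewer than two neighbours get an individual coefficient of 0, and the four-node network is invented for illustration.

```python
from fractions import Fraction
from itertools import combinations


def triads(neighbours, i):
    """Return (completed, possible) triads centred at node i."""
    pairs = list(combinations(sorted(neighbours[i]), 2))
    completed = sum(1 for j, k in pairs if k in neighbours[j])
    return completed, len(pairs)


def individual_clustering(neighbours, i):
    completed, possible = triads(neighbours, i)
    # Assumed convention: a node with fewer than two neighbours gets 0.
    return Fraction(completed, possible) if possible else Fraction(0)


def average_clustering(neighbours):
    n = len(neighbours)
    return sum(individual_clustering(neighbours, i) for i in neighbours) / n


def overall_clustering(neighbours):
    completed = sum(triads(neighbours, i)[0] for i in neighbours)
    possible = sum(triads(neighbours, i)[1] for i in neighbours)
    return Fraction(completed, possible) if possible else Fraction(0)


# Hypothetical network: triangle 1-2-3 with an extra link 3-4.
neighbours = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

overall = overall_clustering(neighbours)
average = average_clustering(neighbours)
print(overall, average)
```

The overall coefficient pools all triads before dividing, while the average gives each node equal weight, which is why the two numbers need not agree.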


Centrality

Measures of centrality can be grouped into four categories:

1. Degree - how connected a node is

2. Closeness - how easily a node can reach other nodes

3. Betweenness - how important a node is in terms of connecting other nodes

4. Neighbours’ characteristics - how important, central or influential a node’s neighbours are


I The degree centrality of a node is $d_i(g)/(n - 1)$.

I The basic closeness centrality is the inverse of the average distance between $i$ and any other node $j$, that is, $(n - 1) / \sum_{j \neq i} L(i, j)$, where $L(i, j)$ is the length of a geodesic between $i$ and $j$.

I The decay centrality counts farther nodes with a decreasing weight: $\sum_{j \neq i} \delta^{L(i, j)}$, with $0 < \delta < 1$.

I What is the link between the decay centrality and the strategic connection model?
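The three measures on this slide can be computed from geodesic distances. The sketch below is a minimal version for a connected, undirected network, applied to the three-node network from the first slides; the function names and the choice of 1/2 for the decay parameter are assumptions made for the example.

```python
from collections import deque
from fractions import Fraction


def distances_from(neighbours, source):
    """Geodesic distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for j in neighbours[i]:
            if j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    return dist


def degree_centrality(neighbours, i):
    return Fraction(len(neighbours[i]), len(neighbours) - 1)


def closeness_centrality(neighbours, i):
    """(n - 1) divided by the sum of distances; assumes a connected network."""
    dist = distances_from(neighbours, i)
    return Fraction(len(neighbours) - 1,
                    sum(d for j, d in dist.items() if j != i))


def decay_centrality(neighbours, i, delta):
    """Sum of delta ** distance over all other reachable nodes."""
    dist = distances_from(neighbours, i)
    return sum(delta ** d for j, d in dist.items() if j != i)


# The line network 1 - 2 - 3.
neighbours = {1: {2}, 2: {1, 3}, 3: {2}}
delta = Fraction(1, 2)  # illustrative decay parameter

for i in sorted(neighbours):
    print(i,
          degree_centrality(neighbours, i),
          closeness_centrality(neighbours, i),
          decay_centrality(neighbours, i, delta))
```

Exact fractions are used so that the output can be compared directly with a hand calculation.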


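I The betweenness centrality of a node $i$ is $Ce^B_i(g) = \sum_{k \neq j : i \notin \{k, j\}} \frac{P_i(kj)/P(kj)}{(n - 1)(n - 2)/2}$, where $P(kj)$ is the number of geodesics between $k$ and $j$ and $P_i(kj)$ is the number of those geodesics that $i$ lies on.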

I Freeman suggested this measure.

I Can you compute your closeness centrality in your Facebook network?


I Given the edge list representation of a simple undirected network $G = \{12, 13, 23, 34, 45, 56, 57, 67\}$, write down the adjacency matrix of the network.

I Draw the network

I Calculate degree centralities of nodes 4 and 3 and 2.

I Calculate closeness centralities of nodes 4 and 3 and 2.

I Calculate betweenness centralities of 4 and 3 and 2.
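Betweenness is the most laborious of the three to compute by hand, so a script for checking the result may help. The sketch below counts geodesics with breadth-first search and applies the betweenness formula to the edge list of this exercise. It assumes a connected network with at least three nodes, sums over unordered pairs of other nodes, and uses helper names introduced here for illustration.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def build_neighbours(links):
    """Turn an undirected edge list into a neighbourhood dictionary."""
    neighbours = {}
    for i, j in links:
        neighbours.setdefault(i, set()).add(j)
        neighbours.setdefault(j, set()).add(i)
    return neighbours


def bfs_counts(neighbours, source):
    """Return (dist, count): geodesic distance from source to each node and
    the number of distinct geodesics from source to each node."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for j in neighbours[i]:
            if j not in dist:
                dist[j] = dist[i] + 1
                count[j] = 0
                queue.append(j)
            if dist[j] == dist[i] + 1:
                # Every geodesic to i extends to a geodesic to j.
                count[j] += count[i]
    return dist, count


def betweenness(neighbours, i):
    """Share of geodesics through i, summed over pairs and normalised."""
    nodes = sorted(neighbours)
    n = len(nodes)
    dist_i, count_i = bfs_counts(neighbours, i)
    total = Fraction(0)
    for k, j in combinations([v for v in nodes if v != i], 2):
        dist_k, count_k = bfs_counts(neighbours, k)
        # i lies on a geodesic between k and j if the distances add up.
        if dist_k[i] + dist_i[j] == dist_k[j]:
            total += Fraction(count_k[i] * count_i[j], count_k[j])
    return total / Fraction((n - 1) * (n - 2), 2)


links = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]
neighbours = build_neighbours(links)

for node in (4, 3, 2):
    print(node, betweenness(neighbours, node))
```

The number of geodesics between k and j that pass through i is obtained as the product of the number of geodesics from k to i and from i to j, which is valid whenever i sits on a shortest route between them.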


Prestige, Power and Eigenvector Centrality Measures

I Katz prestige centrality

I Eigenvector centrality (or Bonacich centrality)

I Both Katz prestige and eigenvector centrality are measures that depend on iterative calculations on matrices.

I Learn about invertible matrices!
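The slides do not spell out the formulas, so the following is only a sketch of one common form of eigenvector centrality: each node's centrality is proportional to the sum of its neighbours' centralities, and the vector is found by repeatedly multiplying by the adjacency matrix (power iteration). The normalisation to sum 1, the stopping rule and the four-node example are assumptions made here. The iteration is only guaranteed to settle for connected networks that are not bipartite.

```python
def eigenvector_centrality(g, iterations=200, tol=1e-12):
    """Power iteration on a symmetric 0/1 adjacency matrix.

    Assumes the network has at least one link. Returns a vector c,
    normalised to sum to 1, with g c approximately proportional to c.
    """
    n = len(g)
    c = [1.0 / n] * n
    for _ in range(iterations):
        # Each node collects the current centrality of its neighbours.
        new = [sum(g[i][j] * c[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        new = [x / total for x in new]
        if max(abs(a - b) for a, b in zip(new, c)) < tol:
            return new
        c = new
    return c


# Hypothetical network: triangle 1-2-3 with an extra link 3-4
# (rows and columns are 0-based, so node 3 is index 2).
g = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]

centrality = eigenvector_centrality(g)
print([round(x, 4) for x in centrality])
```

In this example node 3 has the most neighbours and the best-connected neighbours, and node 4 gets some credit for being linked to it despite having degree 1.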


Google is based on a network centrality idea

I The name "PageRank" is a trademark of Google, and the PageRank process has been patented (U.S. Patent 6,285,999).

I However, the patent is assigned to Stanford University and not to Google. Google has exclusive license rights on the patent from Stanford University.

I The university received 1.8 million shares of Google in exchange for use of the patent; the shares were sold in 2005 for $336 million.

I Brin, S. and Page, L. (1998) The Anatomy of a Large-Scale Hypertextual Web Search Engine. In: Seventh International World-Wide Web Conference (WWW 1998), April 14-18, 1998, Brisbane, Australia.

I PageRank centrality is a modified version of eigenvector centrality!