The Structure of Information Pathways in a Social Communication Network
![Page 1: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/1.jpg)
The Structure of Information Pathways in a Social
Communication Network
Presented by: Tingting Xu
Under the guidance of: Augustin Chainterau
![Page 2: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/2.jpg)
Paper Objective
Study the temporal dynamics of communication using on-line data
Use a temporal notion of ‘distance’, built on vector clocks, to formulate a temporal measure that provides structural insights
Define the network backbone to be the sub-graph consisting of edges on which information has the potential to flow the quickest
![Page 3: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/3.jpg)
Why Construct a New Model
Discrete communication distributed non-uniformly over time
Direct and indirect flow of information
Recent research has studied communication of an event-driven nature
The properties of systemic communication arguably determine much about the rate at which people in the network remain up-to-date on information about each other
![Page 4: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/4.jpg)
The Present Work
Systemic communication and information pathways
Propose a framework for analyzing systemic communication based on inferring structural measures from the potential for information to flow between different nodes
Out-of-date information
Indirect paths – triangle-inequality violation
![Page 5: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/5.jpg)
The Present Work
Data used here have complete histories of communication events over long periods of time
Main dataset: the complete set of anonymized e-mail logs among all faculty and staff at a large university over two years
Other datasets: the Enron e-mail corpus, and the complete set of user-talk communications among admins and high-volume editors on Wikipedia
Vector clocks introduced by Lamport and refined by Mattern
Network backbone
![Page 6: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/6.jpg)
Vector Clocks and Latency
Communication skeleton G
The latest view that v has of u at time t is denoted by $\phi_{v,t}(u)$
Define, for all v and t, $\phi_{v,t} = (\phi_{v,t}(u) : u \in V)$, referred to as the vector clock of v at time t
The information latency of u with respect to v at time t is $t - \phi_{v,t}(u)$
An algorithm computes the vector clocks for all nodes at all times in [0, T]
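The vector-clock update can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes events are `(sender, recipient, time)` tuples, that messages are delivered instantly, and that ties in time are processed in input order. The names `vector_clocks` and `latency` are hypothetical.

```python
def vector_clocks(events, nodes):
    """Return each node's vector clock after all events are processed.

    events: iterable of (sender, recipient, time) tuples (assumed format).
    clocks[v][u] is the latest time of u's state that could have reached v;
    a missing entry means v has no information about u yet.
    """
    clocks = {v: {} for v in nodes}
    for sender, recipient, t in sorted(events, key=lambda e: e[2]):
        # The sender passes on everything it knows, plus its own current state.
        view = dict(clocks[sender])
        view[sender] = t
        for u, seen in view.items():
            # The recipient keeps whichever view of u is fresher.
            if u != recipient and seen > clocks[recipient].get(u, float("-inf")):
                clocks[recipient][u] = seen
    return clocks


def latency(clocks, v, u, t):
    """Information latency of u with respect to v at time t (None if unknown)."""
    if u == v:
        return 0
    seen = clocks[v].get(u)
    return None if seen is None else t - seen
```

For example, if a writes to b at time 1 and b writes to c at time 2, then c holds a view of a dated 1 even though a never contacted c directly.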
![Page 7: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/7.jpg)
Latencies in Social Network Data
Consider only messages with at most c recipients (c ranging from 1 to 5)
Focus on the q-fraction of most active e-mail users (here q = 0.20)
For a time difference $\tau$, we define the ball of radius $\tau$ around node v at time t, denoted $B_\tau(v, t)$, to be the set of all nodes whose latency with respect to v at time t is at most $\tau$ days.
For fixed t, the distribution of ball sizes over nodes can be studied using a function $f_t(\tau)$, defined as the median value of $|B_\tau(v, t)|$ over all v
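The ball and its median size can be sketched as follows. The sketch assumes the vector clocks are already available as a nested dictionary `clocks[v][u]` (a hypothetical layout) and that a node is not counted in its own ball; the slides do not say which convention is used.

```python
from statistics import median


def ball(clocks, v, t, tau):
    """Nodes whose latency with respect to v at time t is at most tau."""
    return {u for u, seen in clocks[v].items() if u != v and t - seen <= tau}


def median_ball_size(clocks, t, tau):
    """Median of the ball sizes over all nodes, for fixed t and tau."""
    return median(len(ball(clocks, v, t, tau)) for v in clocks)
```

Sweeping `tau` for a fixed `t` gives one curve of the kind described above.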
![Page 8: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/8.jpg)
Open Worlds vs. Closed Worlds
Boundary specification problem – choosing the value of the q-fraction in [0, 1]
![Page 9: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/9.jpg)
Quantifying the Strength of Weak Ties
The range of an edge $(v, w)$ is defined to be the unweighted shortest-path distance in the social network between $v$ and $w$ if $(v, w)$ were deleted
Edges of range greater than two are generally weak ties
Vector-clock analysis can provide evidence for the phenomenon that weak ties are the sources of important information to their endpoints
When $v$ receives a message from $w$, define the advance in $v$’s clock to be the sum of coordinatewise differences between $\phi_{v,t}$ before the update from $w$ and after the update from $w$
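The range of an edge can be computed with a breadth-first search that ignores the edge itself. This is a sketch under assumptions: the network is treated as undirected and stored as a dictionary of neighbor sets, and `edge_range` is a hypothetical name.

```python
from collections import deque


def edge_range(adj, v, w):
    """Unweighted shortest-path distance from v to w with the edge v-w removed.

    adj: {node: set of neighbors}, assumed undirected.
    Returns float('inf') if w becomes unreachable.
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, ()):
            if {x, y} == {v, w}:
                continue  # skip the deleted edge
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == w:
                    return dist[y]
                queue.append(y)
    return float("inf")
```

An edge inside a triangle has range 2, while an edge whose removal disconnects its endpoints has infinite range.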
![Page 10: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/10.jpg)
Backbone Structures
Instantaneous Backbones
Define the backbone $H_t$ at time t to be the graph on $V$ whose edge set is the collection of edges from G that are essential at time t.
An edge $(v, w)$ is essential at time t if $w$’s most up-to-date view of $v$ is the result of direct communication from $v$
We refer to the backbones $H_t$ at fixed times t as instantaneous backbones, in contrast with the aggregate backbone, which is based on an aggregate construction that takes all times into account.
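One way to find the essential edges at a given time is to run the vector-clock update while remembering whether each view arrived directly from the node it describes. The sketch below assumes events are `(sender, recipient, time)` tuples with instant delivery; `instantaneous_backbone` is a hypothetical name, not taken from the paper.

```python
def instantaneous_backbone(events, t):
    """Edges (v, w) where w's freshest view of v before time t came directly from v."""
    clock = {}   # clock[w][v]: latest time of v's state visible to w
    direct = {}  # direct[w][v]: True if that view arrived in a message sent by v
    for s, r, when in sorted(events, key=lambda e: e[2]):
        if when >= t:
            break
        mine = clock.setdefault(r, {})
        view = dict(clock.get(s, {}))
        view[s] = when  # the sender is current about itself
        for u, seen in view.items():
            if u != r and seen > mine.get(u, float("-inf")):
                mine[u] = seen
                direct.setdefault(r, {})[u] = (u == s)
    return {(v, w) for w, views in direct.items()
            for v, is_direct in views.items() if is_direct}
```

With events a→c at time 1, a→b at time 2 and b→c at time 3, the edge (a, c) is essential at time 2.5 but not at time 10, because by then c has a fresher view of a relayed through b.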
![Page 11: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/11.jpg)
Backbone Structures
Aggregate Backbone
For each edge $(v, w)$ in the communication skeleton G such that $v$ has sent $\rho_{v,w} > 0$ messages to $w$ over the full time interval [0, T], define the delay $\delta_{v,w}$ of the edge to be $T / \rho_{v,w}$
Let $G^\delta$ be the weighted graph obtained from the communication skeleton G by assigning a weight of $\delta_{v,w}$ to each edge $(v, w)$
An edge in $G^\delta$ is essential if it forms the minimum-delay path between its two endpoints
Define the aggregate backbone H* to be the sub-graph of $G^\delta$ consisting only of essential edges
![Page 12: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/12.jpg)
Backbone Structures
How to construct the aggregate Backbone H*
Compute a weighted shortest-paths tree rooted at each node of $G^\delta$, using the delays as weights
The union of the edges in all these trees will be H*, by the following proposition
PROPOSITION An edge belongs to H* if and only if it lies on the minimum-delay path between some pair of nodes $x$ and $y$
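PROOF (sketch, assuming minimum-delay paths are unique) An essential edge is itself the minimum-delay path between its endpoints, so it lies on one. Conversely, if an edge $(v, w)$ lies on the minimum-delay path between $x$ and $y$, it must be the minimum-delay path from $v$ to $w$, since any faster route from $v$ to $w$ would shorten the path from $x$ to $y$.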
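The construction above can be sketched with Dijkstra's algorithm run from every node. This is an illustrative sketch, not the paper's code: message counts are assumed to be given as a dictionary keyed by directed edge, ties between equally fast paths are broken by whichever path is found first, and `aggregate_backbone` is a hypothetical name.

```python
import heapq


def aggregate_backbone(counts, T):
    """Union of shortest-path-tree edges over all roots, with delay T / count.

    counts: {(v, w): number of messages v sent to w during [0, T]}.
    """
    out = {}
    for (v, w), n in counts.items():
        if n > 0:
            out.setdefault(v, []).append((w, T / n))
    backbone = set()
    for root in out:
        dist = {root: 0.0}
        pred = {}
        heap = [(0.0, root)]
        while heap:
            d, x = heapq.heappop(heap)
            if d > dist[x]:
                continue  # stale heap entry
            for y, delay in out.get(x, ()):
                nd = d + delay
                if nd < dist.get(y, float("inf")):
                    dist[y] = nd
                    pred[y] = x
                    heapq.heappush(heap, (nd, y))
        # Every tree edge lies on a minimum-delay path from the root.
        backbone.update((p, y) for y, p in pred.items())
    return backbone
```

In a small example where a sends b and b sends c ten messages each, but a sends c only two, the direct edge (a, c) has delay T/2 and is bypassed by the two-hop path with total delay T/5.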
![Page 14: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/14.jpg)
Backbone Structures
Density and node degrees of the backbone
The backbone Ht and the aggregate backbone H* are surprisingly sparse relative to a fairly dense communication skeleton G
In other words, from the point of view of potential information flow, a significant majority of all edges in the social network are bypassed by faster indirect paths
![Page 15: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/15.jpg)
Backbone Structures
Density and node degrees of the backbone
Considering the backbone also sheds further light on the role of high-degree nodes in the social network
High-degree nodes in the full communication skeleton G indeed have many incident edges in the aggregate backbone
However, the fraction of a node’s edges that are declared essential strictly decreases with degree.
![Page 16: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/16.jpg)
Backbone Structures
Structure of the backbone
The backbone is trying to balance two competing objectives
Representing long range edges (recall definition of ‘range’)
Representing edges that have high embeddedness and transmit information at short ranges over quick time scales
Define the embeddedness of an edge to be the fraction of its endpoints’ neighbors that are common to both
For an edge $e = (v, w)$, let $N_v$ and $N_w$ denote the sets of neighbors of the endpoints $v$ and $w$ respectively. Define the embeddedness of $e$ to be $|N_v \cap N_w| / |N_v \cup N_w|$
The backbone balances between two qualitatively different kinds of information flow
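Embeddedness as defined above is a ratio of two set sizes. The sketch below assumes an undirected network stored as a dictionary of neighbor sets and applies the formula literally, so each endpoint counts as a neighbor of the other; whether the original analysis excludes the endpoints is not stated in the slides.

```python
def embeddedness(adj, v, w):
    """Shared neighbors of v and w as a fraction of all their neighbors."""
    n_v, n_w = adj[v], adj[w]
    union = n_v | n_w
    return len(n_v & n_w) / len(union) if union else 0.0
```

An edge inside a tightly knit group scores high, while a bridge between otherwise unconnected groups scores 0.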
![Page 17: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/17.jpg)
Varying Speed of Communication
Study what happens to information latencies (i.e., $t - \phi_{v,t}(u)$) when each node varies the relative rates of its communication
Given a directed graph G, with a total rate $\rho_v$ for each node $v$
Given a target set $S$ of nodes in G
Each node $v$ chooses a rate $\rho_{v,w}$ at which to communicate to each of its neighbors $w$, subject to the constraint that $\sum_w \rho_{v,w} = \rho_v$
Define delays $\delta_{v,w} = T / \rho_{v,w}$, where $T$ is the length of the time interval
Question: for a given bound $D$, can we choose rates for each node so that the median shortest-path delay between pairs in $S$ in the aggregate backbone is at most $D$?
![Page 18: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/18.jpg)
Varying Speed of Communication
THEOREM The delay minimization problem defined above is NP-complete
A sketch of the proof of this theorem is given in the paper
Consider simple local rules by which individuals in a network might vary rates of communication so as to influence the potential for information flow
![Page 19: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/19.jpg)
Load-leveling vs. Load-concentrating
To accelerate potential information flow, should one
talk even more actively to one’s most frequent contacts? Load-concentrating, with $\alpha > 1$
or balance things out by increasing communication with the less frequent contacts? Load-leveling, with $\alpha < 1$
Rescaling exponent $\alpha$: each node $v$ changes its communication rate $\rho_{v,w}$ to $\rho_{v,w}^{\alpha}$ and then normalizes all its rates to keep its total outgoing message volume the same
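The rescaling rule can be sketched directly. The sketch assumes strictly positive rates stored per sender as `rates[v][w]` (a hypothetical layout); `rescale_rates` is not a name from the paper.

```python
def rescale_rates(rates, alpha):
    """Raise each outgoing rate to the power alpha, then renormalize per node.

    rates: {v: {w: rate}} with strictly positive rates (assumed).
    Each node's total outgoing volume is left unchanged.
    """
    rescaled = {}
    for v, out in rates.items():
        total = sum(out.values())
        powered = {w: r ** alpha for w, r in out.items()}
        scale = total / sum(powered.values())
        rescaled[v] = {w: p * scale for w, p in powered.items()}
    return rescaled
```

With outgoing rates 1 and 3, an exponent of 2 shifts them to 0.4 and 3.6 (load-concentrating), while an exponent of 0 evens them out to 2 and 2 (the extreme of load-leveling).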
![Page 20: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/20.jpg)
Load-leveling vs. Load-concentrating
Extend the notion of delay to node-dependent delays, adding a fixed delay of $\varepsilon$ at each node
Total delay on a path becomes the sum of edge and node delays
As $\varepsilon$ increases, there is a larger penalty for paths with more hops
The value $\alpha^*$ of $\alpha$ at which network latency is optimized decreases with $\varepsilon$, with $\alpha^* = 1$ at $\varepsilon =$ … days
The backbone becomes denser and the importance of quick indirect paths diminishes
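The effect of a fixed per-node delay on shortest paths can be sketched as below. One assumption is made explicit because the slides leave it open: the node delay is charged once at each intermediate node of a path, so a path with k edges pays it k − 1 times. The name `delays_with_node_cost` is hypothetical.

```python
import heapq


def delays_with_node_cost(delay, source, eps):
    """Minimum total delay from source, charging eps at each intermediate node.

    delay: {(v, w): edge delay}. A path with k edges costs the sum of its
    edge delays plus (k - 1) * eps (assumed charging rule).
    """
    out = {}
    for (v, w), d in delay.items():
        # Add eps to every edge, then subtract one eps per path at the end.
        out.setdefault(v, []).append((w, d + eps))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, cost in out.get(x, ()):
            nd = d + cost
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return {x: (0.0 if x == source else d - eps) for x, d in dist.items()}
```

With edge delays of 10 on a→b and b→c and 25 on a→c, the two-hop path wins when the node delay is 0 (total 20) but loses when it is 10 (total 30 against 25), which is how a direct edge can re-enter the backbone.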
![Page 21: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/21.jpg)
Conclusions (I)
Make integral use of information about how nodes communicate over time
Develop structural measures based on the potential for information to flow
The sparse sub-graph of edges most essential to keeping people up-to-date – the backbone of the network – provides important structural insights that relate to embeddedness, the role of high-degree nodes (i.e., hubs), and the strength of weak ties
Studied the effects on information flow as nodes vary the rate at which they communicate with others in the network using different strategies
![Page 22: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/22.jpg)
Conclusions (II)
Discussion of the other two datasets
The sparsity of the aggregate and instantaneous backbones and the variation in node degrees are similar
Difference: the ‘core’ of active communicators is much smaller in both the Enron corpus and Wikipedia, which makes the range of an edge in the unweighted communication skeleton harder to interpret and to correlate with other measures
Further investigation: the principles that govern the dynamics of different types of information, and how these principles interact with the directed, weighted nature of social communication networks
![Page 23: The Structure of Information Pathways in a Social Communication Network](https://reader035.fdocuments.us/reader035/viewer/2022070401/568135db550346895d9d4eab/html5/thumbnails/23.jpg)
Thank You