Basic Network Concepts - Bilgisayar Mühendisliği...
Basic Network Concepts
Basic Vocabulary
Nodes (also called vertices)
Edges (also called links)
Graph (also called network)
[Figure: example network with nodes Alice, Bob, and Chuck]
Edges
[Figure: Alice, Bob, and Chuck joined by edges]
Edge Weights
[Figure: Alice, Bob, and Chuck joined by weighted edges]
Apollo 13 Movie Network
Main Actors in Apollo 13 the Movie:
Tom Hanks
Kevin Bacon
Gary Sinise
Bill Paxton
Ed Harris
Actors are nodes. Edges connect actors who were in a movie together.
Since all were in Apollo 13, this is not interesting. Let’s make a new network that connects them if they were in an additional movie together.
[Figure: network of Tom Hanks, Kevin Bacon, Ed Harris, Bill Paxton, and Gary Sinise, with edges labeled by the additional movies they share: Magnificent Desolation: Walking on the Moon (on two edges), The Green Mile, Beyond All Boundaries, The Human Stain]
Directed or Undirected?
[Figure: the Apollo 13 co-star network of Tom Hanks, Kevin Bacon, Ed Harris, Bill Paxton, and Gary Sinise]
Adjacency List
TH,BP
TH,GS
BP,GS
GS,KB
GS,EH
(TH = Tom Hanks, BP = Bill Paxton, GS = Gary Sinise, KB = Kevin Bacon, EH = Ed Harris)
[Figure: the Apollo 13 co-star network]
Adjacency Matrix
     TH  BP  GS  EH  KB
TH    0   1   2   0   0
BP    1   0   1   0   0
GS    2   1   0   1   1
EH    0   0   1   0   0
KB    0   0   1   0   0
[Figure: the Apollo 13 co-star network]
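The two representations above can be built from one weighted edge list. The following is a minimal sketch, not part of the original slides: the function names are illustrative, the weights are the ones shown in the matrix, and missing entries are stored as 0.

```python
# Weighted edge list for the Apollo 13 co-star network.
# Weights are the values shown in the adjacency matrix on the slide.
EDGES = [
    ("TH", "BP", 1),
    ("TH", "GS", 2),
    ("BP", "GS", 1),
    ("GS", "KB", 1),
    ("GS", "EH", 1),
]
NODES = ["TH", "BP", "GS", "EH", "KB"]


def adjacency_list(nodes, edges):
    """Map each node to a dict of {neighbor: weight} (undirected)."""
    adj = {n: {} for n in nodes}
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj


def adjacency_matrix(nodes, edges):
    """Return a symmetric matrix (list of rows) in the order of `nodes`."""
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v, w in edges:
        matrix[index[u]][index[v]] = w
        matrix[index[v]][index[u]] = w
    return matrix


adj = adjacency_list(NODES, EDGES)
matrix = adjacency_matrix(NODES, EDGES)
for name, row in zip(NODES, matrix):
    print(name, row)
```

The list form stores only the edges that exist, which suits sparse networks; the matrix form makes it quick to look up whether any given pair is connected.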
Shortest Path Length and Cliques
[Figure: example network with nodes A through H]
Cliques
[Figure: the Apollo 13 co-star network]
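As a rough illustration of both ideas on the Apollo 13 co-star network, the sketch below computes shortest path lengths with a breadth-first search and checks whether a group of nodes is a clique, meaning every pair in the group is directly connected. The function names are hypothetical, and edge weights are ignored (each edge counts as one step).

```python
from collections import deque
from itertools import combinations

# Unweighted adjacency sets for the Apollo 13 co-star network.
ADJ = {
    "TH": {"BP", "GS"},
    "BP": {"TH", "GS"},
    "GS": {"TH", "BP", "EH", "KB"},
    "EH": {"GS"},
    "KB": {"GS"},
}


def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def is_clique(adj, group):
    """True if every pair of nodes in the group shares an edge."""
    return all(v in adj[u] for u, v in combinations(group, 2))


print(shortest_path_lengths(ADJ, "KB"))
print(is_clique(ADJ, ["TH", "BP", "GS"]))
print(is_clique(ADJ, ["TH", "GS", "KB"]))
```

Tom Hanks, Bill Paxton, and Gary Sinise form a clique because all three pairs are linked; adding Kevin Bacon breaks it, since he is linked only to Gary Sinise.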
Connectedness
Two nodes are connected if there is a path between them.
A graph is connected if there is a path between every pair of nodes.
A directed graph is strongly connected if there is a directed path from every node to every other node. It is weakly connected if there is a path between every pair of nodes when edge direction is ignored.
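These definitions can be checked mechanically. The sketch below is one simple way to do it, assuming a non-empty graph stored as a dict that maps every node to the set of nodes it points to; the function names and the three small example graphs are made up for illustration.

```python
def reachable(adj, start):
    """Set of nodes that can be reached from start by following edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen


def is_strongly_connected(adj):
    """Every node can reach every other node along directed edges."""
    nodes = set(adj)
    return all(reachable(adj, n) == nodes for n in nodes)


def is_weakly_connected(adj):
    """Connected once edge direction is ignored."""
    undirected = {n: set(nbs) for n, nbs in adj.items()}
    for n, nbs in adj.items():
        for nb in nbs:
            undirected[nb].add(n)  # add the reverse of every edge
    start = next(iter(undirected))
    return reachable(undirected, start) == set(undirected)


# Hypothetical directed graphs (every node must appear as a key).
chain = {"A": {"B"}, "B": {"C"}, "C": set()}   # A -> B -> C
cycle = {"A": {"B"}, "B": {"C"}, "C": {"A"}}   # A -> B -> C -> A
split = {"A": {"B"}, "B": set(), "C": set()}   # C is isolated

print(is_strongly_connected(chain), is_weakly_connected(chain))
print(is_strongly_connected(cycle), is_weakly_connected(cycle))
print(is_strongly_connected(split), is_weakly_connected(split))
```

For an undirected graph stored with each edge listed in both directions, the strong and weak checks coincide and either one answers whether the graph is connected.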
Hubs and Bridges
[Figure: example network with nodes A through R]
Clusters
A cluster is a group of nodes that are tightly connected.
What counts as “tightly” varies, but it usually means the group is more densely connected than the network as a whole.
A cluster does not need to be a clique.
The group in the lower left of the previous graph is a cluster.
[Figure: nodes O, P, Q, A, B, C, D, and E]
Subnetworks
[Figure: nodes Q, A, B, C, D, and E]
Egocentric Networks
[Figure: nodes Q, A, B, C, D, and E]
Network Structure
Degrees
Tom Hanks: 2
Bill Paxton: 2
Gary Sinise: 4
Kevin Bacon: 1
Ed Harris: 1
Degree Distribution
[Figure: bar chart of the number of nodes at each degree, with Degree (1 to 4) on the horizontal axis]
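Degree Distribution
[Figure: eight-node network with node degrees 3, 2, 3, 4, 3, 3, 4, 2]
[Figure: bar chart of the number of nodes at each degree, with Degree (1 to 4) on the horizontal axis]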
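The degree counts and the bar-chart data can be reproduced with a few lines of code. This is a small sketch with illustrative function names; the edge list is the Apollo 13 co-star network, and the eight degree values are the ones listed for the second network.

```python
from collections import Counter

EDGES = [("TH", "BP"), ("TH", "GS"), ("BP", "GS"), ("GS", "KB"), ("GS", "EH")]


def degrees(edges):
    """Degree of each node: the number of edges touching it."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def degree_distribution(degree_values):
    """Map each degree to the number of nodes that have it."""
    return dict(Counter(degree_values))


apollo_degrees = degrees(EDGES)
print(apollo_degrees)
print(degree_distribution(apollo_degrees.values()))

# Degrees of the eight-node example network.
print(degree_distribution([3, 2, 3, 4, 3, 3, 4, 2]))
```

Plotting the distribution as a bar chart gives the figures described above: degree on the horizontal axis and the number of nodes with that degree on the vertical axis.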
Density
[Figure: the Apollo 13 co-star network]
Edges: 5
Total Possible Edges: 10
Density: 5/10 = 0.5
Density
Nodes: 8
Edges: 12
Total Possible Edges: (# Nodes * (# Nodes - 1)) / 2 = (8*7)/2 = 56/2 = 28
Density: 12/28 = 0.43
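The density arithmetic for both examples can be written as a short function. This is a sketch for undirected networks without self-loops; the function name and the error handling for very small networks are assumptions made here.

```python
def density(num_nodes, num_edges):
    """Edges present divided by edges possible in an undirected network."""
    if num_nodes < 2:
        raise ValueError("density needs at least two nodes")
    possible = num_nodes * (num_nodes - 1) / 2
    return num_edges / possible


print(density(5, 5))             # Apollo 13 co-star network
print(round(density(8, 12), 2))  # eight-node example
```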
Clustering Coefficient
A node's clustering coefficient is the density of its 1.5-degree egocentric network with the node itself excluded, that is, the density of the connections among its neighbors.
It is an important measure that we will see again later on.
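A minimal sketch of this definition on the Apollo 13 co-star network follows. The function name is illustrative, and returning 0.0 for a node with fewer than two neighbors is a convention chosen here, since the density of a group with fewer than two nodes is undefined.

```python
from itertools import combinations

ADJ = {
    "TH": {"BP", "GS"},
    "BP": {"TH", "GS"},
    "GS": {"TH", "BP", "EH", "KB"},
    "EH": {"GS"},
    "KB": {"GS"},
}


def clustering_coefficient(adj, node):
    """Density of the edges among a node's neighbors (node itself excluded)."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: undefined case reported as 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


for name in ADJ:
    print(name, round(clustering_coefficient(ADJ, name), 3))
```

Tom Hanks scores 1.0 because his two neighbors are linked to each other, while Gary Sinise scores 1/6 because only one of the six possible pairs among his four neighbors is linked.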
Which Node is Most Important?
[Figure: example network with nodes A through R]
Closeness Centrality
[Figure: network with nodes C, D, F, G, H, I, J, O, P, Q, and R]
Shortest path lengths from F: P 1, O 1, D 3, C 2, Q 2, R 1, G 2, J 2, I 2, H 2
Average = 18/10 = 1.8
Shortest path lengths from R: P 2, O 2, D 4, C 3, Q 3, F 1, G 1, J 1, I 1, H 1
Average = 19/10 = 1.9
Shortest path lengths from P: F 1, O 1, D 2, C 2, Q 1, R 2, G 3, J 3, I 3, H 3
Average = 21/10 = 2.1
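The calculation on these slides is the average shortest path length from one node to every other node. The edges of the slide's network cannot be recovered from the transcript, so the sketch below applies the same calculation to the Apollo 13 co-star network instead. The function name is illustrative, edges are treated as unweighted, and the network is assumed to be connected.

```python
from collections import deque

ADJ = {
    "TH": {"BP", "GS"},
    "BP": {"TH", "GS"},
    "GS": {"TH", "BP", "EH", "KB"},
    "EH": {"GS"},
    "KB": {"GS"},
}


def average_distance(adj, source):
    """Average shortest path length from source to every other node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    others = [d for n, d in dist.items() if n != source]
    return sum(others) / len(others)


for name in ADJ:
    print(name, average_distance(ADJ, name))
```

With this form of the measure, a lower value means a more central node: Gary Sinise is one step from everyone (1.0), while Kevin Bacon averages 1.75.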
Degree Centrality
R = 9
F = 3
D = 5
B = 4
[Figure: example network with nodes A through R]
Betweenness Centrality
Measure of a node’s influence
Percentage of shortest paths that include a given node
[Figure: example network with nodes A through H]
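One way to turn the "percentage of shortest paths" idea into a number is sketched below for the Apollo 13 co-star network. It is a simple all-pairs version written for clarity rather than speed, and the function names are illustrative. For each pair of other nodes it takes the fraction of their shortest paths that pass through the given node, then averages over all connected pairs; other normalizations are also in use.

```python
from collections import deque
from itertools import combinations

ADJ = {
    "TH": {"BP", "GS"},
    "BP": {"TH", "GS"},
    "GS": {"TH", "BP", "EH", "KB"},
    "EH": {"GS"},
    "KB": {"GS"},
}


def bfs_counts(adj, source):
    """Return (dist, paths): distance and number of shortest paths from source."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                paths[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:
                paths[nb] += paths[node]
    return dist, paths


def betweenness(adj, node):
    """Average fraction of shortest paths between other pairs that use node."""
    info = {n: bfs_counts(adj, n) for n in adj}
    dist_v, paths_v = info[node]
    total, pairs = 0.0, 0
    for s, t in combinations([n for n in adj if n != node], 2):
        dist_s, paths_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected
        pairs += 1
        # node lies on a shortest s-t path if the two legs add up exactly
        if node in dist_s and dist_s[node] + dist_v[t] == dist_s[t]:
            total += paths_s[node] * paths_v[t] / paths_s[t]
    return total / pairs if pairs else 0.0


for name in ADJ:
    print(name, round(betweenness(ADJ, name), 3))
```

Gary Sinise lies on the shortest path for five of the six pairs of other actors, so he scores 5/6; everyone else scores 0.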
Eigenvector Centrality
Measure of a node’s importance
Iterative matrix computation that gives more weight to nodes if they are connected to influential nodes.
The basis of techniques like Google’s PageRank, which ranks web pages.
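The iterative computation can be sketched as a power iteration: start with equal scores, repeatedly replace each node's score with the sum of its neighbors' scores, and rescale until the scores stop changing. The version below is a plain illustration on the unweighted Apollo 13 co-star network, not PageRank itself. The function name, starting scores, and stopping rule are choices made here, and the network is assumed to be connected and not bipartite so that the iteration settles.

```python
import math

ADJ = {
    "TH": {"BP", "GS"},
    "BP": {"TH", "GS"},
    "GS": {"TH", "BP", "EH", "KB"},
    "EH": {"GS"},
    "KB": {"GS"},
}


def eigenvector_centrality(adj, max_iter=1000, tol=1e-10):
    """Power iteration: a node's score is driven by its neighbors' scores."""
    score = {n: 1.0 for n in adj}
    for _ in range(max_iter):
        # Each node collects the current scores of its neighbors.
        new = {n: sum(score[nb] for nb in adj[n]) for n in adj}
        # Rescale so the scores have unit length.
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        change = max(abs(new[n] - score[n]) for n in adj)
        score = new
        if change < tol:
            break
    return score


scores = eigenvector_centrality(ADJ)
for name, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(name, round(value, 3))
```

Tom Hanks and Bill Paxton end up with higher scores than Ed Harris and Kevin Bacon, even though all four are neighbors of the top-ranked Gary Sinise, because they are also connected to each other.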
Connectivity and Cohesion
Cohesion: the minimum number of nodes that must be removed before the network becomes disconnected.
[Figure: network with nodes A through H] Cohesion = 1
[Figure: network with nodes A through G] Cohesion = 2
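For small networks, cohesion can be found by brute force: try removing every group of nodes, smallest groups first, and stop at the first removal that disconnects what is left. The sketch below does this for the Apollo 13 co-star network and for a hypothetical four-node ring. The function names are illustrative, and returning (number of nodes - 1) for a network that can never be disconnected is a convention assumed here. The search is exponential, so it only suits small examples.

```python
from itertools import combinations

APOLLO = {
    "TH": {"BP", "GS"},
    "BP": {"TH", "GS"},
    "GS": {"TH", "BP", "EH", "KB"},
    "EH": {"GS"},
    "KB": {"GS"},
}
# Hypothetical four-node ring: A - B - C - D - A.
RING = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}


def is_connected(adj, removed=frozenset()):
    """True if the nodes left after removal are all mutually reachable."""
    nodes = [n for n in adj if n not in removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        node = stack.pop()
        for nb in adj[node]:
            if nb not in removed and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(nodes)


def cohesion(adj):
    """Smallest number of nodes whose removal disconnects the network."""
    nodes = list(adj)
    for k in range(len(nodes) - 1):
        for group in combinations(nodes, k):
            if not is_connected(adj, set(group)):
                return k
    return len(nodes) - 1  # assumption: convention for complete networks


print(cohesion(APOLLO))
print(cohesion(RING))
```

Removing Gary Sinise alone cuts off Ed Harris and Kevin Bacon, so the Apollo 13 network has cohesion 1; the ring needs two removals.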
Network Exercise
Open an assigned network in a network analysis tool like Gephi or NodeXL.
Run statistics to compute centrality.
Compare different centrality measures.
Visual Analysis of Networks
What is interesting about this network?
What makes a good visualization?
Every node is visible
For every node you can count its degree
For every link you can follow it from source to destination
Clusters and outliers are identifiable
(Dunne and Shneiderman, 2009)
We can’t always do all of this, but it’s a start
Is this a good visualization?
What about this one?
And this one?
And finally, this one?
Node Size and Color
Edge Weight
Visualization Issues
Scale
Networks with too many nodes (roughly 10,000 or more) or too many edges are almost impossible to visualize
Dense networks may not reveal patterns
Example: Senate Voting Records
Filtering
Visualization Tools
Gephi
All platforms
Standalone program
NodeXL
Windows only
Plugin for Microsoft Excel
In Class Exercise
Load an assigned dataset into NodeXL or Gephi
Create a visualization, using color, size, layout, and other features to tell a story or provide an insight into the data.
Post your final visualization in a shared space
Tie Strength
Strong vs. Weak Ties
Strong ties
Trusted
Close friends and family
Weak ties
Often part of other social circles
Acquaintances, co-workers
We talk about “strong” or “weak” ties, but in reality, there is a continuous spectrum
Mark Granovetter
Foundational work in 1973, “The Strength of Weak Ties”
Strong ties had been considered most important
His work showed weak ties mattered
Getting a Job
Carl Y. was doing commission sales for an encyclopedia firm, but was not doing well. He decided he would have to find a different job; meanwhile, he started driving a cab to bring in extra money. One passenger asked to be taken to the train station where he had to meet a friend. This friend turned out to be an old friend of Carl Y.'s, and asked him "what're you doing driving a cab?" When Mr. Y. explained, the friend offered him the job he now holds—labor relations manager for a small company, owned by his friend. (Granovetter, 1974, p34)
Granovetter, M. 1974. Getting a Job: A Study of Contacts and Careers.
Getting a Job
George C. was working as a technician for an electrical firm, with a salary of about $8000, and little apparent chance for advancement. While courting his future wife, he met her downstairs neighbor, the manager of a candy shop, a concession leased from a national chain. After they were married, Mr. C. continued to see him when visiting his mother-in-law. The neighbor finally talked him into entering a trainee program for the chain, and arranged an interview for him. Within three years, Mr. C. was earning nearly $30,000 in this business. (Granovetter, 1974, p. 49).
Getting a Job
Edward A., during high school, went to a party given by a girl he knew. There, he met her older sister's boyfriend, who was ten years older than himself. Three years later, when he had just gotten out of the service, he ran into him in a local hangout. In conversation, the boyfriend mentioned to Mr. A. that his company had an opening for a draftsman. Mr. A. applied for this job and was hired. (Granovetter, 1974, p. 76)
Replicating Milgram’s Six Degrees
Booklets are passed from the original participants, through their contacts, toward a target person they do not know
Lin et al. showed that successful chains made heavy use of weak ties
Weak Ties in Use
Racial integration in schools
Job satisfaction in psych hospital
The benefits of weak ties
Connect people to different social circles, exposing them to more information
A person has many more weak ties than strong ties
The network of strong ties
Measuring Tie Strength
Time
Emotional Intensity
Intimacy
Reciprocal Services
Measuring Tie Strength
Additional Features
Social Distance
Structural
Emotional Support
Quantifying Measurements
Time
Emotional Intensity
Intimacy
Reciprocal Services
Social Distance
Structural
Emotional Support
Measurement Overlaps
Network Structure – Forbidden Triad
Network Structure - Bridges
Tie Strength and Propagation
Strong ties – more trusted
Weak ties – wider spread