OPTIMAL CONNECTIONS: STRENGTH AND DISTANCE IN VALUED GRAPHS

OPTIMAL CONNECTIONS: STRENGTH AND DISTANCE IN VALUED GRAPHS Song Yang and David Knoke SOCI 5013: Advanced Social Research, Spring 2004


Page 1: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

OPTIMAL CONNECTIONS: STRENGTH AND DISTANCE IN VALUED GRAPHS

Song Yang and David Knoke

SOCI 5013: Advanced Social Research, Spring 2004

Page 2: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

RESEARCH QUESTION

• How to identify optimal connections, that is, direct or indirect paths between dyads that permit the highest exchange volumes while taking into account the actors’ costs of interaction?

Page 3: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

CONNECTIONS IN BINARY GRAPHS

• A graph is depicted in two dimensions by a set of nodes representing actors and a set of lines representing the direct ties between pairs of actors (dyads).

• We concentrate on undirected, symmetric graphs that reflect mutual interactions. Marriages between persons and contracts between corporations are two good cases in point: if A is married to B, B must be married to A as well.

Page 4: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

BINARY GRAPHS

• In binary graphs, the presence of a connection between a pair of nodes is indicated by a constant value of 1. In contrast, the absence of a connection is indicated by a value of 0.

• In a graph, a path is a set of distinct nodes and lines that connect a specific pair of nodes. The length of a path is the number of lines in it. The path distance between two nodes is defined as the length of the shortest path between them.
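As a minimal sketch of path distance, breadth-first search can count the lines on a shortest path in a binary graph. The line list below is an assumption reconstructed from the five-node example used in these slides (the original figure does not survive in this transcript), and `LINES`, `build_adjacency`, and `path_distance` are illustrative names only.

```python
from collections import deque

# Assumed lines of the five-node example graph (undirected, every line = 1).
LINES = [("A", "B"), ("A", "E"), ("B", "E"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(lines):
    """Map each node to the set of nodes it is directly tied to."""
    adjacency = {}
    for u, v in lines:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def path_distance(lines, source, target):
    """Return the number of lines on a shortest path, or None if unreachable."""
    adjacency = build_adjacency(lines)
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distances[node]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return None
```

Under these assumptions, `path_distance(LINES, "A", "B")` is 1, because A and B are directly tied.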

Page 5: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

BINARY GRAPHS

• In binary graphs, path distance is normally used to indicate the optimal connections between a pair of nodes. This solution assumes that intermediaries are costly. If more intermediaries are necessary to connect a pair of actors, they may extract higher commissions for their services, distort the information content exchanged, and increase the time required to complete a transaction.

Page 6: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

BINARY GRAPHS: EXAMPLE FOR THE DYAD AB

PATH         LENGTH   PATH LENGTH/OPTIMAL CONNECTION
A-B          1        1
A-E-B        2        N/A
A-E-D-C-B    4        N/A

[Figure: binary graph with nodes A, B, C, D, and E; each of its six lines has a value of 1.]

Page 7: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

VALUED GRAPHS

• A valued graph is defined as a graph whose lines carry numerical values indicating the intensities of the relationships between dyads.

• These numbers typically represent frequencies or durations of interactions among social actors.

• Examples include volumes of communications, levels of friendship and trust, and dollar amounts of economic transactions.

Page 8: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

VALUED GRAPHS

• For organizations engaging in strategic alliances, a valued graph might indicate the numbers of distinct partnerships formed between each pair.

• In valued graphs, path length cannot be used to indicate the optimal connection, because the shortest path is harder to identify once lines carry different values.

Page 9: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

VALUED GRAPHS

• Previous researchers have proposed two solutions for measuring optimal connections in valued graphs. In the first, Peay (1980) states that path value, defined as the smallest value attached to any line in a path, indicates the optimal path between a pair of nodes.
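Peay's path value can be sketched in a few lines. The line values are assumptions taken from the five-node example in these slides: the values for A-B, A-E, B-E, and B-C follow from the example tables, but which of C-D and D-E carries 6 and which carries 4 is a guess, because the figure is lost. The names `VALUES`, `line_value`, and `path_value` are illustrative.

```python
# Assumed line values for the five-node example (undirected graph).
# The C-D = 6 / D-E = 4 assignment is a guess; swapping them does not
# change any path value computed below.
VALUES = {
    ("A", "B"): 1, ("A", "E"): 3, ("B", "E"): 3,
    ("B", "C"): 2, ("C", "D"): 6, ("D", "E"): 4,
}


def line_value(values, u, v):
    """Look up the value of the undirected line between u and v."""
    return values[(u, v)] if (u, v) in values else values[(v, u)]


def path_value(values, path):
    """Peay's path value: the smallest line value along the path."""
    return min(line_value(values, u, v) for u, v in zip(path, path[1:]))


for path in ("AB", "AEB", "AEDCB"):
    print(path, path_value(VALUES, path))
```

With these values the three paths between A and B have path values 1, 3, and 2.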

Page 10: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

VALUED GRAPHS: EXAMPLE FOR THE DYAD AB

PATH         PATH VALUE/OPTIMAL CONNECTION
A-B          1
A-E-B        3
A-E-D-C-B    2

[Figure: valued graph with nodes A, B, C, D, and E and line values 1, 3, 3, 2, 4, and 6.]

Page 11: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Problems

• Peay’s path value solution raises two problems:

• How should the path value/optimal connection be determined when multiple paths, and thus multiple path values, are present between two nodes?

• How should the transaction costs of exchanges involving many go-betweens be taken into account?

Page 12: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Flament Solution

• Flament (1963) uses path length, defined as the sum of the values of the lines included in a path, to represent the optimal connection between a pair of nodes.
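Flament's path length is the same kind of computation with a sum in place of a minimum. The sketch below reuses line values assumed from the five-node example in these slides (the split of 6 and 4 between C-D and D-E is a guess, since the figure is lost); `VALUES` and `flament_path_length` are illustrative names.

```python
# Assumed line values for the five-node example (undirected graph).
# Which of C-D and D-E is 6 and which is 4 is a guess; the sums below
# do not depend on that choice.
VALUES = {
    ("A", "B"): 1, ("A", "E"): 3, ("B", "E"): 3,
    ("B", "C"): 2, ("C", "D"): 6, ("D", "E"): 4,
}


def line_value(values, u, v):
    """Look up the value of the undirected line between u and v."""
    return values[(u, v)] if (u, v) in values else values[(v, u)]


def flament_path_length(values, path):
    """Flament's path length: the sum of the line values along the path."""
    return sum(line_value(values, u, v) for u, v in zip(path, path[1:]))


for path in ("AB", "AEB", "AEDCB"):
    print(path, flament_path_length(VALUES, path))
```

The sums come out as 1, 6, and 15 for the three paths between A and B.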

Page 13: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Flament Solution: EXAMPLE FOR THE DYAD AB

PATH         PATH LENGTH/OPTIMAL CONNECTION
A-B          1
A-E-B        6
A-E-D-C-B    15

[Figure: valued graph with nodes A, B, C, D, and E and line values 1, 3, 3, 2, 4, and 6.]

Page 14: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

The Problems with Flament’s Path Length Solution

• There is no standard for deciding which of Flament’s path lengths represents the optimal connection, that is, whether larger or smaller path lengths should be viewed as optimal for connecting dyads.

• If larger values indicate the optimal connection, then a high number can result when either (1) the lines in a path have high values, or (2) a path has many lines with low values that sum to a large total. The solution fails when the second situation occurs.

Page 15: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

More Problems

• If instead lower values represent the optimal connection, then a low number can result when either (1) the lines in a path have low values, or (2) a path has few lines that add up to a small value. The path length by itself does not reveal which situation produced it.

Page 16: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

OUR SOLUTION

• Bring binary distance back into the equation. We argue that including binary distance is especially crucial for measuring path strength in a valued graph because it takes into account the costs (in time, energy, or decay of information) required for indirectly connected dyads to reach one another through varying numbers of intermediaries.

Page 17: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

OUR SOLUTION

• We now formally define two measures of path strength applicable to valued graphs. A valued graph G consists of three sets of information:

• A set of nodes $N = \{n_1, n_2, \ldots, n_g\}$

• A set of lines between pairs of nodes $L = \{l_1, l_2, \ldots, l_g\}$

• A set of values attached to the lines $V = \{v_1, v_2, \ldots, v_g\}$

Page 18: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

OUR SOLUTION

• A path between nodes $n_i$ and $n_j$ consists of a sequence of distinct lines connecting the pair, either directly or through one or more intermediaries, expressed as:

• $\{l_{i,i+1}, l_{i+1,i+2}, \ldots, l_{j-2,j-1}, l_{j-1,j}\}$

• The dual subscripts indicate the origin and terminus nodes of each line.

Page 19: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

OUR SOLUTION

• The minimum value $M_{ij}$ of a path between nodes $n_i$ and $n_j$ is the smallest value of any line in that path, indicated as

• $M_{ij} = \min(v_{i,i+1}, v_{i+1,i+2}, \ldots, v_{j-2,j-1}, v_{j-1,j})$

• Notice that $M_{ij}$ is actually Peay’s path value.

Page 20: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

OUR SOLUTION

• The distance $D_{ij}$ of that path is the total number of lines in it, where each line counts as a value of one, indicated as

• $D_{ij} = l_{i,i+1} + l_{i+1,i+2} + \cdots + l_{j-2,j-1} + l_{j-1,j}$

• Note that this sum is identical to distance in a corresponding binary graph, obtained by counting the number of lines in a path connecting nodes $n_i$ and $n_j$.

Page 21: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

APV

• Then, a measure of average path value (APV) between nodes $n_i$ and $n_j$ is the ratio of path value to distance, indicated by

$$APV_{ij} = \frac{M_{ij}}{D_{ij}}$$

Page 22: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

APV

• Note that a pair of nodes may be connected by multiple paths and thus have multiple APVs. We suggest that the highest APV indicates the optimal connection between the pair of nodes because it permits the highest volume of transactions/messages/contracts/treaties/friendships after controlling for the binary distance between the two nodes.
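One way to make the rule concrete is a brute-force sketch that enumerates every path between two distinct nodes, computes each APV, and keeps the highest. It is only practical for small graphs. The line values are assumptions drawn from the five-node example in these slides (the split of 6 and 4 between C-D and D-E is a guess), and all function names are illustrative.

```python
# Assumed line values for the five-node example (undirected graph).
VALUES = {
    ("A", "B"): 1, ("A", "E"): 3, ("B", "E"): 3,
    ("B", "C"): 2, ("C", "D"): 6, ("D", "E"): 4,
}


def line_value(values, u, v):
    """Look up the value of the undirected line between u and v."""
    return values[(u, v)] if (u, v) in values else values[(v, u)]


def neighbors(values, node):
    """Yield every node directly tied to the given node."""
    for u, v in values:
        if u == node:
            yield v
        elif v == node:
            yield u


def simple_paths(values, source, target):
    """Yield every path from source to target that visits no node twice."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        if path[-1] == target:
            yield path
            continue
        for nxt in neighbors(values, path[-1]):
            if nxt not in path:
                stack.append(path + [nxt])


def apv(values, path):
    """Average path value: minimum line value divided by number of lines."""
    minimum = min(line_value(values, u, v) for u, v in zip(path, path[1:]))
    return minimum / (len(path) - 1)


def optimal_connection(values, source, target):
    """Return the path with the highest APV (source and target must differ)."""
    return max(simple_paths(values, source, target), key=lambda p: apv(values, p))
```

For the dyad AB, `optimal_connection(VALUES, "A", "B")` selects the path A-E-B, whose APV is 3/2 = 1.5.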

Page 23: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

AN ILLUSTRATION

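Path of AB   Binary Distance   Path Value   APV
AB           1*                1            1.00
AEB          2                 3*           1.50*
AEDCB        4                 2            0.50

(* marks the value that each measure treats as optimal.)

[Figure: valued graph with nodes A, B, C, D, and E and line values 1, 3, 3, 2, 4, and 6.]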

Page 24: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

IMPLEMENTATION ISSUES

• Unfortunately, available social network software (UCINET) does not compute optimal connections according to our solution. Consider the following example.

Page 25: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

IMPLEMENTATION ISSUES

Dyad BC   Binary Distance   Path Value   APV
BC        1*                2            2.00*   (our optimal connection)
BEDC      3                 3*           1.00    (UCINET solution)
BAEDC     4                 1            0.25

[Figure: valued graph with nodes A, B, C, D, and E and line values 1, 3, 3, 2, 4, and 6.]

Page 26: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Why Differ?

• Why does UCINET choose a different result to represent the optimal connection? Its algorithm works as follows:

• It finds the highest path value among the multiple paths between a pair of nodes and treats that path as the optimal path.

• In our example, UCINET picks 3 for the path BEDC, treating it as the optimal path connecting the dyad BC.

Page 27: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Why Differ?

• It then calculates the binary distance of the path it has just selected between the pair of nodes.

• In our example, that distance is 3 for the path BEDC.

• Finally, it divides the highest path value by that binary distance and reports the result as the APV. In our example, this is 3/3 = 1.

Page 28: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

But We Want

• Find the path values for all the paths between a dyad.

• Calculate the binary distances for all the paths.

• Divide each path value by its binary distance, producing multiple APVs for a dyad.

• Pick the highest APV to represent the optimal connection between the dyad, which is 2/1 = 2 in our example.
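The two procedures can be compared side by side in a short sketch. The first function follows the steps these slides attribute to UCINET; it is not UCINET's actual code, and breaking ties between equal path values by preferring the shorter path is an added assumption. The line values are assumed from the five-node example (the split of 6 and 4 between C-D and D-E is a guess), and all names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Assumed line values for the five-node example (undirected graph).
VALUES = {
    ("A", "B"): 1, ("A", "E"): 3, ("B", "E"): 3,
    ("B", "C"): 2, ("C", "D"): 6, ("D", "E"): 4,
}


def line_value(u, v):
    """Look up the value of the undirected line between u and v."""
    return VALUES[(u, v)] if (u, v) in VALUES else VALUES[(v, u)]


def simple_paths(source, target):
    """Yield every path from source to target that visits no node twice."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        if path[-1] == target:
            yield path
            continue
        for u, v in VALUES:
            if path[-1] in (u, v):
                nxt = v if path[-1] == u else u
                if nxt not in path:
                    stack.append(path + [nxt])


def path_value(path):
    """Smallest line value along the path."""
    return min(line_value(u, v) for u, v in zip(path, path[1:]))


def distance(path):
    """Number of lines in the path."""
    return len(path) - 1


def apv_as_described_for_ucinet(source, target):
    """Pick the path with the highest path value first, then divide once.

    Ties on path value go to the shorter path (an assumption).
    """
    best = max(simple_paths(source, target),
               key=lambda p: (path_value(p), -distance(p)))
    return Fraction(path_value(best), distance(best))


def apv_proposed(source, target):
    """Compute an APV for every path, then keep the highest one."""
    return max(Fraction(path_value(p), distance(p))
               for p in simple_paths(source, target))


def disagreements():
    """List the dyads on which the two procedures give different results."""
    nodes = sorted({node for line in VALUES for node in line})
    return [(a, b) for a, b in combinations(nodes, 2)
            if apv_as_described_for_ucinet(a, b) != apv_proposed(a, b)]
```

For the dyad BC the first procedure returns 1 and the second returns 2; under the assumed line values, BC is the only dyad on which they disagree.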

Page 29: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Consequences

• This difference between UCINET and our solution in computing optimal connections produces only one discrepancy in our example with five nodes and 10 dyads. Here 5!/(3! × 2!) = 10 is the maximum number of dyads among 5 actors.

Page 30: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Consequences

• However, social scientists rarely deal with a 5-by-5 matrix. Instead, many matrices contain tens, hundreds, or even thousands of actors, forming much larger symmetric matrices.

• Suppose we have a matrix with 100 actors. It can have a maximum of 100!/(2! × 98!) = 4,950 dyads. If UCINET and our solution disagree on 10% of the dyads, we would expect 495 discrepancies between the UCINET output and our expected output, which is much less tolerable.
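The dyad counts above are binomial coefficients, which a short sketch can check. The function names are illustrative, and the 10% disagreement rate is the hypothetical figure from the slide, not a measured result.

```python
from math import comb


def max_dyads(actors):
    """Maximum number of dyads among the given number of actors: C(n, 2)."""
    return comb(actors, 2)


def expected_discrepancies(actors, disagreement_percent):
    """Dyads expected to differ at a given whole-number percentage rate."""
    return max_dyads(actors) * disagreement_percent // 100


print(max_dyads(5))                     # 10
print(max_dyads(100))                   # 4950
print(expected_discrepancies(100, 10))  # 495
```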

Page 31: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

POSSIBLE SOLUTION

• Devise your own algorithm.

• Shortest-path algorithms such as Dijkstra’s algorithm or the Floyd-Warshall algorithm are not sufficient by themselves, but they provide a solid base for solving our problem.

• Implement the algorithm in any language, such as C, C++, Java, or FORTRAN.
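As one possible starting point, and not the algorithm of any cited work, the sketch below avoids enumerating paths. It uses a hop-indexed max-min dynamic program: for each number of lines k it tracks the best minimum line value over walks of exactly k lines, then divides by k. The reasoning assumed here is that removing a cycle from a walk never lowers its minimum value and never adds lines, so the best walk-based ratio equals the best APV over paths. Line values are assumed positive (0 marks a missing line) and are taken from the five-node example in these slides, with the C-D and D-E values guessed; `best_apv` is an illustrative name.

```python
from fractions import Fraction

# Assumed line values for the five-node example (undirected graph).
VALUES = {
    ("A", "B"): 1, ("A", "E"): 3, ("B", "E"): 3,
    ("B", "C"): 2, ("C", "D"): 6, ("D", "E"): 4,
}


def best_apv(values):
    """Return {(i, j): highest APV} for every connected ordered pair i != j."""
    nodes = sorted({node for line in values for node in line})
    # direct[i][j] is the line value between i and j, or 0 if no line exists.
    direct = {i: {j: 0 for j in nodes} for i in nodes}
    for (u, v), value in values.items():
        direct[u][v] = direct[v][u] = value

    # width[i][j]: best minimum line value over walks of exactly k lines.
    width = {i: dict(direct[i]) for i in nodes}
    result = {}
    for k in range(1, len(nodes)):  # a path has at most g - 1 lines
        for i in nodes:
            for j in nodes:
                if i != j and width[i][j] > 0:
                    candidate = Fraction(width[i][j], k)
                    if candidate > result.get((i, j), 0):
                        result[(i, j)] = candidate
        # Extend every walk by one line: max over m of min(width[i][m], direct[m][j]).
        width = {
            i: {j: max(min(width[i][m], direct[m][j]) for m in nodes)
                for j in nodes}
            for i in nodes
        }
    return result
```

On the example graph this gives 3/2 for the dyad AB and 2 for the dyad BC, matching the brute-force results, at a cost that grows polynomially with the number of nodes.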

Page 32: OPTIMAL CONNECTIONS:  STRENGTH AND DISTANCE IN VALUED GRAPHS

Solution

• Yang and Hexmoor (2004) devised a suitable algorithm and implemented it in several Java programs.

• A classroom illustration of the software will follow if time permits.