Visual Analysis Algebra

Anna Shaverdian, Hao Zhou, H. V. Jagadish, George Michailidis
University of Michigan


Find a criminal network within a network? 50 different solutions, 5-minute videos to explain the process, pages of text…

Desired Features in Visual Analysis

• Mix and match ideas from multiple projects
• Compare/validate tools and techniques
• Document and reproduce results from another’s visual analysis
  – Not ambiguous
  – Not wordy
• Optimize techniques

Visual Analysis Algebra

• Graph Model
• Predicate/Witness
• Graph Matching Function
• Operators
  – Selection
  – Labeling
  – Aggregation
• Helper Functions
  – Visual Operators

Graph Model

• Attributed Graph: D = [G, X]
• Graph: G = (V, E)
  – Each node is assigned a unique id through the λ(vertex) function
  – Allows directed, multi-edge graphs
    • (Direction captured as an edge attribute)
• Attributes: X = (XV, XE, XG)
  – Each attribute has a name, type, and value
• Attributes can be intrinsic or computed
  – Intrinsic: independent features which stay constant if the graph topology changes
  – Computed: created through composition functions
    • Examples: degree, betweenness, centrality
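A minimal Python sketch of the graph model above, offered as one possible encoding rather than the authors' implementation. The class name `AttributedGraph`, its field names, and the helper `compute_degree` are hypothetical. Edges are keyed by an edge id so that multi-edges are allowed, and degree is stored as a computed node attribute.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """D = [G, X]: a graph G = (V, E) plus attributes X = (XV, XE, XG)."""
    nodes: set = field(default_factory=set)           # V: unique node ids
    edges: dict = field(default_factory=dict)         # E: edge id -> (node, node)
    node_attrs: dict = field(default_factory=dict)    # XV: node id -> {name: value}
    edge_attrs: dict = field(default_factory=dict)    # XE: edge id -> {name: value}
    graph_attrs: dict = field(default_factory=dict)   # XG: {name: value}


def compute_degree(d):
    """Add the computed attribute 'degree' to every node of d."""
    degree = {v: 0 for v in d.nodes}
    for u, v in d.edges.values():
        degree[u] += 1
        degree[v] += 1
    for v, k in degree.items():
        d.node_attrs.setdefault(v, {})["degree"] = k
    return d


# Toy graph with a multi-edge between nodes 1 and 2 (made-up data).
d = AttributedGraph(
    nodes={1, 2, 3},
    edges={"e1": (1, 2), "e2": (1, 2), "e3": (2, 3)},
    node_attrs={1: {"name": "John"}},   # intrinsic attribute
)
compute_degree(d)
```

The intrinsic attribute `name` is untouched by topology changes, while `degree` has to be recomputed whenever edges are added or removed.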

Example Graph Model

• Cell Phone Network: a node represents a phone and an edge represents a call between two phones
  – D = [G = (V, E), X = (XV = {XphoneID}, XE = {Xdate, Xduration, Xtower, XcallerID}, XG = {})]
    • Initial data set with intrinsic attributes
• Perform operations on sets of attributed graphs (closed algebra)
  – {Dday1, Dday2, …, Dday10}

Predicate Definition

• p = (V, E, XV, XE, XG, !E)
• V, E describe the graph structure
• XV, XE, XG describe the conditions on the attributes in V, E
  – Example: Xv.weight.node12 < Xv.weight.node10 in XV
• !E describes the excluded edges
  – An edge e1 in !E doesn’t exist in the graph G and, given a closed universe U, for all S where G is a subgraph of S, e1 doesn’t exist in S either

Witness

• An attributed graph for which
  – There exists a bijective mapping between its nodes and the predicate’s nodes
  – The predicate’s conditions all hold on its node, edge, and graph attributes

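[Figure: a predicate on nodes 1, 2, 3 with conditions “Degree < 40, Name = John” and “Weight > node 2’s Weight”, matched by a witness on nodes 5, 8, 9 with attributes “Degree = 12, Name = John, Weight = 200lbs” and “…, Weight = 300lbs”.]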
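The predicate and witness definitions can be sketched as a check on one candidate node mapping. This is an illustrative sketch, not the authors' code: `Predicate`, `is_witness`, the undirected edge handling, and the toy edges are all assumptions. The conditions loosely follow the figure, assuming predicate node 2 is the "John" node and node 3 must outweigh node 2.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    nodes: list                                       # V: predicate node ids
    edges: list                                       # E: (u, v) pairs over V
    node_conds: list = field(default_factory=list)    # XV: callables on node attributes
    excluded: list = field(default_factory=list)      # !E: (u, v) pairs that must be absent


def is_witness(pred, graph_edges, node_attrs, mapping):
    """Check one candidate mapping {predicate node -> graph node}."""
    # The mapping must cover every predicate node and be one-to-one.
    if set(mapping) != set(pred.nodes) or len(set(mapping.values())) != len(mapping):
        return False
    present = {frozenset(e) for e in graph_edges}     # undirected, for simplicity
    if any(frozenset((mapping[u], mapping[v])) not in present for u, v in pred.edges):
        return False
    if any(frozenset((mapping[u], mapping[v])) in present for u, v in pred.excluded):
        return False
    # Conditions see the attributes of the mapped nodes, keyed by predicate node id.
    attrs = {p: node_attrs.get(g, {}) for p, g in mapping.items()}
    return all(cond(attrs) for cond in pred.node_conds)


pred = Predicate(
    nodes=[1, 2, 3],
    edges=[(1, 2), (2, 3)],                           # hypothetical structure
    node_conds=[
        lambda a: a[2].get("name") == "John" and a[2].get("degree", float("inf")) < 40,
        lambda a: a[3].get("weight", 0) > a[2].get("weight", 0),
    ],
)
graph_edges = [(5, 8), (8, 9)]                        # hypothetical edges
node_attrs = {8: {"name": "John", "degree": 12, "weight": 200}, 9: {"weight": 300}}

forward = is_witness(pred, graph_edges, node_attrs, {1: 5, 2: 8, 3: 9})
backward = is_witness(pred, graph_edges, node_attrs, {1: 9, 2: 8, 3: 5})
```

Here `forward` holds because node 9 outweighs node 8, while `backward` fails the weight condition even though the structure matches.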

Example Excluded Edges Witness

[Figure: a predicate with excluded edges shown next to an attributed graph.]

Graph Matching Function (γ): Subroutine used by operators

• Inputs: an attributed graph D and a predicate p
• Outputs:
  – A list of witnesses W
    • Attributes of the nodes, edges, and graph of a witness include all attributes of those respective elements in D
    • If two or more witnesses share ids and attributes (e.g., the same subgraph under a different rotation), combine them into an arbitrary one
  – A model witness X
  – Set of mapping lists of the witnesses in W to X
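A brute-force sketch of γ for small graphs, under stated assumptions: edges are treated as undirected, attribute conditions and excluded edges are left out, and the function name `gamma` and its return shape are hypothetical. It enumerates node mappings, keeps those whose predicate edges all exist in D, and collapses witnesses that share the same nodes and edges.

```python
from itertools import permutations


def gamma(nodes, edges, pred_nodes, pred_edges):
    """Return (witnesses W, model witness X, mapping lists) for a structural predicate."""
    present = {frozenset(e) for e in edges}
    witnesses, mappings, seen = [], [], set()
    for image in permutations(sorted(nodes), len(pred_nodes)):
        m = dict(zip(pred_nodes, image))              # predicate node -> graph node
        matched = frozenset(frozenset((m[u], m[v])) for u, v in pred_edges)
        if not matched <= present:
            continue                                  # some predicate edge is missing
        key = (frozenset(image), matched)
        if key in seen:
            continue                                  # same ids, different rotation
        seen.add(key)
        witnesses.append((set(image), {tuple(sorted(e)) for e in matched}))
        mappings.append({g: p for p, g in m.items()})  # graph id -> model id
    model_witness = (set(pred_nodes), set(pred_edges))
    return witnesses, model_witness, mappings


# Hypothetical structure: a triangle with a tail, searched in a five-node graph.
W, X, M = gamma(
    nodes={1, 2, 3, 4, 5},
    edges=[(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)],
    pred_nodes=[6, 7, 8, 9],
    pred_edges=[(6, 7), (7, 8), (6, 8), (8, 9)],
)
```

Two mappings satisfy the predicate here (nodes 1 and 2 can swap roles), and the duplicate check keeps only the first, so `M` holds a single mapping list.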

Graph Matching Function (γ) Example

[Figure: γ takes a predicate and an attributed graph (node attributes shown: Age = 12, Age = 16, Age = 22) and returns the model witness, the witness found (Age = 12, Age = 16), and the mappings.]

Mappings

ID   Model ID
1    6
2    7
3    8
4    9

Selection Operator σ

• There are two types of selection operators
  – Work at the attributed graph level
  – Work at the element (nodes & edges) level
• Both operate on a set of attributed graphs and output a set of graphs

Set Selection σset

• Given a set of attributed graphs D and a predicate p
• Set Selection outputs the set of graphs where there exists a witness for the predicate
• Example
  – The graphs with an average degree greater than 42
  – p = (V = {}, E = {}, XV = {}, XE = {}, XG = {Xg.averageDegree > 42}, !E = {})
  – σset,p({D1, D2, …, D10}) = D’
    • where D’ is a subset of D and, for any Di in D’, Xg.averageDegree > 42
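Because the example predicate has empty V and E, a witness exists exactly when the graph-level condition holds, so set selection reduces to a filter. The sketch below assumes simple undirected graphs stored as dictionaries; `set_select`, `average_degree`, and the two test graphs are hypothetical.

```python
from itertools import combinations


def average_degree(d):
    """Computed graph attribute: 2|E| / |V| for an undirected graph."""
    n = len(d["nodes"])
    return 2 * len(d["edges"]) / n if n else 0.0


def set_select(graphs, graph_pred):
    """σset sketch: keep the graphs on which the graph-level predicate holds."""
    return [d for d in graphs if graph_pred(d)]


# Made-up inputs: a complete graph and a ring, both on 50 nodes.
complete = {"name": "complete", "nodes": list(range(50)),
            "edges": list(combinations(range(50), 2))}
ring = {"name": "ring", "nodes": list(range(50)),
        "edges": [(i, (i + 1) % 50) for i in range(50)]}

selected = set_select([complete, ring], lambda d: average_degree(d) > 42)
```

The output is again a set of attributed graphs, which is what keeps the algebra closed.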

Element Selection σelement

• Given a set of attributed graphs and a predicate p
• σelement,p({D1, D2, …, Dn}) = ∪i Di’
  – where each Di’ = {Wi.p.1, …, Wi.p.k}, the k witnesses of predicate p found in Di
  – An attributed graph for each witness found in the set of graphs

Example Element Selection σelement

• Select a subgraph from a set of graphs
  – p = (V = {1, 2, 3}, E = {e12, e23}, XV = {}, XE = {}, XG = {}, !E = {})
  – D1 = [G = (V = {1, 2, 3, 4}, E = {e12, e14, e23, e34}), X = ({}, {}, {})]

[Figure: σelement,p({D1}) maps the four-node graph D1 to four witnesses, drawn on node sets {1, 2, 3}, {1, 3, 4}, {2, 3, 4}, and {1, 2, 4}.]
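The example above can be reproduced with a small brute-force sketch. It assumes undirected edges and no attribute conditions, and `element_select` is a hypothetical name. Each distinct witness becomes its own output graph.

```python
from itertools import permutations


def element_select(graphs, pred_nodes, pred_edges):
    """σelement sketch: one (nodes, edges) graph per distinct witness."""
    out = []
    for nodes, edges in graphs:
        present = {frozenset(e) for e in edges}
        seen = set()
        for image in permutations(sorted(nodes), len(pred_nodes)):
            m = dict(zip(pred_nodes, image))          # predicate node -> graph node
            matched = frozenset(frozenset((m[u], m[v])) for u, v in pred_edges)
            key = (frozenset(image), matched)
            if matched <= present and key not in seen:
                seen.add(key)                         # skip reversed copies of a path
                out.append((set(image), {tuple(sorted(e)) for e in matched}))
    return out


# D1 and p from the example: a four-node cycle and a two-edge path.
D1 = ({1, 2, 3, 4}, [(1, 2), (1, 4), (2, 3), (3, 4)])
witnesses = element_select([D1], pred_nodes=[1, 2, 3], pred_edges=[(1, 2), (2, 3)])
```

A cycle on four nodes contains four two-edge paths, one centered on each node, which gives the four witnesses.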

Labeling Operator

• During graph analysis, we need a way to “select” nodes of interest, mark them somehow, and continue the analysis, sometimes referring back to the marked nodes
• We do this by labeling
• Given a set of graphs and a predicate
  – We modify each graph to remember its match to the predicate

Labeling Operator

• For each attributed graph Di where there exists a witness for the predicate (using the γ function)
  – Create the model witness structure x within Di
  – Label it with a unique group id
  – For each witness wj found in Di
    • Use the mapping lists to create directed edges between wj and x
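The labeling steps can be sketched as follows, assuming the witnesses have already been found and are given as mapping lists from graph ids to model ids. The function `label`, the dictionary layout, and the field names `group` and `structure` are hypothetical, and model node ids are assumed not to collide with ids already in the graph.

```python
def label(graph, model_nodes, model_edges, mappings, group_id):
    """Embed the model witness x in the graph and link each witness to it."""
    if not mappings:                                  # no witness: graph is unchanged
        return graph
    labeled = {"nodes": set(graph["nodes"]) | set(model_nodes),
               "edges": list(graph["edges"])}
    for u, v in model_edges:                          # model witness structure x
        labeled["edges"].append({"src": u, "dst": v, "group": group_id})
    for structure_id, mapping in enumerate(mappings, start=1):
        for node, model_node in mapping.items():      # directed link: witness -> x
            labeled["edges"].append({"src": node, "dst": model_node, "directed": True,
                                     "group": group_id, "structure": structure_id})
    return labeled


# Made-up graph; the mapping follows the ID / Model ID pairs 1-6, 2-7, 3-8, 4-9.
graph = {"nodes": {1, 2, 3, 4, 5},
         "edges": [{"src": 1, "dst": 2}, {"src": 2, "dst": 3},
                   {"src": 3, "dst": 4}, {"src": 4, "dst": 5}]}
labeled = label(graph, model_nodes=[6, 7, 8, 9],
                model_edges=[(6, 7), (7, 8), (8, 9)],
                mappings=[{1: 6, 2: 7, 3: 8, 4: 9}], group_id="g1")
links = [e for e in labeled["edges"] if "structure" in e]
```

Later operators can then select on the group id to retrieve every element that matched the predicate, without rerunning the match.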

Labeling Example

[Figure: labeling applied to a predicate (nodes 6, 7, 8, 9) and an attributed graph (nodes 1, 2, 3, 4, 5; node attributes shown: Age = 12, Age = 16, Age = 22). The labeled result contains both sets of nodes.]

Each edge has a group id, to say it’s an edge to a model witness, and a structure id, to say it’s one witness found.

Example: Labeling & Visual Analysis

• Given a social network
• We have a suspected terrorist subnetwork and some features of interest
• Analyze the subgraphs that match the suspected subnetwork
  – The predicate structure isn’t the final structure we’re looking for; it’s an intermediate step
• VAST 2009 challenge

Example: Labeling & Visual Analysis

[Figure: example features of interest: Degree = 40; Geographic size = small island.]

Helper Functions

• Visual Operators
  – Ex. Feed values into a histogram, layouts, presentation
• Creating/Deleting
  – Create/Delete a set of nodes/edges/attributes
  – Copy a graph

Phone Record Case Study

In an attempt to characterize the entire network, we loaded the entire data set into MobiVis, which links people (blue nodes) if they had a phone conversation. Unfortunately, the tight connectivity of the resulting network made it impossible to find interesting patterns. Following the lead that person 200 is likely to be Ferdinando Catalano, we filtered the data to visualize only its closest nodes. Figure 1 shows the social network of person 200.

Figure 1. Overview of the social network of Ferdinando Catalano (id 200). This reflects the general social structure over, at least, the first seven days.

We can further characterize this network by looking at the links between the immediate neighbors of person 200. Persons 5, 200, 97 and 137 seem to form a clique, whereas persons 1, 2 and 3 form another. Looking at the amount of communication between those, which is depicted as the thickness of the edges, we discovered that 200 and 5 talk a lot among themselves. The color coding of the edges helps visualize the symmetry of the calls. For example, a warm color (orange) in the middle indicates a symmetric connection (both parties call each other frequently), whereas a biased orange color indicates more calls in the direction of the bias.

We then characterized the network as being the connection of the two families: the Catalanos, represented in persons 200 (Ferdinando Catalano), 5 (which we believe is Estaban Catalano, given its tight connection to 200), 97 and 137; and the Vidros, represented in persons 1, 2 and 3. We can further characterize the substructure of the Vidros as hierarchical. Although it was not evident at first, person 1 always calls persons 2 and 3, which led us to believe that he has a role of coordinator. We validated this with another capability of MobiVis, which allows us to display people in the social network according to some semantic filtering criteria. In Figure 2(a), we display the people called by person 1 and the people who called person 1. Those people who called person 1 are connected to an orange node, while people who were called by person 1 are connected to a red node. We can see that person 1 had a bi-directional communication with Ferdinando Catalano, but only in one direction with 2, 3 and 5.

Figure 2(b) shows the same analysis for person 5. We noticed an inverse behavior: 1, 2 and 3 always call 5, but not vice versa. Furthermore, it helped us characterize the social structure better. The high symmetry of communication between 200 and 5 validates our claim about their identities being those of Ferdinando and Estaban Catalano, respectively. Person 1, however, seems to coordinate the efforts of 2, 3 and 5, which suggests that he can be associated to David

Phone Record Case Study

• Original data set (10 days)
  – D = [G = (V, E), X = (XV = {XphoneID}, XE = {Xdate, Xduration, Xtower, XcallerID}, XG = {})]
• View Entire Graph
• Create 10 graphs (per day)
  – Predicate for day i calls
    • pday_i = (V = {v1, v2}, E = {e12}, XV = {}, XE = {Xe.12.day = i}, XG = {}, !E = {})
  – Labeling by day
    • μday_i({D})
  – Element Selection on day_i group
    • σelement,day_i({D}) = {D1, D2, D3, D4, D5, D6, D7, D8, D9, D10}
• View Each Graph
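The net effect of labeling by day and then selecting each day's group appears to be a partition of the call edges by day. The sketch below emulates that outcome directly rather than going through μ and σ; `split_by_day`, the record layout, and the sample calls are hypothetical.

```python
from collections import defaultdict


def split_by_day(calls):
    """Partition call edges by day, giving one attributed graph per day."""
    by_day = defaultdict(list)
    for call in calls:
        by_day[call["day"]].append(call)
    graphs = {}
    for day, day_calls in sorted(by_day.items()):
        nodes = {c["caller"] for c in day_calls} | {c["callee"] for c in day_calls}
        graphs[day] = {"nodes": nodes, "edges": day_calls}   # edge attributes are kept
    return graphs


# Made-up call records; each dict is one edge with its intrinsic attributes.
calls = [
    {"caller": 200, "callee": 5, "day": 1, "duration": 120},
    {"caller": 1, "callee": 2, "day": 1, "duration": 45},
    {"caller": 5, "callee": 200, "day": 2, "duration": 300},
]
per_day = split_by_day(calls)
```

Each per-day graph carries the same edge attributes as the original data set, so further predicates can still refer to duration or tower.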

Phone Record Case Study

• Look at pattern change in node 200’s neighborhood
  – Predicate for node 200 neighbor
    • p200Neighbor = (V = {v1, v2}, E = {e12}, XV = {Xv.1.callerID = 200}, XE = {}, XG = {}, !E = {})
  – Labeling by day
    • μ200Neighbor({D1, D2, …, D10})
  – Selection on 200 neighbor group
    • σelement,200Neighbor({D1, D2, …, D10}) = {D1’, D2’, D3’, D4’, D5’, D6’, D7’, D8’, D9’, D10’}
• Aggregate days 1-7 and days 8-10 graphs
  – Set Aggregation
    • Φset,{pdays1-7, pdays8-10}({D1’, D2’, …, D10’}) = {Dday1-7, Dday8-10}
  – Element Aggregation on CallerID
    • Φelement,{pdays1-7, pdays8-10}({Dday1-7, Dday8-10})
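The slides do not define the aggregation operator, so the following is only a guess at the workflow's intent: keep the calls that touch person 200, then, for each group of days, collapse parallel calls into a count per (caller, callee) pair. The functions `neighborhood` and `aggregate`, the record layout, and the sample calls are all hypothetical.

```python
from collections import Counter


def neighborhood(calls, person):
    """Keep only the calls in which `person` is the caller or the callee."""
    return [c for c in calls if person in (c["caller"], c["callee"])]


def aggregate(calls, day_groups):
    """For each named group of days, count calls per (caller, callee) pair."""
    return {
        name: Counter((c["caller"], c["callee"]) for c in calls if c["day"] in days)
        for name, days in day_groups.items()
    }


# Made-up call records spanning both periods.
calls = [
    {"caller": 200, "callee": 5, "day": 1},
    {"caller": 5, "callee": 200, "day": 2},
    {"caller": 200, "callee": 5, "day": 3},
    {"caller": 1, "callee": 2, "day": 3},
    {"caller": 200, "callee": 97, "day": 9},
]
near = neighborhood(calls, 200)
summary = aggregate(near, {"days1-7": range(1, 8), "days8-10": range(8, 11)})
```

Comparing the two counters side by side is one way to look for a pattern change in node 200's neighborhood between the two periods.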

Algebraic Visual Analysis: The Catalano Phone Call Data Set Case Study

• Anna Shaverdian, Hao Zhou, George Michailidis, and H. V. Jagadish, VAKD ’09
• Simulate many existing analytical workflows with operators from visual analytic algebra
• Ability to do analysis beyond existing workflows

Multiple Step Social Structure Analysis with Cytoscape

• Hao Zhou, Anna Shaverdian, H. V. Jagadish, George Michailidis, VAST ’09
• VAST ’09 Flitter Mini Challenge Award: Good Tool Adaption
• Demonstrates Cytoscape’s utility in identifying the structure in a social network