Visualization of Multidimensional Multivariate Large Dataset

Presented by:

Zhijian Pan

[email protected]

University of Maryland

Description

Covered papers:
– Alfred Inselberg, Multidimensional Detective
– Ted Mihalisin, Visualizing Multivariate Functions, Data, and Distributions

The problem:
• Visualization and analysis of large datasets with multiple parameters or factors, and the key relationships among them
• The MDMV (multidimensional multivariate) problem

Key words explanation

Multidimensional:
– The dimensionality of the independent variables
Multivariate:
– The dimensionality of the dependent variables
Example:
– 3-D volume space + temperature + pressure produces 3D2V data

The data set could be larger than the number of pixels
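The nDmV naming above can be made concrete with a minimal sketch. `DatasetShape` and its field names are hypothetical helpers invented for this illustration; only the 3D2V example itself comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetShape:
    """Hypothetical helper: names of the variables in a data set."""
    independent: tuple  # the "dimensions"
    dependent: tuple    # the "variates"

    def label(self) -> str:
        # n independent and m dependent variables give the label "nDmV".
        return f"{len(self.independent)}D{len(self.dependent)}V"


# The slide's example: a 3-D volume with temperature and pressure.
volume = DatasetShape(independent=("x", "y", "z"),
                      dependent=("temperature", "pressure"))
print(volume.label())  # 3D2V
```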

Four Stages of Development

1st: Graphical representation of either one- or two-variate data, e.g. scatterplot, scatterplot matrix
2nd: Two-dimensional graphics, but encoding multiple parameters, e.g. color, size, shape coding
3rd: High-dimensional graphics, high-speed computation, single display, such as Parallel Coords
4th: Elaboration and assessment of various visualization techniques

MDMV Visualization Category

Broadly categorized into five groups:
– Brushing
– Panel Matrix
– Iconography
– Hierarchical Displays
– Non-Cartesian Displays

Group 1

Brushing
– Direct manipulation of the MDMV visualization display: labeling, enhanced linking
– E.g. brushing a scatterplot matrix

Group 2

Panel Matrix (pairwise 2-D plot, n-D box)
– E.g. Hyperbox: n*n lines, n*(n-1)/2 faces
– Elaboration of the scatterplot matrix
– Adding interactive data navigation (hyperbox cutting)
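The n*(n-1)/2 count is the number of unordered pairs of variables, so it is also the number of distinct pairwise 2-D panels. A small sketch, assuming each hyperbox face corresponds to one such pair; the variable names are hypothetical:

```python
from itertools import combinations


def pairwise_panels(variables):
    """Return every unordered pair of variables, one 2-D panel per pair."""
    return list(combinations(variables, 2))


names = ["a", "b", "c", "d"]  # hypothetical variable names
panels = pairwise_panels(names)
n = len(names)

# The number of panels matches the n*(n-1)/2 face count on the slide.
assert len(panels) == n * (n - 1) // 2
print(len(panels), panels)
```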

Group 3

Iconography: Glyphs: graphical entities which encode MDMV data with shape, size, color, and position.
– E.g. face glyph: size and position of eyes, nose, mouth; curvature of mouth; angle of eyebrows

Group 4

Hierarchical Displays:
– Map a subset of variates into different hierarchical displays
– Dynamic interactive analysis
– The Ted Mihalisin paper; more details follow

Group 4 (cont’d)

New term: speed (the ordering of the hierarchical axes)
E.g. three variables x, y, and z, each taking values in {0, 1, 2}
x is the fastest axis, z the slowest axis
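One way to read "speed" is as loop nesting: the fastest variable sits in the innermost loop and the slowest in the outermost. The sketch below lists the 27 points in that order; the flat position formula x + 3*y + 9*z is an illustration under this reading, not something stated on the slide.

```python
from itertools import product

values = (0, 1, 2)

# itertools.product varies its LAST argument fastest, so the slowest
# axis (z) is passed first and the fastest axis (x) last.
order = [(x, y, z) for z, y, x in product(values, values, values)]

# Show the first few positions along the hierarchical axis.
for position, point in enumerate(order[:5]):
    print(position, point)

# Under this ordering the flat position of (x, y, z) is x + 3*y + 9*z.
assert all(order[x + 3 * y + 9 * z] == (x, y, z) for (x, y, z) in order)
```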

Group 4 (cont’d)

Visualizing 3 variables:
– 2 independent variables: x, y:
  • x = -2, -1, 0, 1, 2
  • y = -2, -1, 0, 1, 2
– 1 dependent variable: z = x**2 + y**2
– So, a 2D1V problem
– x fastest, y slowest
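The 2D1V example can be laid out as a flat sequence with x cycling inside each y value. This is a minimal sketch of that ordering only; it does not reproduce the panel layout of the original figure.

```python
xs = [-2, -1, 0, 1, 2]
ys = [-2, -1, 0, 1, 2]

# y is the slow (outer) axis and x the fast (inner) axis.
sequence = [(x, y, x**2 + y**2) for y in ys for x in xs]

# Group the flat sequence into one block of five points per y value.
blocks = [sequence[i:i + len(xs)] for i in range(0, len(sequence), len(xs))]
for block in blocks:
    y = block[0][1]
    print(f"y={y:+d}:", [z for _, _, z in block])
```

Each block traces the same parabola in x, shifted upward by y**2, which is the repeating pattern a hierarchical display makes visible.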

Group 4 (cont’d)

3D1V: W = (x**2) * (e**-y) + z
• Top panel speed order: x, y, z
• Bottom panel speed order: z, y, x
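The two panels show the same W values in a different order. The sketch below assumes each speed order is listed from fastest to slowest and uses hypothetical sample values {0, 1, 2} for every variable, since the slide does not state the ranges.

```python
import math
from itertools import product


def w(x, y, z):
    """The slide's 3D1V function W = x**2 * e**(-y) + z."""
    return x**2 * math.exp(-y) + z


# Hypothetical sample values; the slide does not give the ranges used.
samples = {"x": (0, 1, 2), "y": (0, 1, 2), "z": (0, 1, 2)}


def flatten(speed_order):
    """List W along the horizontal axis; speed_order runs fastest to slowest."""
    slow_to_fast = list(reversed(speed_order))
    # product varies its last argument fastest, so pass the slowest first.
    rows = product(*(samples[name] for name in slow_to_fast))
    return [w(**dict(zip(slow_to_fast, row))) for row in rows]


top = flatten(("x", "y", "z"))     # assumed: x fastest, z slowest
bottom = flatten(("z", "y", "x"))  # assumed: z fastest, x slowest
print(top[:3])
print(bottom[:3])
```

Both lists hold the same 27 values; only the order along the axis changes, which is why the two panels look so different.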

Group 4 (cont’d)

What if the number of data points greatly exceeds the number of horizontal pixels assigned to the panel?

Example: 7 independent variables, each with 10 values = 10,000,000 points

Need:
– Hierarchical subspace zooming to reduce the dimension
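The arithmetic behind the example is a product of per-variable value counts. The sketch below also shows one assumed reading of zooming, namely pinning the slowest variables to a single value each; the paper's actual zoom operation may differ.

```python
import math


def point_count(values_per_variable):
    """Number of grid points: the product of the per-variable value counts."""
    return math.prod(values_per_variable)


full = [10] * 7  # 7 independent variables, 10 values each
print(point_count(full))  # 10000000

# Assumed zoom: pin the five slowest variables to one value each,
# leaving a 2-D subspace of the original 7-D grid.
zoomed = [10, 10] + [1] * 5
print(point_count(zoomed))  # 100
```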

Group 4 (cont’d)

From 7D to 2D:

Group 4 (cont’d)

Example: experimental data visualization:
– Dependent: specific heat
– Independent:
  • Fastest: temperature (white): Gaussian peak
  • Then alloy concentration (blue): linear increase
  • Then magnetic field (red): nonlinear decrease

Group 5

Parallel Coordinates
– So many class presentations have already been done!
– Everybody is already an expert at using it
– What are some basic ideas behind it?
– Cartesian vs. Parallel Coords

Group 5 (cont’d)

A Cartesian line:
– L: x2 = m*x1 + b
– A set of points sampled on this line

• On Parallel Coords:
– Each point becomes a line
– The set of points becomes a set of intersecting lines
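This point-to-line duality can be checked numerically. The sketch below assumes the two parallel axes sit at horizontal positions 0 and d, and uses a hypothetical line with m = -1 and b = 3; the function names are made up for the illustration.

```python
def to_parallel(x1, x2, d=1.0):
    """Map the Cartesian point (x1, x2) to the line through (0, x1) and
    (d, x2) in the parallel-coordinates plane, as (slope, intercept)."""
    return ((x2 - x1) / d, x1)


def intersect(line_a, line_b):
    """Intersection of two non-parallel lines given as (slope, intercept)."""
    (sa, ia), (sb, ib) = line_a, line_b
    X = (ib - ia) / (sa - sb)
    return (X, sa * X + ia)


m, b = -1.0, 3.0  # hypothetical line L: x2 = m*x1 + b

# Sample points on L and map each one to its parallel-coordinates line.
lines = [to_parallel(x1, m * x1 + b) for x1 in (0.0, 1.0, 2.0, 4.0)]

# Every pair of these lines meets at the same point.
points = [intersect(lines[0], other) for other in lines[1:]]
print(points)
```

All intersections coincide, so the whole Cartesian line is represented by that single point.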

Group 5 (cont’d)

The intersection point:

The location of the intersection point is important!
– Between the two axes: inversely proportional (x1 ∝ 1/x2), i.e. L has negative slope m
– Outside the two axes: directly proportional (x1 ∝ x2), i.e. L has positive slope m (m ≠ 1)
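The link between the slope of L: x2 = m*x1 + b and the position of the intersection point can be derived directly. The sketch assumes the x1 axis is at horizontal position 0 and the x2 axis at position d; the symbols X, Y, and d are introduced here and do not appear on the slide.

```latex
% The point (x_1, m x_1 + b) becomes the line through (0, x_1) and (d, m x_1 + b):
\[
  Y(X) = x_1 + \frac{X}{d}\,\bigl((m - 1)\,x_1 + b\bigr)
       = x_1\Bigl(1 + \frac{X}{d}(m - 1)\Bigr) + \frac{X}{d}\,b .
\]
% The common point is where Y no longer depends on x_1:
\[
  1 + \frac{X^{*}}{d}(m - 1) = 0
  \quad\Longrightarrow\quad
  X^{*} = \frac{d}{1 - m}, \qquad Y^{*} = \frac{b}{1 - m}.
\]
% Location of X^* by slope:
\[
  m < 0 \;\Rightarrow\; 0 < X^{*} < d, \qquad
  0 < m < 1 \;\Rightarrow\; X^{*} > d, \qquad
  m > 1 \;\Rightarrow\; X^{*} < 0 .
\]
```

For m = 1 the mapped lines are parallel and there is no finite intersection point.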

Group 5 (cont’d)

Application example
– Aircraft collision checking
– Converting the problem into detecting a four-dimensional geometric intersection
– Collision at (2,2,2,1)
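The reduction to a four-dimensional intersection can be sketched for straight-line, constant-velocity flight. Two trajectories collide when they share the same position at the same time. The flight data below is hypothetical and chosen to meet at position (2, 2, 2) at time 1. Reading the slide's tuple as (x, y, z, t) is an assumption, and this is not the parallel-coordinates method of the paper.

```python
def position(start, velocity, t):
    """Position at time t for straight-line, constant-velocity motion."""
    return tuple(s + v * t for s, v in zip(start, velocity))


def collision(a, b, tol=1e-9):
    """Return (x, y, z, t) where trajectories a and b meet, else None.
    Each trajectory is a (start, velocity) pair of 3-tuples."""
    (pa, va), (pb, vb) = a, b
    t = None
    for p1, v1, p2, v2 in zip(pa, va, pb, vb):
        dp, dv = p2 - p1, v1 - v2  # need dv * t == dp on every axis
        if abs(dv) <= tol:
            if abs(dp) > tol:
                return None        # fixed gap on this axis: never equal
            continue               # always equal on this axis
        ti = dp / dv
        if t is None:
            t = ti
        elif abs(ti - t) > tol:
            return None            # axes coincide at different times
    if t is None:
        t = 0.0                    # identical trajectories: coincide at once
    if t < 0:
        return None                # the meeting lies in the past
    return position(pa, va, t) + (t,)


# Hypothetical flights, constructed to meet at (2, 2, 2) when t = 1.
plane_a = ((0, 0, 2), (2, 2, 0))
plane_b = ((4, 2, 2), (-2, 0, 0))
print(collision(plane_a, plane_b))
```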

Group 5 (cont’d)

Application example:
– Economic model of a real country
– 8 variables:
  • Agriculture
  • Fishing
  • Mining
  • Manufacturing
  • Construction
  • Government
  • Miscellaneous
  • GNP

Group 5 (cont’d)

A Least Squares function defines the boundary region in 8-dimensional space

Any point (drawn as a polygonal line) inside the boundary represents a feasible economic policy for the country

Group 5 (cont’d)

Discoveries:
– No policy would favor Agriculture without also favoring Fishing: (x1 ∝ x2)
– Inverse relationship between Fishing and Mining (resource competition): (x1 ∝ 1/x2)

Notes on the References

Inselberg’s paper:
– 11 citations found on ResearchIndex
– Applications in knowledge discovery, user interface, aircraft design, etc.

Ted Mihalisin’s paper:
– Only one citation found

Contribution

Inselberg’s paper:
– Transforms MDMV hyperspace relations into a 2-D geometric pattern problem
– Empirical studies demonstrated its ability to support trade-off analysis, sensitivity discovery, and optimization

Mihalisin’s paper:
– Hierarchical technique for visualizing data points greatly exceeding the number of pixels

Critique

Inselberg’s paper:
– No comparison with other MDMV techniques
– No examples supporting the claim that displayed objects can be recognized under projective transformations

Mihalisin’s paper:
– Limited number of values for each variable visualized in one display
– No discussion of potential information loss with a coarse-grained grid

Favorite Sentence

“You can’t be unlucky all the time!”
– Multiple techniques exist for the MDMV visualization problem
– Each has strengths and weaknesses
– Whichever you start with, you can’t be unlucky all the time!
– Integration and collaboration of existing tools remain active research topics.