© Silvia Miksch
Part 1
Geometric Techniques
Scatterplots, Parallel Coordinates, ...

Geometric Techniques
• Basic Idea
  – Visualization of Geometric Transformations and Projections of the Data
• Scatterplots [Cleveland 1993]
• Parallel Coordinates [Inselberg 1985/1990]
• Prosection Views [Spence 95]
• Landscape [Wise, et al. 1995]
• ThemeRiver [Havre, et al 2000]
• Hyperslice [van Wijk, et al 1993]
[Keim, 2001]

Basic Idea: Scatterplots
• Visualizes a Relation (Correlation) between two Variables X and Y
  – e.g., weight and height
• Individual Data Points are Represented
  – in 2D
    • where axes represent the variables
    • X on the horizontal axis
    • Y on the vertical axis
  – in 3D
  – in ...
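The relation a scatterplot shows visually can also be summarized numerically. The sketch below is one possible illustration, not part of the original slides: it pairs height and weight values into 2D points (X horizontal, Y vertical) and computes the Pearson correlation coefficient. The data values and the function name `pearson_r` are invented for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)

# Hypothetical sample: X = height (cm), Y = weight (kg).
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 88]

# Each data item becomes one point (x, y) in the 2D scatterplot.
points = list(zip(heights_cm, weights_kg))
r = pearson_r(heights_cm, weights_kg)
```

A value of `r` close to +1 or -1 corresponds to points lying near a rising or falling straight line; a value near 0 means no linear relationship, although a nonlinear one may still be visible in the plot.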

Example: Scatter Plot
House data: Price and Number of bedrooms
[Figure: scatterplot with axes Price (£), 50K to 300K, and Number of Bedrooms, 1 to 6]
User can identify global trends, local trade-offs, and outliers.

Examples: Scatterplots (1/3)
• No relationship
• Strong linear (positive correlation)
• Strong linear (negative correlation)
• Exact linear (positive correlation)

Examples: Scatterplots (2/3)
• Quadratic relationship
• Exponential relationship
• Sinusoidal relationship (damped)
• Outlier

Examples: Scatterplots (3/3)
• Variation of Y doesn't depend on X (homoscedastic)
• Variation of Y does depend on X (heteroscedastic)

Scatterplot - Conditioning Plot
One limitation of the scatterplot matrix is that it cannot show interaction effects with another variable.
Purpose: Check the pairwise relationship between two variables conditional on a third variable
Example: torque versus time, conditional on temp
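Conditioning on a third variable amounts to splitting the data by that variable and drawing one scatterplot panel per group. The following is a minimal sketch of that grouping step only; the record layout, the numeric values, and the function name `condition_on` are assumptions made for illustration.

```python
from collections import defaultdict

def condition_on(records, x_key, y_key, z_key):
    """Group (x, y) pairs by the value of the conditioning variable z."""
    panels = defaultdict(list)
    for rec in records:
        panels[rec[z_key]].append((rec[x_key], rec[y_key]))
    return dict(panels)

# Hypothetical measurements: torque over time at two temperatures.
records = [
    {"time": 1, "torque": 10.0, "temp": 20},
    {"time": 2, "torque": 12.0, "temp": 20},
    {"time": 1, "torque": 15.0, "temp": 40},
    {"time": 2, "torque": 14.0, "temp": 40},
]

# One list of (time, torque) points per temp value, i.e. one panel each.
panels = condition_on(records, "time", "torque", "temp")
```

For a continuous conditioning variable, the values would first have to be binned into intervals; that step is omitted here.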

3D Data in the Box
3D Data Set of 50 Observations in the Box
Scatterplot Matrix of all pairwise Scatterplots

Example: Cars
[Becker & Cleveland, 1996]

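Example: Cars - Scatterplots
• m x m scatterplots in the full matrix
• diagonal = same variable on both axes, leaving (m^2 - m) plots
• left and right of the diagonal the same, leaving (m^2 - m)/2 distinct plots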
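The counts above can be checked by enumerating the variable pairs directly. This is a small sketch; the variable names are placeholders and are not taken from the cars data set.

```python
from itertools import combinations

def distinct_panels(variables):
    """Unordered pairs of different variables: one scatterplot each."""
    return list(combinations(variables, 2))

# Placeholder attribute names, assumed for illustration only.
variables = ["mpg", "weight", "horsepower", "displacement"]
m = len(variables)

full_matrix = m * m              # every cell of the m x m matrix
off_diagonal = m * m - m         # drop the m cells pairing a variable with itself
panels = distinct_panels(variables)

# Mirrored cells (x vs. y and y vs. x) show the same relation.
assert len(panels) == off_diagonal // 2
```

With m = 4 this gives 16 cells, 12 off the diagonal, and 6 distinct scatterplots.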

Example 2 - Cars - Scatterplot

3D Scatterplot plus Color

Scatterplot & SDOF (1/2)

Scatterplot & SDOF (2/2)

Basic Idea: Parallel Coordinates
• Assigns one Vertical Axis to each Variable
  – Evenly spaces these axes horizontally
  – Contrast: Traditional Cartesian Coordinates
    • All axes are mutually perpendicular
• Layout: k Parallel Axes
• Axes scaled to [min, max]
  – Scaling individually for each variable
• Polygonal Line
  – Every data item corresponds to a polygonal line
  – The line intersects each axis at the point that corresponds to the item's value for that attribute
[Inselberg and Dimsdale, 1990]
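The layout described above can be sketched as a function that maps one data item to the vertices of its polygonal line. This is an illustrative sketch, not the method of any particular tool: the function name, the `width` and `height` parameters, and the second and third data items are assumptions. The first item reuses the 6-dimensional point (-5, 3, 4, -2, 0, 1) that appears in a later slide.

```python
def polyline_vertices(item, mins, maxs, width=1.0, height=1.0):
    """Map a k-dimensional item to k (x, y) vertices on k parallel axes.

    Axis i sits at x = i * width / (k - 1); each axis is scaled
    individually so that mins[i] maps to 0 and maxs[i] maps to height.
    """
    k = len(item)
    if k < 2:
        raise ValueError("parallel coordinates need at least two axes")
    spacing = width / (k - 1)
    vertices = []
    for i, (value, lo, hi) in enumerate(zip(item, mins, maxs)):
        # A constant attribute has no range; place it mid-axis.
        t = 0.5 if hi == lo else (value - lo) / (hi - lo)
        vertices.append((i * spacing, t * height))
    return vertices

data = [
    (-5, 3, 4, -2, 0, 1),   # point shown in the slides
    (5, 0, 2, 2, 4, -1),    # invented
    (0, 6, 0, 0, 2, 3),     # invented
]
# Per-attribute [min, max] over the data set.
mins = [min(col) for col in zip(*data)]
maxs = [max(col) for col in zip(*data)]

line = polyline_vertices(data[0], mins, maxs, width=5.0)
```

Connecting consecutive vertices in `line` with straight segments gives the polygonal line for that item.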

Parallel Coordinates
[Inselberg and Dimsdale, 1990]

Parallel Coordinates

Parallel Coordinates – Basic
• 6-dim. point with coordinates (-5, 3, 4, -2, 0, 1)^T
• one line: point in PC
• one circle:

Visualization of Correlation
• Discover the Correlation

Problems with Parallel Coord.
• Polygons Need too Much Space

Color in Parallel Coordinates

Hierarchical Parallel Coord.

Example: Cars - Parallel Coord.

Parallel Coordinates
• Demo Programs:
  – Parallel Coordinates Visualization Applet
    http://csgrad.cs.vt.edu/~agoel/parallel_coordinates/

Benefits and Limitations
• Benefits
  – Represent data with more than three dimensions
  – Opportunities for human pattern recognition
  – Flexibility: each coordinate can be individually scaled
  – Zooming in or out: effectively brushing out or eliminating portions of the data set
• Limitations
  – As the number of dimensions increases, the axes come closer to each other, making it more difficult to perceive patterns

Prosection Views
• Similar to Scatterplots
• m-dim Data Sets
• Operators
  – Projections
  – Selections
• Color Coding
  – customer’s requirements (different limits)
    • yes: red or green
    • no: black, dark gray, light gray, and white
[Spence, et al. 1995]

The Prosection Matrix
Design of a chair seat
[Figure: Area-Thickness plots with regions labelled too heavy, too uncomfortable, too flexible, too large]
A design is represented by a point in Area-Thickness space
Various performance limits restrict the range of possible designs
[© 2001 Robert Spence]

The Prosection Matrix
Problem: we don’t know where the green area is located
[Figure: Area-Thickness plot]
Moreover, there are typically many parameters (not 2) and many performance limits (not 2)
Solution? Either iterative search (human, automated or mixed) or generation of data to visualise.
[© 2001 Robert Spence]

The Prosection Matrix [ © 2001 Robert Spence]

Color Coding
[Figure: Par 1 versus Par 2, with Upper and Lower Limits for S1 and S2 and a Tolerance Region]
• Satisfied all limits
• Satisfied all the performance limits, but outside one parameter limit = not manufactured
• Fail one or more performance limits, but manufactured
• Fail one performance limit, but manufactured
• Fail one or more performance limits, not manufactured
Parameter limits vs Performance limits

The Prosection Matrix
• A Prosection: Projection of a section
[© 2001 Robert Spence]
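Read literally, "projection of a section" combines the two operators named for prosection views: first select the items that fall inside a section (a range on some of the dimensions), then project the survivors onto two dimensions. The sketch below illustrates that reading under stated assumptions; the function name, the tuple layout, and the sample designs are invented.

```python
def prosection(points, axes, section):
    """Project onto two axes only those points that lie inside a section.

    points  : list of m-dimensional tuples
    axes    : (i, j) indices of the two dimensions to plot
    section : {dimension index: (low, high)} ranges on other dimensions
    """
    i, j = axes
    view = []
    for p in points:
        # Selection: keep the point only if it is inside every range.
        if all(lo <= p[d] <= hi for d, (lo, hi) in section.items()):
            # Projection: drop all dimensions except i and j.
            view.append((p[i], p[j]))
    return view

# Hypothetical designs as (area, thickness, density) tuples.
designs = [
    (1.0, 2.0, 0.5),
    (1.5, 2.5, 0.9),
    (2.0, 1.0, 0.4),
    (2.5, 3.0, 0.1),
]

# Area-Thickness view of the designs whose density lies in [0.3, 0.6].
view = prosection(designs, axes=(0, 1), section={2: (0.3, 0.6)})
```

A prosection matrix would repeat this for every pair of parameters, each time using ranges on the remaining parameters as the section.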

The Prosection Matrix
Prosection Matrix for the lamp design
Tolerances on parameter values
A difficult cognitive problem is eased by a simple perceptual task
[© 2001 Robert Spence]

The Prosection Matrix [ © 2001 Robert Spence]
[Figure labels: Raw Data; Model; Parameters; Performances; Selection; Encoding; Presentation; User; Interaction; Customer’s Performance Requirements; Visualization tool designer]
The visualization tool (e.g., Influence Explorer) designer must take into account the need of the user to specify the model, the exploratory range of parameter values and the customer’s performance specifications, as well as the selection, encoding and presentation of data.

Landscape
Data needs to be transformed into a (possibly artificial) 2D spatial representation which preserves the characteristics of the data
[Wise, et al. 1995]

ThemeRiver:
Visualizing Theme Changes over Time
Susan Havre, Beth Hetzler, and Lucy Nowell
Battelle Pacific Northwest Division, Washington, USA
Applications I: Document Visualization
IEEE Symposium on Information Visualization - InfoVis 2000
Excursus

InfoVis 2000: Facts'n'Figures
• October 9-10, 2000
• Salt Lake City, Utah, USA
• Annual Conference/Symposium
  – 6th
• Parent Conference
  – IEEE Visualization 2000 (11th)
• Proceedings:
  – CD Rom
  – IEEE Computer Society, Los Alamitos, CA

IEEE Visualization 2000
• Annual Conference
• 11th

Types of Papers at InfoVis
• Keynote Address
  – Jock D. Mackinlay, University of Aarhus, Denmark: Presentation, Visualization, What's Next
    • Coining the term InfoVis
    • Visual Data Mining
    • “Readings in InfoVis”
• 20 Papers - 5 Sessions
• 6 Papers - Late Breaking Hot Topics
• Capstone Address
  – Nahum Gershon, MITRE: Visual Storytelling - Where Technology and Culture Meet

Session Topics
• Visual Querying and Data Exploration
• Graphs and Hierarchies
• Taxonomies, Frameworks, and Methodology
• Applications I:
  – Document Visualization, Collaborative Visualization, Techniques
• Applications II:
  – Algorithm Visualization, 3D Navigation

ThemeRiver:
Visualizing Theme Changes over Time
Susan Havre, Beth Hetzler, and Lucy Nowell
Battelle Pacific Northwest Division, Washington, USA
Applications I: Document Visualization

Idea
• A Large Collection of Documents
• Theme Changes
• River Metaphor
  – “helps users to identify time-related patterns, trends, and relationships across a large collection of documents”
• A Prototype System
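One way to picture the river metaphor is as a stack of theme "currents" whose widths follow the theme strength at each time step. The sketch below assumes the currents are stacked symmetrically around a central horizontal axis and joined without smoothing. It is an illustration only and does not reproduce the prototype's drawing algorithms, which the slides do not specify. The theme names, counts, and the function name `river_layers` are invented.

```python
def river_layers(theme_counts):
    """Compute (lower, upper) boundaries of each current per time step.

    theme_counts maps a theme name to a list of per-time-step strengths.
    At every time step the river is centred on y = 0 and its total width
    equals the summed strength of all themes.
    """
    themes = list(theme_counts)
    steps = len(theme_counts[themes[0]])
    layers = {theme: [] for theme in themes}
    for s in range(steps):
        total = sum(theme_counts[theme][s] for theme in themes)
        lower = -total / 2          # bottom bank of the river
        for theme in themes:
            upper = lower + theme_counts[theme][s]
            layers[theme].append((lower, upper))
            lower = upper           # next current stacks on top
    return layers

# Hypothetical document counts per theme over three time steps.
counts = {
    "oil":   [2, 4, 6],
    "sugar": [4, 4, 2],
    "trade": [2, 0, 4],
}
layers = river_layers(counts)
```

A current narrows where its theme is mentioned less often and vanishes where the count is zero, as "trade" does at the second time step.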

ThemeRiver™
Data set: collection of speeches, interviews, articles, and other text associated with Fidel Castro

Histograms
Data set: collection of speeches, interviews, articles, and other text associated with Fidel Castro

User Interactions
• Display Topic and Event Labels
• Display Time and Event Grid Lines
• Display the Raw Data Points
• Choose Among Drawing Algorithms for the Currents and River
• Pan and Zoom
  – Other Time Periods or Parts of the River
  – More Detail or Broader Context

Usability Evaluation
• 2 Users & Questions:
  – Do users understand the metaphor?
  – Can they identify themes that are more often discussed?
  – Does the visualization help them raise new questions about the data?
  – Do they interpret details of the visualization in ways we had not expected?
  – How does their interpretation of the visualization differ from that of a histogram showing the same data?

Evaluation Results
• ThemeRiver Easy to Understand
• Useful + / -
  + River Metaphor
  + Abstraction to the Whole Collection
  + Identifying Macro Trends
  - Identifying Minor Trends

Improvements
• Features of the Histogram
  – Seeing Numeric Values (on demand)
  – Total Number of Documents
• Features to Access the Documents
• User-Defined Ordering
  – Reorder the Theme Currents
  – Ordering by Correlation
• Parallel Rivers

Impr.: Features of the Histogram
Data set: 1990 Associated Press (AP) newswire data from the TREC5 distribution disks, a set of over 100,000 documents

Impr.: Parallel Rivers
Data set: compare 1990 Associated Press (AP) data from Washington, D.C. and New York from the same time period

Color Family
Tracking related themes is simplified by assigning them to the same color family. This ensures related themes appear together and are identifiable as a group.

Conclusion
• River Metaphors
• Perception Principles [Ware 2000]
• Improvements Needed
  – Event Time Line - Automatically
  – Selecting and Ordering of Theme Currents
  – More Information/Data on Demand

Hyperslices
• (m^2 - m)/2 2D slices
• Operator
  – Selection
[van Wijk, et al 1993]
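The slide gives only the slice count and the selection operator. As a rough sketch, assuming each slice is an axis-aligned 2D cut through a chosen focal point of a scalar function of m variables, the slices could be sampled as follows. The function names, the sampling grid, and the test function are assumptions for illustration and are not taken from the cited paper.

```python
from itertools import combinations

def hyperslices(f, focus, radius, steps):
    """Sample one 2D slice of f for every pair of dimensions.

    For the pair (i, j), coordinates i and j vary over
    [focus - radius, focus + radius]; all other coordinates stay
    fixed at the focal point (the selection).
    """
    m = len(focus)
    slices = {}
    for i, j in combinations(range(m), 2):
        grid = []
        for a in range(steps):
            row = []
            for b in range(steps):
                x = list(focus)
                x[i] = focus[i] - radius + 2 * radius * a / (steps - 1)
                x[j] = focus[j] - radius + 2 * radius * b / (steps - 1)
                row.append(f(x))
            grid.append(row)
        slices[(i, j)] = grid
    return slices

def squared_norm(x):
    """Simple test function: sum of squared coordinates."""
    return sum(v * v for v in x)

# m = 3 dimensions gives (3^2 - 3) / 2 = 3 slices.
slices = hyperslices(squared_norm, focus=(0.0, 0.0, 0.0), radius=1.0, steps=3)
```

Moving the focal point changes every slice at once, which is how a selection in one view affects all the others in this reading.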