ENV 2006 4.1
Envisioning Information
Lecture 4 – Multivariate Data Exploration
Glyphs and other methods
Hierarchical approaches
Ken Brodlie
ENV 2006 4.2
Glyph Techniques
ENV 2006 4.3
Glyph Techniques
• Map data values to geometric and colour attributes of a glyph – or marker symbol
• Very many types of glyph have been suggested:
– Star glyphs
– Faces
– Arrows
– Sticks
– Shape coding
ENV 2006 4.4
Glyph Layouts
• How do we place the glyphs on a chart?
• Sometimes there will be a natural location – for example?
• If not… two of the variates can be allocated to spatial position, and the remainder to the attributes of the glyph
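The allocation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the variate names `a` to `d`, the sample rows, and the helper names `normalise` and `layout_glyphs` are hypothetical, and every variate is simply rescaled to the range 0 to 1 before use.

```python
def normalise(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]  # constant variate: place it mid-range
    return [(v - lo) / (hi - lo) for v in values]


def layout_glyphs(rows, x_var, y_var):
    """Allocate two variates to position and the remainder to glyph attributes."""
    names = list(rows[0].keys())
    scaled = {n: normalise([r[n] for r in rows]) for n in names}
    glyphs = []
    for i in range(len(rows)):
        glyphs.append({
            "x": scaled[x_var][i],
            "y": scaled[y_var][i],
            "attributes": {n: scaled[n][i] for n in names
                           if n not in (x_var, y_var)},
        })
    return glyphs


# Hypothetical observations with four variates each.
rows = [
    {"a": 1.0, "b": 10.0, "c": 5.0, "d": 0.0},
    {"a": 2.0, "b": 20.0, "c": 7.0, "d": 4.0},
    {"a": 3.0, "b": 30.0, "c": 9.0, "d": 2.0},
]
glyphs = layout_glyphs(rows, "a", "b")
```

Each entry of `glyphs` then carries a position on the chart plus the normalised values that a particular glyph type (star, face, stick figure) would turn into geometry or colour.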
ENV 2006 4.5
Glyph Techniques – Star Plots
• Each observation represented as a ‘star’
• Each spike represents a variable
• Length of spike indicates the value
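As a rough sketch of the three points above, the spike tips of one star can be computed as follows. The function name `star_glyph`, the equal angular spacing of the spikes, and the scaling of each value by a per-variable maximum are assumptions made for illustration; values are assumed non-negative.

```python
import math


def star_glyph(values, max_values, radius=1.0):
    """Return the spike tips (x, y) of a star glyph for one observation.

    Spike k points in direction 2*pi*k/n and its length is the value of
    variable k scaled by that variable's maximum (an assumed convention).
    """
    n = len(values)
    tips = []
    for k, (v, vmax) in enumerate(zip(values, max_values)):
        angle = 2.0 * math.pi * k / n
        length = radius * v / vmax
        tips.append((length * math.cos(angle), length * math.sin(angle)))
    return tips


# One hypothetical observation with four variables, each with maximum 4.
tips = star_glyph([2.0, 1.0, 4.0, 3.0], [4.0, 4.0, 4.0, 4.0])
```

Drawing a line from the origin to each tip gives the spikes; joining consecutive tips gives the closed star outline often seen in such plots.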
ENV 2006 4.6
Glyph Techniques – Star Plots
• Each observation represented as a ‘star’
• Each spike represents a variable
• Length of spike indicates the value
Crime in Detroit
ENV 2006 4.7
Star Glyphs – Iris Data Set
ENV 2006 4.8
Chernoff Faces
• Chernoff suggested using faces to encode a variety of variables
– can map to size, shape, colour of facial features
– human brain rapidly recognises faces
ENV 2006 4.9
Chernoff Faces
• Here are some of the facial features you can use
http://www.bradandkathy.com/software/faces.html
ENV 2006 4.10
Chernoff Faces
• Demonstration applet at:
– http://www.hesketh.com/schampeo/projects/Faces/
ENV 2006 4.11
Chernoff’s Face
• … and here is Chernoff’s face
http://www.fas.harvard.edu/~stats/People/Faculty/Herman_Chernoff/Herman_Chernoff_Index.html
ENV 2006 4.12
Stick Figures
• Glyph is a matchstick figure, with variables mapped to angle and length of limbs
• As with Chernoff faces, two variables are mapped to display axes
• Stick figures useful for very large data sets
• Texture patterns emerge
• Idea due to RM Pickett & G Grinstein
– different angles that may be varied are shown
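The mapping from variables to limb angles can be sketched as below. This is a simplified five-segment figure chosen for illustration, not necessarily the exact family of figures used by Pickett and Grinstein; the names `stick_figure` and `value_to_angle`, the fixed limb length, and the 0 to 180 degree range are all assumptions.

```python
import math


def value_to_angle(v, lo, hi, max_angle=180.0):
    """Map a data value in [lo, hi] linearly to an angle in degrees."""
    return max_angle * (v - lo) / (hi - lo)


def stick_figure(angles_deg, limb_length=1.0):
    """Return line segments ((x0, y0), (x1, y1)) for a five-segment figure.

    angles_deg holds five angles: the body first, then two limbs attached
    to the top end of the body and two attached to the bottom end.
    """
    body_angle, *limb_angles = angles_deg
    half = limb_length / 2.0
    dx = half * math.cos(math.radians(body_angle))
    dy = half * math.sin(math.radians(body_angle))
    top, bottom = (dx, dy), (-dx, -dy)
    segments = [(bottom, top)]
    for anchor, angle in zip((top, top, bottom, bottom), limb_angles):
        end = (anchor[0] + limb_length * math.cos(math.radians(angle)),
               anchor[1] + limb_length * math.sin(math.radians(angle)))
        segments.append((anchor, end))
    return segments


# Upright body with limbs pointing right and left at each end.
segments = stick_figure([90.0, 0.0, 180.0, 0.0, 180.0])
```

In a display, each figure would be translated to the position given by the two variables mapped to the axes; with many small figures packed together, the varying angles produce the texture patterns mentioned above.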
ENV 2006 4.13
Stick Figures
5D image data from Great Lakes region
ENV 2006 4.14
Shape Coding
• Suitable where a variable has a Boolean value, i.e. on/off
• A data item is represented as an array of elements, each element corresponding to a variable
• Shade in a box if the value of the corresponding variable is ‘on’ (boxes numbered 1 to 6 in the figure)
• Arrays laid out in a line, or plane, as with other icon-based methods
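A text-only sketch of this idea follows, assuming six Boolean variables laid out in a grid three boxes wide. The function name `shape_code` and the characters used for shaded and empty boxes are illustrative choices.

```python
def shape_code(flags, columns=3, on="#", off="."):
    """Render Boolean variables as a grid of boxes, shaded where 'on'."""
    cells = [on if f else off for f in flags]
    rows = [cells[i:i + columns] for i in range(0, len(cells), columns)]
    return "\n".join("".join(row) for row in rows)


# One hypothetical data item: variables 1, 3 and 6 are 'on'.
icon = shape_code([True, False, True, False, False, True])
print(icon)
# #.#
# ..#
```

Many such small arrays placed side by side, one per data item, give the line or plane layout described above.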
ENV 2006 4.15
Shape Coding
Time series of NASA earth observation data
ENV 2006 4.16
Daisy Charts
• Variables and their values placed around a circle
• Lines connect the values for one observation
Values shown around the circle: Dry, Wet, Showery; Saturday, Sunday; Leeds, Sahara, Amazon
This item is { wet, Saturday, Amazon }
http://www.daisy.co.uk
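The geometry of such a chart can be sketched as follows, using the values from the slide. The variable names `weather`, `day` and `place`, the equal spacing around the circle, and the choice to join every pair of values belonging to an observation are assumptions for illustration, not a description of the Daisy product.

```python
import math
from itertools import combinations


def daisy_positions(variables, radius=1.0):
    """Place every (variable, value) pair at an equally spaced point on a circle."""
    labels = [(var, val) for var, vals in variables.items() for val in vals]
    step = 2.0 * math.pi / len(labels)
    return {label: (radius * math.cos(i * step), radius * math.sin(i * step))
            for i, label in enumerate(labels)}


def observation_lines(observation, positions):
    """Return the line segments joining the values of one observation."""
    points = [positions[(var, val)] for var, val in observation.items()]
    return list(combinations(points, 2))


# Variable names are assumed; the values are those shown on the slide.
variables = {
    "weather": ["Dry", "Wet", "Showery"],
    "day": ["Saturday", "Sunday"],
    "place": ["Leeds", "Sahara", "Amazon"],
}
positions = daisy_positions(variables)
item = {"weather": "Wet", "day": "Saturday", "place": "Amazon"}
lines = observation_lines(item, positions)
```

Drawing `lines` for every observation makes frequently co-occurring values stand out as dense bundles between the corresponding points on the circle.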
ENV 2006 4.17
Daisy Charts - Underground Problems
ENV 2006 4.18
Daisy Charts – News Analysis
• Four variates: day, source, search terms, keywords
ENV 2006 4.19
Reducing Complexity in Multivariate Data Exploration
ENV 2006 4.20
Clustering as a Solution
• Success has been achieved through clustering of observations
• Hierarchical parallel co-ordinates
– Cluster by similarity
– Display using translucency and proximity-based colour
http://davis.wpi.edu/~xmdv/docs/vis99_HPC.pdf
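Clustering observations by similarity can be illustrated with a small agglomerative procedure. This is a minimal sketch and not the algorithm of the paper linked above: the centroid-distance merge rule, the function names, and the sample points are all assumptions. The `summarise` helper computes a mean and extent per dimension, which a display could draw as a line with a translucent band.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    """Mean of a list of points, dimension by dimension."""
    return [sum(c) / len(points) for c in zip(*points)]


def agglomerate(points, n_clusters):
    """Merge the two clusters with the closest centroids until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = euclidean(centroid([points[i] for i in clusters[a]]),
                              centroid([points[i] for i in clusters[b]]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def summarise(points, cluster):
    """Mean and per-dimension extent of one cluster of observations."""
    members = [points[i] for i in cluster]
    return {"mean": centroid(members),
            "low": [min(c) for c in zip(*members)],
            "high": [max(c) for c in zip(*members)]}


# Five hypothetical two-variable observations forming three obvious groups.
points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
clusters = agglomerate(points, 3)
summary = summarise(points, [0, 1])
```

Recording the order of the merges, rather than stopping at a fixed count, would give the full hierarchy from which a level of detail can be chosen.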
ENV 2006 4.21
Comparison
One of 3 clusters
ENV 2006 4.22
Hierarchical Parallel Co-ordinates
ENV 2006 4.23
Reduction of Dimensionality of Variable Space
• Reduce number of variables, preserve information
• Principal Component Analysis
– Transform to new co-ordinate system
– Hard to interpret
• Hierarchical reduction of variable space
– Cluster variables where distance between observations is typically small
– Choose representative for each cluster
• Subgroup has then been identified – showing what?
http://davis.wpi.edu/%7Exmdv/docs/vhdr_vissym.pdf
42 dimensions, 200 observations
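The hierarchical reduction of variable space described above (cluster similar variables, then choose a representative for each cluster) can be sketched as follows. This is an illustration under stated assumptions rather than the method of the linked paper: the distance between two variables is taken as the mean absolute difference of their normalised values over all observations, grouping is a simple greedy pass with a threshold, and the column names and data are hypothetical.

```python
def normalise(column):
    """Rescale one variable's values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]


def variable_distance(u, v):
    """Mean absolute difference between two normalised variables."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)


def cluster_variables(columns, threshold):
    """Greedily group variables that lie close to a cluster's first member."""
    scaled = {name: normalise(col) for name, col in columns.items()}
    clusters = []
    for name in scaled:
        for cluster in clusters:
            if variable_distance(scaled[name], scaled[cluster[0]]) < threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no close cluster found: start a new one
    return scaled, clusters


def representative(scaled, cluster):
    """Pick the member with the smallest total distance to the other members."""
    return min(cluster, key=lambda n: sum(
        variable_distance(scaled[n], scaled[m]) for m in cluster))


# Hypothetical data: two near-duplicate variables and one unrelated variable.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "age": [40.0, 20.0, 35.0, 25.0],
}
scaled, clusters = cluster_variables(columns, threshold=0.1)
reps = [representative(scaled, c) for c in clusters]
```

Here the two height variables fall into one cluster and only one of them is kept, so three variables reduce to two; with a two-member cluster the choice of representative is a tie and the first member is returned.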