Graphical Examination of Data
![Page 2: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/2.jpg)
Sources
H. Anderson, T. Black: Multivariate Data Analysis (5th ed., pp. 40-46).
Yi-tzuu Chien: Interactive Pattern Recognition (Chapter 3.4).
S. Mustonen: Tilastolliset monimuuttujamenetelmät (Chapter 1, Helsinki 1995).
![Page 3: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/3.jpg)
Agenda
- Examining one variable
- Examining the relationship between two variables
- 3D visualization
- Visualizing multidimensional data
![Page 4: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/4.jpg)
Examining one variable
- Histogram
  - Represents the frequency of occurrences within data categories, where a category is
    - one value (for a discrete variable)
    - an interval (for a continuous variable)
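The two kinds of category can be sketched in a few lines of Python. This is only an illustration: the function names, the made-up data, and the choice of equal-width intervals are assumptions, not something the slides prescribe.

```python
from collections import Counter

def discrete_histogram(values):
    """Count how often each distinct value occurs (discrete variable)."""
    return dict(sorted(Counter(values).items()))

def interval_histogram(values, low, high, n_bins):
    """Count values per equal-width interval (continuous variable).

    Assumes every value lies in [low, high].
    """
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == high falls into the last interval.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts

children = [0, 2, 1, 2, 3, 2, 1, 0, 2]    # made-up discrete data
ages = [1, 12, 25, 25, 49]                # made-up continuous data

print(discrete_histogram(children))
# Draw a text histogram: one row per interval, one star per observation.
for i, count in enumerate(interval_histogram(ages, 0, 50, 5)):
    print(f"{10 * i:2d}-{10 * (i + 1):2d} | {'*' * count}")
```

The bar heights of a drawn histogram are exactly these counts.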
![Page 5: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/5.jpg)
Examining one variable
- Stem and leaf diagram (A&B)
  - Presents the same graphical information as a histogram
  - Also provides an enumeration of the actual data values
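A minimal sketch of how such a diagram can be built, assuming non-negative integers with the tens as stems and the last digit as leaf. The function name and the sample data are made up for illustration.

```python
def stem_and_leaf(values):
    """Map each stem (value // 10) to the sorted list of its leaves (value % 10)."""
    ordered = sorted(values)
    # Create every stem between the smallest and largest one, so that empty
    # stems stay visible and the outline matches a histogram.
    stems = {stem: [] for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1)}
    for v in ordered:
        stems[v // 10].append(v % 10)
    return stems

scores = [23, 35, 31, 47, 28, 35, 52, 39, 30]   # made-up data
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

The length of each row plays the role of a histogram bar, while the digits keep the original values readable.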
![Page 6: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/6.jpg)
Examining the relationship between two variables
- Scatterplot
  - Shows the relationship between two variables
  - The relationship can be linear, non-linear, or absent (no correlation)
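One reason to look at the plot and not only at a summary number: the Pearson correlation coefficient measures linear association only. The sketch below uses made-up data and a hand-written `pearson_r` helper (an illustrative name) to show a perfectly linear case and a purely quadratic case whose coefficient is zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]      # linear relationship
quadratic = [x * x for x in xs]       # non-linear relationship

print(pearson_r(xs, linear))
print(pearson_r(xs, quadratic))
```

In the quadratic case y is fully determined by x, yet the coefficient vanishes because the data are symmetric around x = 0. A scatterplot shows the curve immediately.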
![Page 7: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/7.jpg)
Examining the relationship between two variables
- Boxplot (according to A&B)
  - Represents the distribution of the data
  - Shows:
    - the middle 50% of the distribution
    - the median (its position within the box indicates skewness)
    - whiskers
    - outliers
    - extreme values
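The quantities listed above can be computed as in the following sketch. The thresholds are an assumption: it uses the common Tukey-style fences (1.5 and 3 times the interquartile range), which may differ from the exact definition in A&B. The function name and data are made up.

```python
from statistics import quantiles

def boxplot_summary(values, whisker=1.5, extreme=3.0):
    """Box, whiskers, outliers and extreme values (Tukey-style fences, assumed)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                                   # height of the box
    inner_low, inner_high = q1 - whisker * iqr, q3 + whisker * iqr
    outer_low, outer_high = q1 - extreme * iqr, q3 + extreme * iqr
    inside = [v for v in values if inner_low <= v <= inner_high]
    outside = [v for v in values if v < inner_low or v > inner_high]
    extremes = [v for v in outside if v < outer_low or v > outer_high]
    outliers = [v for v in outside if v not in extremes]
    return {
        "q1": q1, "median": q2, "q3": q3,
        # Whiskers reach to the most distant points that are still inside the fences.
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
        "extremes": extremes,
    }

print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 30]))   # made-up data
```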
![Page 8: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/8.jpg)
3D visualization
- Good if there are just 3 variables
- Mustonen: “Problems will arise when we should show lots of dimensions at the same time. Spinning 3D-images or stereo image pairs give us no help with them.”
![Page 9: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/9.jpg)
Visualizing multidimensional data
- Scatterplot with varying dots
- Scatterplot matrix
- Multivariate profiles
- Star picture
- Andrews’ Fourier transformations
- Metroglyphs (Anderson)
- Chernoff’s faces
![Page 10: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/10.jpg)
Scatterplot
- Two variables for the x- and y-axis
- Other variables can be represented by
  - dot size, square size
  - height of rectangle
  - width of rectangle
  - color
![Page 11: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/11.jpg)
Scatterplot matrix
- Also known as the draftsman’s display
- Histograms on the diagonal
- Scatterplots in the lower portion
- Correlations in the upper portion
![Page 12: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/12.jpg)
Scatterplot matrix (cont…)
[Figure: scatterplot matrix with the correlations, histograms, and scatterplots labelled]
![Page 13: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/13.jpg)
Scatterplot matrix (cont…)
- Shows the relationship between each pair of variables
- Does not determine the joint distribution exactly
- A good way to get acquainted with new data
- Helps in finding variable transformations
![Page 14: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/14.jpg)
Scatterplot matrix as rasterplot
- Color level represents the value
  - e.g. values are mapped to gray levels 0-255
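A possible mapping, assuming a linear min-max scaling in which the smallest value becomes 0 and the largest becomes 255. The slides do not state which mapping was used, and the function name is made up.

```python
def to_gray_levels(values):
    """Map values linearly to integer gray levels 0-255 (min-max scaling, assumed)."""
    low, high = min(values), max(values)
    if high == low:
        # A constant variable carries no contrast; use one level for all values.
        return [0 for _ in values]
    return [round(255 * (v - low) / (high - low)) for v in values]

print(to_gray_levels([10, 12, 16, 20]))   # made-up data
```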
![Page 15: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/15.jpg)
Multivariate profiles
- A&B: “The objective of the multivariate profiles is to portray the data in a manner that enables each identification of differences and similarities.”
- Line diagram
  - Variables on the x-axis
  - Scaled (or mapped) values on the y-axis
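Scaling matters because variables measured in different units would otherwise not fit on one y-axis. The sketch below standardizes each variable to z-scores, which is one common choice and an assumption here. The function name and data are made up, and every variable is assumed to vary (non-zero standard deviation).

```python
from statistics import mean, pstdev

def standardize_columns(rows):
    """Turn each column into z-scores so that all variables share one y-axis.

    rows: list of measurements, each a list with one value per variable.
    """
    columns = list(zip(*rows))
    centers = [mean(c) for c in columns]
    spreads = [pstdev(c) for c in columns]
    return [
        [(v - m) / s for v, m, s in zip(row, centers, spreads)]
        for row in rows
    ]

rows = [[1, 100], [2, 300], [3, 200]]   # made-up data: 3 measurements, 2 variables
for profile in standardize_columns(rows):
    # Each profile is one line: x = variable index, y = scaled value.
    print([round(value, 2) for value in profile])
```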
![Page 16: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/16.jpg)
Multivariate profiles (cont…)
A separate diagram for each measurement (or group of measurements)
![Page 17: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/17.jpg)
Star picture
- Like a multivariate profile, but the values are drawn as rays from a central point instead of along the x-axis
- The rays are separated by a constant angle
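The geometry can be sketched as follows, assuming the values have already been scaled to [0, 1] and that the first ray points along the positive x-axis. Both are assumptions for illustration, as is the function name.

```python
from math import cos, sin, pi

def star_points(scaled_values):
    """Return the (x, y) end point of each ray for values scaled to [0, 1]."""
    step = 2 * pi / len(scaled_values)    # constant angle between neighbouring rays
    return [
        (value * cos(k * step), value * sin(k * step))
        for k, value in enumerate(scaled_values)
    ]

for x, y in star_points([1.0, 0.5, 1.0, 0.5]):   # made-up measurement
    print(f"({x:+.2f}, {y:+.2f})")
```

Connecting consecutive end points gives the star outline of one measurement.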
![Page 18: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/18.jpg)
Andrews’ Fourier transformations
- D.F. Andrews, 1972.
- Each measurement X = (X1, X2,..., Xp) is represented by the function below, where -π < t < π.

$$f_X(t) = \frac{X_1}{\sqrt{2}} + X_2 \sin(t) + X_3 \cos(t) + X_4 \sin(2t) + X_5 \cos(2t) + \dots$$
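A minimal sketch of evaluating this function for one measurement. The function name is an illustrative choice, and the code follows the standard form of Andrews' curve (first term divided by the square root of 2, then alternating sine and cosine terms of increasing frequency).

```python
from math import sin, cos, sqrt, pi

def andrews_curve(x, t):
    """Evaluate f_X(t) = X1/sqrt(2) + X2 sin t + X3 cos t + X4 sin 2t + X5 cos 2t + ..."""
    total = x[0] / sqrt(2)
    for k, value in enumerate(x[1:], start=1):
        harmonic = (k + 1) // 2                 # 1, 1, 2, 2, 3, 3, ...
        wave = sin if k % 2 == 1 else cos       # sine first, then cosine
        total += value * wave(harmonic * t)
    return total

measurement = [1.0, 2.0, 0.5, 3.0]              # made-up data, p = 4
# Sample the curve at a few points of the interval (-pi, pi).
for i in range(-2, 3):
    t = i * pi / 3
    print(f"t = {t:+.2f}  f(t) = {andrews_curve(measurement, t):+.3f}")
```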
![Page 19: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/19.jpg)
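Andrews’ Fourier transformations (cont…)
- If several measurements are put into the same diagram, the curves of similar measurements lie close to each other.
- The distance between two curves is proportional to the Euclidean distance between the measurements in p-dimensional space
- Variables should be ordered by importance: the first variables get the low-frequency terms, which are the easiest to see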
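The distance property can be checked numerically: the integral of the squared difference of two curves over (-π, π) equals π times the squared Euclidean distance of the two measurements. The sketch below uses a simple midpoint rule. The helper names and data are made up for illustration.

```python
from math import sin, cos, sqrt, pi, dist

def curve(x, t):
    """Andrews curve of measurement x at t, written as a sum over basis terms."""
    terms = [1 / sqrt(2)]
    for h in range(1, len(x)):
        terms += [sin(h * t), cos(h * t)]
    # zip stops at len(x), so surplus basis terms are ignored.
    return sum(value * term for value, term in zip(x, terms))

def squared_curve_distance(x, y, steps=2000):
    """Midpoint-rule integral of (f_x(t) - f_y(t))^2 over (-pi, pi)."""
    width = 2 * pi / steps
    total = 0.0
    for i in range(steps):
        t = -pi + (i + 0.5) * width
        total += (curve(x, t) - curve(y, t)) ** 2
    return total * width

a = [1.0, 2.0, 0.5, 3.0]    # made-up measurements
b = [2.0, 0.0, 1.5, 1.0]
print(squared_curve_distance(a, b))
print(pi * dist(a, b) ** 2)
```

Both printed numbers agree up to rounding, which is why curves that lie close together indicate measurements that are close together.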
![Page 20: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/20.jpg)
Andrews’ Fourier transformations (cont…)
![Page 21: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/21.jpg)
Andrews’ Fourier transformations (cont…)
- Can also be drawn using polar coordinates
![Page 22: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/22.jpg)
Metroglyphs (Anderson)
- Each data vector (X) is symbolically represented by a metroglyph
- A metroglyph consists of a circle and a set of h rays, one for each of the h variables of X
- The length of a ray represents the value of the variable
![Page 23: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/23.jpg)
Metroglyphs (cont...)
- Normally the rays should be placed at positions that are easy to visualize and remember
- The rays can also be slanted in the same direction
  - this works better if there is a large number of metroglyphs
![Page 24: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/24.jpg)
Metroglyphs (cont...)
- Theoretically there is no limit to the number of rays
- In practice, the human eye works most efficiently with no more than 3-7 rays
- Metroglyphs can be placed in a scatter diagram, so that two variables are shown by position and two rays are no longer needed
![Page 25: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/25.jpg)
Chernoff’s faces
- H. Chernoff, 1973
- Based on the idea that people can detect and remember faces very well
- Variables determine the face features with a linear transformation
- Mustonen: “Funny idea, but not used in practice.”
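The linear transformation can be sketched as mapping each variable from its observed range onto the allowed range of one face feature. Everything specific here is hypothetical: the feature ranges, the variable names, and the assignment of variables to features are invented for illustration and are not taken from Chernoff's paper.

```python
def linear_map(value, source, target):
    """Map value linearly from the range source=(low, high) to target=(low, high)."""
    (s_low, s_high), (t_low, t_high) = source, target
    return t_low + (value - s_low) * (t_high - t_low) / (s_high - s_low)

# Hypothetical feature ranges, chosen only for this example.
FEATURE_RANGES = {
    "length_of_nose": (0.1, 0.5),
    "curvature_of_mouth": (-1.0, 1.0),
    "size_of_eyes": (0.05, 0.25),
}

def face_parameters(observation, variable_ranges, assignment):
    """Compute one face's feature values from one observation."""
    return {
        feature: linear_map(observation[var], variable_ranges[var], FEATURE_RANGES[feature])
        for var, feature in assignment.items()
    }

# Made-up observation with made-up variable ranges.
observation = {"income": 50, "age": 20, "score": 10}
variable_ranges = {"income": (0, 100), "age": (20, 60), "score": (0, 10)}
assignment = {
    "income": "length_of_nose",
    "age": "curvature_of_mouth",
    "score": "size_of_eyes",
}
print(face_parameters(observation, variable_ranges, assignment))
```

Which variable drives which feature is a free choice, and the resulting faces can look quite different under another assignment.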
![Page 26: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/26.jpg)
Chernoff’s faces (cont…)
- Originally 18 features
  - Radius to corner of face OP
  - Angle of OP to horizontal
  - Vertical size of face OU
  - Eccentricity of upper face
  - Eccentricity of lower face
  - Length of nose
  - Vertical position of mouth
  - Curvature of mouth 1/R
  - Width of mouth
![Page 27: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/27.jpg)
Chernoff’s faces (cont…)
- Face features (cont…)
  - Vertical position of eyes
  - Separation of eyes
  - Slant of eyes
  - Eccentricity of eyes
  - Size of eyes
  - Position of pupils
  - Vertical position of eyebrows
  - Slant of eyebrows
  - Size of eyebrows
![Page 28: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/28.jpg)
Chernoff’s faces (cont…)
![Page 29: Graphical Examination of Data](https://reader035.fdocuments.us/reader035/viewer/2022081515/56815e80550346895dcd0ce3/html5/thumbnails/29.jpg)
Conclusion
- Graphical examination makes the relationships between variables easier to understand
- Mustonen: “Even a badly designed image is easier to understand than a data matrix.”
- “A picture is worth a thousand words”