Applied Multivariate Analysis
Introduction

Nature of Multivariate Analysis
►Typically exploratory, not confirmatory
►Often focused on simplification
►Often focused on revealing structure in dimensions that our eyes and imaginations don’t fully support.

Adequate Preparation?
►Basic course in statistical science
►STA 671
►SAS exposure
►Linear algebra (?)

Begin Reviewing and Reading
►Basic data steps in SAS
►Chapter 1 in AMD
►Chapter 2 in AMD
►We’ll begin with Chapter 4

Potential Topics Covered
►Principal Components Analysis (PCA)
►Factor Analysis (FA)
►Discriminant Analysis (DA)
►Multidimensional Scaling (MDS)
►Cluster Analysis (CA)
►Canonical Correlations Analysis (CCA)
►Multivariate Analysis of Variance (MANOVA)

Why Multivariate?
►Typically more than one measurement is taken on a given experimental unit
►Need to consider all the measurements together so that one can understand how they are related
►Need to consider all the measurements together so that one can extract essential structure

In Chromatography
[Figure: "Typical Chromatogram for PCB"; horizontal axis: Isomer/Congener, vertical axis: Percent of Substance]
The entire chromatogram is one observation.

In Neuroimaging
[Figure: "MRI Time Series for a Single Voxel"; horizontal axis: Time, vertical axis: Magnitude]
The entire time series is one observation.

In Social Science Research
• Education level
• Your opinion on welfare
• Your opinion on social security
• Your opinion on …
Together, the responses form one (joint) observation.

Distinguishing Midges
►Suppose we are interested in measuring the wing length and the antenna length.

Distinguishing Midges
►What can you do with both variables that you can’t do with just one of them?
[Figure: two scatterplots of Wing Length (vertical axis) against Antenna Length (horizontal axis) for Species Af and Species Apf, titled "Coordinate Axis Projections" and "Relabeling the Coordinates"]

Measuring Heads
►Are these data truly two-dimensional?
[Figure: "Scatterplot of LTN vs. LTG", shown twice, with LTG on the horizontal axis and LTN on the vertical axis; the second copy carries the annotation "Not the usual regression line …."]

Our Approach in STA 677
►Emphasize
  • Intuition
  • SAS
  • Geometry
  • Interpretations
  • Data Analysis
►De-emphasize
  • Theoretical basis
  • Formal proofs

Getting on the Computers Here
Logon ID    Name                        Password
beckap      "Patrick James Becka"       Your Social Security Number
bolglal     "Lori A. Bolgla"            Your Social Security Number
hartm       "Mary Klugh Hart"           Your Social Security Number
holmesh     "Heidi Harriman Holmes"     Your Social Security Number
houj        "Jiang Hou"                 Your Social Security Number
knottc      "Carrie Ann Knott"          Your Social Security Number
pearced     "Dennis Eugene Pearce"      Your Social Security Number
saphangtt   "Thatsaka Saphangthong"     Your Social Security Number
seeleym     "Matthew Kirk Seeley"       Your Social Security Number
wilsons     "Shea A Wilson"             Your Social Security Number
wuq         "Qun Wu"                    Your Social Security Number
yangs       "Shengming Yang"            Your Social Security Number

Personal SAS License
►Lorinda Wang
►[email protected]
►SStars Lab
►213d M I King 0039
►Phone 859 257-2204
►Fax 859 323-1266

Organizational Details
►Please get the textbook (required)
►Look at Readme.txt on the text CD
►Notes posted on the class website
►Take a look at the syllabus

Basic Vocabulary
►Variance
►Covariance
►Correlation
More than one kind of variability will emerge.
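For anyone who wants to see the three terms side by side, here is a minimal Python sketch. The course itself uses Excel and SAS, so Python and NumPy are a substitution, and the numbers are invented for illustration; they are not course data.

```python
import numpy as np

# Made-up measurements on four units (illustration only, NOT course data).
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

# Sample variance of each variable (ddof=1 divides by n - 1).
var_x = np.var(x, ddof=1)
var_y = np.var(y, ddof=1)

# Sample covariance: the off-diagonal entry of the 2x2 covariance matrix.
cov_xy = np.cov(x, y, ddof=1)[0, 1]

# Correlation is the covariance rescaled by the two standard deviations,
# so it always lies between -1 and 1.
corr_xy = cov_xy / np.sqrt(var_x * var_y)

print(var_x, var_y, cov_xy, corr_xy)
```

Variance describes each measurement on its own; covariance and correlation describe how the two move together, which is the kind of variability the later slides build on.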

Additional Vocabulary
►Eigenvalues
►Eigenvectors
►Projections
►Matrix Notation

Discovering Linear Combinations
►Log on to the computer in front of you and access our course web site.
►Find the data set helmet.xls and open it.
►Compute (.707*LTN)+(.707*LTG) (use Excel).
►What did you just do geometrically?
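One way to explore the geometric question outside Excel is the short Python sketch below. It assumes NumPy and uses invented (LTG, LTN) pairs, since the contents of helmet.xls are not reproduced here. Because .707 is about 1/sqrt(2), the weight vector has length very close to 1, so each computed score can be read as the position of a point along the 45-degree direction.

```python
import numpy as np

# Made-up (LTG, LTN) pairs for illustration only; NOT the helmet.xls values.
data = np.array([
    [130.0, 118.0],
    [138.0, 121.0],
    [142.0, 126.0],
    [150.0, 129.0],
])

# Weights from the exercise; the vector (.707, .707) has length ~1.
w = np.array([0.707, 0.707])

# The linear combination .707*LTG + .707*LTN, one score per row.
scores = data @ w

# Each score is the signed distance of the point along the direction w,
# so score * w is (up to the rounding in .707) the point's projection
# onto the line through the origin in that direction.
projected = np.outer(scores, w)

print(scores)
print(projected)
```

Every row of `projected` has equal coordinates, which is what lying on the 45-degree line means.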

Discovering Linear Combinations
[Figure: "Helmet Data" scatterplot with LTG on the horizontal axis and LTN on the vertical axis; legend entries: "Equal Wts On LTG, LTN" and "LTG is WTD > LTN"]

Discovery Exercise Continued
►Find the variance of LTN, LTG (use Excel).
►Find the variance of (.707*LTN)+(.707*LTG) (equal weights).
►Find the variance of (.50*LTN)+(.85*LTG) (unequal weights with LTG weighted more).
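A standard identity may help in interpreting these three computations. It is not stated on the slide, and a and b here simply stand for the two weights:

```latex
\operatorname{Var}(a\,\mathrm{LTN} + b\,\mathrm{LTG})
  = a^{2}\operatorname{Var}(\mathrm{LTN})
  + b^{2}\operatorname{Var}(\mathrm{LTG})
  + 2ab\,\operatorname{Cov}(\mathrm{LTN},\mathrm{LTG})
```

The variance of a combination therefore depends on the covariance between the two measurements as well as on their separate variances.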

Discovery Exercise
►What did you find and does it make sense?
  • Var(LTN) = 15.37
  • Var(LTG) = 31.84
  • Var(.707*LTN + .707*LTG) = 38.11
  • Var(.50*LTN + .85*LTG) = 39.19
►This is no accident. And this is what Principal Components is all about.
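The four numbers above can be cross-checked with a short Python sketch (NumPy assumed; the class itself works in Excel and SAS). It uses only the reported variances: the covariance is backed out from the equal-weights result rather than read from the data, so everything downstream inherits the rounding in the slide. The helper name `combo_variance` is made up for this illustration.

```python
import numpy as np

# Values reported on the slide (rounded to two decimals).
var_ltn, var_ltg = 15.37, 31.84
var_equal = 38.11      # Var(.707*LTN + .707*LTG)
var_unequal = 39.19    # Var(.50*LTN + .85*LTG)

# Back out Cov(LTN, LTG) from the equal-weights variance using
# Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y) with a = b = .707.
a = 0.707
cov = (var_equal - a**2 * (var_ltn + var_ltg)) / (2 * a**2)

# Implied 2x2 covariance matrix, variables ordered (LTN, LTG).
S = np.array([[var_ltn, cov],
              [cov, var_ltg]])

def combo_variance(weights, S):
    """Variance of the linear combination with the given weights: w' S w."""
    w = np.asarray(weights, dtype=float)
    return float(w @ S @ w)

# Predicted variance for the unequal weights; compare with var_unequal.
pred_unequal = combo_variance([0.50, 0.85], S)

# Among weight vectors of length 1, the largest achievable variance is the
# largest eigenvalue of S, attained at the matching eigenvector.
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
largest = eigvals[-1]
best_direction = eigvecs[:, -1]

print(cov, pred_unequal, largest, best_direction)
```

Under these assumptions the predicted unequal-weights variance agrees with the slide to within rounding, and the best direction weights LTG more heavily than LTN, in roughly the same proportion as (.50, .85). Finding that maximizing direction is the problem principal components solves.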

Encounter With SAS
►Save the helmet file to your hard disk.
►Exit Excel and start up SAS.
►Watch the demonstration on how to bring the Excel file into SAS.
►Repeat this yourself.

Encounter With SAS
►It is easy to transfer the AMD .txt data to Excel files. If you don’t know how and want to know, just ask.
►So you can always bring your data in as Excel files if you want.
►That is what I’ll do in front of the class.

Coming Up
Principal Components Analysis