Measuring abstract concepts: Latent Variables and Factor Analysis.
![Page 1: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/1.jpg)
Measuring abstract concepts: Latent Variables and Factor Analysis
![Page 2: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/2.jpg)
Correlation as the shape of an ellipse of plotted points
[Figure: three scatterplots]
High correlation (people’s arm & leg lengths?)
Lower correlation (arm length & body weight?)
No correlation (arm length & income?)
Correlation shows how accurately you can predict the score on a second variable if you are told the first. It suggests that
there may be some underlying connection: growth, for example.
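As a rough illustration of the point above, the Pearson correlation coefficient can be computed directly from paired scores. This is a minimal sketch; the arm and leg measurements are invented for illustration and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements in cm (assumed values, for illustration only).
arm = [60, 62, 65, 68, 70, 73, 75]
leg = [80, 83, 86, 91, 93, 97, 100]

r = pearson_r(arm, leg)
# r squared is the share of variance in one variable that a straight-line
# prediction from the other variable accounts for.
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```

The closer r is to 1 (or -1), the narrower the ellipse of plotted points and the better the prediction of one score from the other.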
![Page 3: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/3.jpg)
Multiple dimensions
We can show correlations among 3 variables (e.g. length of your arm & leg, and head circumference). If they are correlated, the diagram becomes an ellipsoid. It has a central axis running through it, forming a single summary indicator of the latent variable (size).
Mathematically, we can also summarize correlations between more than 3 dimensions (but I can’t draw it).
[Figure: ellipsoid with axes labeled head circumference, arm, leg]
![Page 4: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/4.jpg)
General measurement approach
[Diagram: Health and its components ϕ, ψ, social]
Conceptual model
Selection of indicators (sampling)
Scoring system, e.g. ϕ × 2 + ψ × 1.2 + soc.
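The scoring step can be sketched as a weighted sum. The weights below are the ones in the slide's example (2, 1.2 and an implied 1 for the social component); the dictionary keys and the person's indicator values are assumptions made up for illustration.

```python
# Weights from the slide's example scoring system; key names are assumed.
WEIGHTS = {"phi": 2.0, "psi": 1.2, "social": 1.0}

def health_score(indicators):
    """Weighted sum of component scores: phi x 2 + psi x 1.2 + social."""
    return sum(WEIGHTS[name] * indicators[name] for name in WEIGHTS)

# Hypothetical component scores for one person.
person = {"phi": 3.0, "psi": 5.0, "social": 4.0}
score = health_score(person)
print(score)
```

Changing the weights changes the ranking of people, which is why the weights need to follow from the conceptual model rather than be chosen arbitrarily.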
![Page 5: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/5.jpg)
Possible hierarchies
[Diagrams: two alternative hierarchies relating ϕ, ψ, social to Health, each marked “?”]
In a multi-level construct we need to specify how the different levels relate to each other.
This comes entirely from a conceptual approach: there is no empirical way to assert one or the other model.
![Page 6: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/6.jpg)
[Diagram labels: Indicators; Latent trait]
Indicators inter-correlate because they all reflect a common (latent) trait.
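A small simulation can make this concrete. It is only a sketch under stated assumptions: the latent trait and the noise are drawn from standard normal distributions, and each indicator is the trait plus its own independent noise. None of these numbers come from the slides.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)
n_people = 2000

# The latent trait is never observed directly (simulated here by assumption).
latent = [random.gauss(0, 1) for _ in range(n_people)]

# Two indicators that each reflect the trait plus their own noise.
indicator_a = [t + random.gauss(0, 1) for t in latent]
indicator_b = [t + random.gauss(0, 1) for t in latent]

# A measure that does not reflect the trait at all.
unrelated = [random.gauss(0, 1) for _ in range(n_people)]

r_ab = pearson_r(indicator_a, indicator_b)
r_a_unrelated = pearson_r(indicator_a, unrelated)
print(f"indicator A vs B: {r_ab:.2f}; indicator A vs unrelated: {r_a_unrelated:.2f}")
```

The two indicators never influence each other in this simulation, yet they correlate because both depend on the same latent values.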
![Page 7: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/7.jpg)
Modeling the link between manifest (measured) and latent (inferred) variables
[Diagram labels: Income; Expenditure; Indicator scores; Health (probability model)]
For variables like income & expenditure we can give a relatively fixed model.
For health there is more variation between people, so a less precise model.
![Page 8: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/8.jpg)
Principal Components analysis
• Translates a complex system of correlations between many variables into fewer underlying dimensions (or ‘principal components’).
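• Charles Spearman’s 1904 work, the starting point of factor analysis, aimed to identify a simpler underlying structure in large matrices of correlations between measures of mental abilities. (Principal components analysis itself is usually credited to Karl Pearson, 1901, and Harold Hotelling, 1933.)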
• Later greatly misused in ‘defining’ intelligence.
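One way to see how a set of correlations collapses into a principal component is to extract the largest eigenvalue and eigenvector of the correlation matrix. The sketch below uses power iteration, chosen only because it needs no external library; the 3 × 3 correlation matrix is hypothetical.

```python
import math

def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def first_component(corr, iterations=200):
    """Largest eigenvalue and unit eigenvector of a correlation matrix."""
    v = [1.0] * len(corr)
    for _ in range(iterations):
        w = mat_vec(corr, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(corr, v)))
    return eigenvalue, v

# Hypothetical correlations among three tests (assumed values).
corr = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]

eigenvalue, weights = first_component(corr)
# The trace of a correlation matrix equals the number of variables,
# so eigenvalue / number of variables is the share of variance explained.
share = eigenvalue / len(corr)
print(f"first component explains {share:.0%} of the variance")
```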
![Page 9: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/9.jpg)
[Diagram labels: Common variance (what we are trying to measure); Unique variance in this item (irrelevant bias in the measurement); +; -]
Spearman’s 1904 core idea: each item contains some common (shared) variance plus some specific variance.
The latter (red circles) sometimes raises and sometimes lowers the score, so it cancels out if you have enough items.
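The cancelling-out of specific variance can be illustrated by simulation. This is a sketch under assumed conditions: each item equals the person's true score plus independent normal noise with standard deviation 2, and the scale score is the mean of the items. The sample size and noise level are arbitrary choices.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def scale_score(true_score, n_items, noise_sd=2.0):
    """Mean of n_items items, each the true score plus item-specific noise."""
    items = [true_score + random.gauss(0, noise_sd) for _ in range(n_items)]
    return sum(items) / n_items

random.seed(42)
true_scores = [random.gauss(0, 1) for _ in range(1000)]

# Correlation between the scale score and the true score, by scale length.
accuracy = {}
for n_items in (1, 4, 16):
    observed = [scale_score(t, n_items) for t in true_scores]
    accuracy[n_items] = pearson_r(true_scores, observed)
    print(f"{n_items:2d} items: r with true score = {accuracy[n_items]:.2f}")
```

With more items the positive and negative item-specific deviations average towards zero, so the scale score tracks the common variance more closely.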
![Page 10: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/10.jpg)
One principal component
• Red lines show scores on 8 tests as vectors
• The cosine of the angle between two vectors represents their correlation: if 2 vectors coincide, the correlation is perfect (cos 0° = 1.0)
• Principal component 1 resolves most of the variance in the 8 measures: it’s the best fit, or grand average.
[Diagram: the 8 test vectors with axis 1]
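The link between angles and correlations can be checked numerically: once each set of scores is centered on its mean, the cosine of the angle between two score vectors is the Pearson correlation. The test scores below are hypothetical.

```python
import math

def centered(scores):
    """Subtract the mean so the vector starts at the origin."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def angle_degrees(cos_value):
    """Angle for a cosine, clamped to guard against rounding error."""
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))

# Hypothetical scores of five people on three tests (assumed values).
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]
test_c = [2, 1, 0, 1, 2]

cos_ab = cosine(centered(test_a), centered(test_b))
cos_ac = cosine(centered(test_a), centered(test_c))
print(f"A vs B: r = {cos_ab:.3f}, angle = {angle_degrees(cos_ab):.1f} degrees")
print(f"A vs C: r = {cos_ac:.3f}, angle = {angle_degrees(cos_ac):.1f} degrees")
```

Tests A and C are uncorrelated in this made-up data, so their vectors sit at right angles.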
![Page 11: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/11.jpg)
Dimensionality & Rotation
The principal component is that which accounts for the most variance; this depends on the conceptual shape of the latent trait being measured. For Chile, one dimension will account for most of the variance in distance between cities; for HK a more complex model is required. To find the dominant dimension with the maximal variation, axes need to be rotated.
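The geographic analogy can be sketched with two invented sets of map coordinates: one strung out along a line and one spread in a compact cluster. These are not real city locations. For two dimensions, the share of variance captured by the best-fitting (rotated) axis follows from the eigenvalues of the 2 × 2 covariance matrix.

```python
import math

def first_axis_share(points):
    """Share of total variance along the best-fitting axis of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Largest eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    spread = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    largest = (sxx + syy + spread) / 2
    return largest / (sxx + syy)

# Hypothetical coordinates (assumed, not real places).
elongated = [(0, 0), (1, 10), (0, 20), (2, 30), (1, 40), (0, 50)]
compact = [(0, 0), (4, 1), (1, 4), (5, 5), (2, 2), (3, 6)]

share_elongated = first_axis_share(elongated)
share_compact = first_axis_share(compact)
print(f"elongated layout: {share_elongated:.1%} on one axis")
print(f"compact layout:   {share_compact:.1%} on one axis")
```

For the elongated layout a single dimension is nearly sufficient; for the compact one a second dimension carries a sizeable share of the variance.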
![Page 12: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/12.jpg)
Variance ‘explained’
Here 2 vectors, B & C, are only partially correlated.
The resolving power of the principal component is shown by comparing the length of a vector (B or C) with its projection onto the axis (Bʹ, Cʹ).
Here, axis 1 ‘explains’ more variance for B than for C (Bʹ > Cʹ).
A second (horizontal) component may be required for C: axis 2 resolves much of the variance in C, but very little for B.
[Diagram labels: Principal axis; Second axis; vectors B and C with projections Bʹ and Cʹ]
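The projection argument can be worked through numerically. In this sketch B and C are unit-length vectors at assumed angles of 20° and 70° from a vertical principal axis; the angles are illustrative and are not read off the slide. The squared projection is the proportion of a vector's variance that an axis explains.

```python
import math

def unit_vector(degrees_from_vertical):
    """Unit vector at the given angle from the vertical axis, as (x, y)."""
    angle = math.radians(degrees_from_vertical)
    return (math.sin(angle), math.cos(angle))

def projection(vector, axis):
    """Length of the projection of a vector onto a unit axis."""
    return vector[0] * axis[0] + vector[1] * axis[1]

principal_axis = (0.0, 1.0)  # vertical, axis 1
second_axis = (1.0, 0.0)     # horizontal, axis 2

b = unit_vector(20)  # assumed angle for B
c = unit_vector(70)  # assumed angle for C

b1, c1 = projection(b, principal_axis), projection(c, principal_axis)
b2, c2 = projection(b, second_axis), projection(c, second_axis)

for name, p1, p2 in (("B", b1, b2), ("C", c1, c2)):
    print(f"{name}: axis 1 explains {p1 ** 2:.0%}, axis 2 explains {p2 ** 2:.0%}")
```

Because the two axes are perpendicular, the two squared projections for each vector add up to its full variance.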
![Page 13: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/13.jpg)
Thurstone’s 1930 multi-factor idea: each item contains some common variance plus several types of unique variance.
The latter (colored circles) can compose an additional factor being measured, or just random ‘error’.
[Diagram labels: Common variance (what we are trying to measure); Unique variance in this item (irrelevant bias in the measurement); Second theme in the measurement; +; -]
![Page 14: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/14.jpg)
Factor Loadings and Validity
In the second example, the latent variable is more strongly reflected in the item; it has a higher loading on the variable and is a purer indicator of the underlying variable.
The blue rectangle represents the contribution of the latent variable to the indicator. The green segment represents the contribution of other latent variables; the red section shows all other sources of variance (error, etc).
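The three colored segments can be expressed as a variance decomposition. This sketch assumes a standardized item (total variance 1) and two uncorrelated factors, in which case each squared loading is the share of item variance due to that factor. The loadings are hypothetical.

```python
def decompose(loading_target, loading_other):
    """Split a standardized item's variance into three parts.

    Assumes uncorrelated factors, so squared loadings add up.
    """
    target = loading_target ** 2   # blue: the latent variable of interest
    other = loading_other ** 2     # green: other latent variables
    residual = 1.0 - target - other  # red: error and everything else
    return target, other, residual

# Hypothetical loadings for two items (assumed values).
first_item = decompose(0.5, 0.4)
second_item = decompose(0.9, 0.2)

for label, parts in (("first item", first_item), ("second item", second_item)):
    target, other, residual = parts
    print(f"{label}: target {target:.2f}, other factors {other:.2f}, residual {residual:.2f}")
```

The second item, with the higher loading, has most of its variance in the first segment and is therefore the purer indicator.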
![Page 15: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/15.jpg)
Example of a two-factor solution (here related to concepts in the Health Belief Model)
Source: K.S. Lewis, PhD thesis, “An examination of the Health Belief Model when applied to Diabetes Mellitus”, University of Sheffield, 1994.
![Page 16: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/16.jpg)
Solution with rotated axes
[Diagram: axis 1 with anxiety items and depression items]
Using factor 1 alone = general mental health factor?
[Diagram: axes 1 and 2, labeled anxiety factor and depression factor]
Using 2 factors clarifies different groups, but neither explains substantial variance
![Page 17: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/17.jpg)
To rotate or not to rotate?
• Dimensions are traditionally shown perpendicular to each other: independent & uncorrelated (measures of different things should not be confounded).
• Applied to the example of anxiety & depression there are various options:
1. as they are both facets of mental distress, they could be summarized along a single factor
2. perhaps it is diagnostically useful to keep anxiety & depression conceptually distinct: 2 orthogonal factors. If so, our indicators are not terribly good (low variance explained)
3. anxiety & depression often co-occur, so in reality are correlated; the axes could therefore be rotated obliquely to resolve the maximum variance (next slide)
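The trade-off between options 1 and 2 can be sketched with an orthogonal rotation of a two-factor loading matrix. The loadings are hypothetical and the 45° angle is chosen by hand for illustration; it is not the output of a rotation criterion such as varimax.

```python
import math

def rotate(loadings, degrees):
    """Rotate 2-factor loadings by an angle, keeping the axes perpendicular."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def communalities(loadings):
    """Variance of each item explained by both factors together."""
    return [a * a + b * b for a, b in loadings]

def variance_by_factor(loadings):
    """Sum of squared loadings on each factor."""
    return (sum(a * a for a, _ in loadings), sum(b * b for _, b in loadings))

# Hypothetical unrotated loadings: two anxiety items, then two depression items.
unrotated = [(0.6, 0.5), (0.7, 0.4), (0.6, -0.5), (0.7, -0.4)]
rotated = rotate(unrotated, 45)

before = variance_by_factor(unrotated)
after = variance_by_factor(rotated)
print("variance by factor before rotation:", [round(v, 2) for v in before])
print("variance by factor after rotation: ", [round(v, 2) for v in after])
print("rotated loadings:", [(round(a, 2), round(b, 2)) for a, b in rotated])
```

Before rotation every item loads on factor 1, which acts as a general factor. After rotation the anxiety items load mainly on one axis and the depression items on the other, the variance is split evenly, and each item's communality is unchanged.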
![Page 18: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/18.jpg)
Oblique rotation
Allow the axes to correlate
• Resolves more variance
• But does not create conceptually independent entities
Do you like this approach?
[Diagram labels: Anxiety factor; Depression factor]
![Page 19: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/19.jpg)
An example of turning principal components analysis results into linear modeling (LISREL): the Health Belief Model.
Source: Cao Z-J, Chen Y, Wang S-M. BMC Public Health 2014, 14:26
![Page 20: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/20.jpg)
Cautions to ponder…
• Correlations between measures do not prove that they record anything concrete.
• Test scores may or may not result from (or be caused by) the underlying factor.
• The principal component is a mathematical abstraction; it may not represent anything real
– (correlate your age for successive years with the population of Mexico, the weight of your pet turtle, the price of cheese and the distance between any 2 galaxies: this will produce a strong principal component).
– Rotating the axes causes the principal component to disappear, so it has no reality
• We cannot declare that a factor represents an underlying reality (intelligence or health, etc.) unless we have clear evidence from other sources.
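The caution about unrelated quantities that merely share a time trend can be demonstrated by simulation. Every series below is invented: each is a straight-line trend over 30 years plus its own independent noise, with arbitrary starting values and slopes. The first principal component is found by power iteration on the correlation matrix, used here only to avoid external libraries.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def largest_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric matrix by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(a * b for a, b in zip(row, v)) for row in matrix]
    return sum(a * b for a, b in zip(v, mv))

random.seed(1)
years = range(30)

# Simulated, causally unrelated series that all happen to rise over time.
series = [
    [30 + y for y in years],                               # your age
    [100 + 1.5 * y + random.gauss(0, 3) for y in years],   # a population
    [0.5 + 0.02 * y + random.gauss(0, 0.04) for y in years],  # turtle weight
    [4 + 0.1 * y + random.gauss(0, 0.2) for y in years],   # price of cheese
    [1000 + 5 * y + random.gauss(0, 10) for y in years],   # a galactic distance
]

corr = [[pearson_r(a, b) for b in series] for a in series]
share = largest_eigenvalue(corr) / len(series)
print(f"first principal component explains {share:.0%} of the variance")
```

The component is strong because every series rises with time, not because the quantities share any underlying trait.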
![Page 21: Measuring abstract concepts: Latent Variables and Factor Analysis.](https://reader037.fdocuments.us/reader037/viewer/2022110320/56649cd75503460f9499e733/html5/thumbnails/21.jpg)
Questions to debate
• Would you use a 1- or a 2-factor solution for anxiety & depression questions?
– What sort of rotation?
• What type of evidence could demonstrate that your presumed health measures really do measure health?
• Should we ever use oblique rotation?