Types of variables
• response-explanatory distinction (dependent-independent, response-predictor)
response variables: regarded as random; explanatory variables: regarded as deterministic
causal relationship? not necessarily
• continuous-discrete distinction
according to whether a variable can take any value within an interval (yes → continuous; no → discrete)
Q: does continuous data really exist in the real world?
a better approach from the viewpoint of data analysis: classify according to the number of values a variable takes within the range where most of the data fall
variable takes many values → continuous (e.g., test scores); few values → discrete
Q: should Poisson data (infinitely many possible values) be treated as discrete or continuous from the data-analysis viewpoint?
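The data-analysis viewpoint above can be sketched in a few lines of Python. The function name `classify_by_distinct_values` and the cutoff of 10 distinct values are assumptions made for illustration; they are not part of the lecture notes, and any real cutoff is a judgment call.

```python
def classify_by_distinct_values(values, max_discrete_levels=10):
    """Label a variable by how many distinct values it takes in the observed data.

    max_discrete_levels is a hypothetical cutoff chosen for illustration only.
    """
    n_distinct = len(set(values))
    return "discrete" if n_distinct <= max_discrete_levels else "continuous"


# Test scores: integer-valued, but many distinct values are observed.
test_scores = [52, 61, 67, 70, 73, 75, 78, 80, 82, 85, 88, 91, 94, 97]
# Count data with a small mean: infinitely many values are possible,
# but only a handful actually occur.
small_counts = [0, 1, 0, 2, 1, 0, 3, 1, 0, 1, 2, 0, 1, 4]

score_label = classify_by_distinct_values(test_scores)   # "continuous"
count_label = classify_by_distinct_values(small_counts)  # "discrete"
```

Under this rule, count data with a large mean would tend to be labelled continuous, because many distinct counts are observed, while count data with a small mean would stay discrete.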
• quantitative-qualitative distinction
a continuous variable must be quantitative
a discrete variable could be quantitative/qualitative/both
• discrete (categorical) variables can be further classified into:
nominal variable: no natural ordering between categories (e.g., religious affiliation, mode of transportation, favorite type of music, …)
values that represent categories have no numeric meaning
NTHU STAT 5230, 2011 Lecture Notes
made by Shao-Wei Cheng (NTHU)
ordinal variable: there exists an ordering between categories (e.g., size of automobile, social class, political philosophy, patient condition, …)
distances between ordered categories are unknown
discrete interval variable: has numerical distances between any two values (e.g., functional life length of television set, length of prison term, …)
Sometimes the way a variable is measured determines its classification, e.g., education:
nominal when measured as public/private school
ordinal when measured as none/high school/bachelor/…
discrete interval variable when measured by # of years
hierarchy of measurement scales: discrete interval variable (highest) > ordinal > nominal (lowest)
statistical methods for variables of one type can be used with variables at higher levels, but not at lower levels
nominal variable → qualitative; discrete interval variable → quantitative; ordinal variable → both (fuzzy)
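A small Python sketch of this hierarchy, using the education example with made-up data. The variable names and values are assumptions for illustration, and pairing mode/median/mean with the three levels is one common textbook choice of a method per level, not something stated in these notes.

```python
import statistics

# Education measured three ways (hypothetical data, not matched across lists).
school_type = ["public", "private", "public", "public", "private"]          # nominal
degree = ["none", "high school", "bachelor", "high school", "high school"]  # ordinal
years = [9, 12, 16, 12, 12]                                                 # discrete interval

# Nominal-level method: the mode only needs category labels,
# so it can also be applied at the ordinal and interval levels.
mode_school = statistics.mode(school_type)
mode_degree = statistics.mode(degree)
mode_years = statistics.mode(years)

# Ordinal-level method: the median needs an ordering, so it is not
# available for school_type. Ranks here encode order only, not distance.
degree_order = ["none", "high school", "bachelor"]
ranks = sorted(degree_order.index(d) for d in degree)
median_degree = degree_order[ranks[len(ranks) // 2]]
median_years = statistics.median(years)

# Interval-level method: the mean needs numerical distances between values,
# so it is only applied to years.
mean_years = statistics.mean(years)
```

The mode is computed for all three versions of the variable, the median for the two ordered versions, and the mean only for the version with numerical distances, which mirrors the statement that methods for a lower level carry over to higher levels but not the reverse.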
• choice of statistical method/model for different types of variables (a rough classification):
only response variables, no explanatory variable
one response variable → univariate analysis
more than one response variable → multivariate analysis
both response and explanatory variables
response: regarded as random variable
continuous & normal → linear model
continuous but not normal (including exponential family, such as Weibull, Gamma, …) → generalized linear model
discrete → generalized linear model
explanatory: regarded as deterministic
continuous: coded using polynomial or other continuous transformations
discrete: coded using dummy variables
• some examples (from Agresti, 2002):
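As a side note on the two codings of explanatory variables listed above (this is not one of the Agresti examples), here is a minimal Python sketch. The helper names `dummy_code` and `polynomial_code`, the sample data, and the choice to drop one baseline level are all assumptions for illustration; dropping a baseline is one common convention among several.

```python
def dummy_code(values, baseline):
    """Code a discrete variable as 0/1 indicator columns, one per non-baseline level."""
    levels = [lv for lv in sorted(set(values)) if lv != baseline]
    rows = [[1 if v == lv else 0 for lv in levels] for v in values]
    return levels, rows


def polynomial_code(values, degree):
    """Code a continuous variable as columns x, x^2, ..., x^degree."""
    return [[x ** p for p in range(1, degree + 1)] for x in values]


# Hypothetical nominal explanatory variable: mode of transportation.
transport = ["bus", "car", "train", "car"]
levels, dummies = dummy_code(transport, baseline="bus")
# levels  -> ['car', 'train']
# dummies -> [[0, 0], [1, 0], [0, 1], [1, 0]]

# Hypothetical continuous explanatory variable, coded up to degree 2.
dose = [1.0, 2.0, 3.0]
poly = polynomial_code(dose, degree=2)
# poly -> [[1.0, 1.0], [2.0, 4.0], [3.0, 9.0]]
```

Each row of `dummies` or `poly` would become part of a row of the design matrix; the baseline level "bus" is represented by all zeros.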