Measurement and Scales Validity & Reliability Error.
Date posted: 22-Dec-2015
Measurement
Measurement and Measurement Scales
• Measurement: The process of assigning numbers or labels to things in accordance with specific rules to represent quantities or qualities of attributes.
• Rule: A guide, method, or command that tells a researcher what to do.
• Scale: A set of symbols or numbers constructed to be assigned by a rule to the individuals (or their behaviors or attitudes) to whom the scale is applied.
Types of Measurement Scales
• Nominal Scales: Scales that partition data into mutually exclusive and collectively exhaustive categories.
• Ordinal Scales: Nominal scales that can also order data.
• Interval Scales: Ordinal scales with equal intervals between points to show relative amounts; may include an arbitrary zero point.
• Ratio Scales: Interval scales with a meaningful zero point so that magnitudes can be compared arithmetically.
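[Figure: horse-race illustration of the four scale types]
• Nominal: (illustration not preserved)
• Ordinal: Win, Place, Show
• Interval: 1 length, 2 lengths
• Ratio: 40 to 1 long-shot pays $40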
Type of Scale | Numerical Operation | Descriptive Statistics
Nominal | Counting | Frequency; percentage; mode
Ordinal | Rank ordering | (plus…) Median; range; percentile
Interval | Arithmetic operations on intervals between numbers | (plus…) Mean; standard deviation; variance
Ratio | Arithmetic operations on actual quantities | (plus…) Geometric mean; coefficient of variation
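The table above can be made concrete with a small sketch. The data below are invented for illustration (the variable names and values are assumptions, not taken from the slides); the point is only that each scale type adds statistics to those already permitted by the scale before it.

```python
import statistics
from collections import Counter

# Hypothetical data, one variable per scale type.
department = ["sales", "ops", "sales", "hr", "sales"]   # nominal
rating = [1, 2, 2, 3, 3, 3, 1]                           # ordinal: 1=poor, 2=good, 3=excellent
temperature_c = [18.0, 21.0, 24.0, 21.0]                 # interval (arbitrary zero)
salary = [40_000.0, 50_000.0, 60_000.0]                  # ratio (meaningful zero)

# Nominal: counting only, so frequency, percentage and mode.
freq = Counter(department)
pct_sales = 100 * freq["sales"] / len(department)
mode_dept = statistics.mode(department)

# Ordinal: rank ordering adds median and range.
median_rating = statistics.median(rating)
range_rating = (min(rating), max(rating))

# Interval: equal intervals add mean, variance and standard deviation.
mean_temp = statistics.mean(temperature_c)
sd_temp = statistics.stdev(temperature_c)

# Ratio: a true zero adds geometric mean and coefficient of variation.
gmean_salary = statistics.geometric_mean(salary)
cv_salary = statistics.stdev(salary) / statistics.mean(salary)

print(pct_sales, mode_dept, median_rating, mean_temp, cv_salary)
```

Computing a mean of the nominal variable or a coefficient of variation of the Celsius temperatures would run without error in most tools, but the result would not be meaningful for that scale.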
Selecting appropriate univariate statistical method
Scale | Business Problem | Statistical question to be asked | Possible test of statistical significance
Nominal | Identify sex of key executives | Is the number of female executives equal to the number of male executives? | Chi-square test
Nominal | Indicate percentage of key executives who are male | Is the proportion of male executives the same as the hypothesized proportion? | T-test
Ordinal | Compare actual and expected evaluations | Does the distribution of scores for a scale with categories of poor, good, excellent differ from an expected distribution? | Chi-square test
Interval or ratio | Compare actual and hypothetical values of average salary | Is the sample mean significantly different from the hypothesized population mean? | Z-test (sample is large); T-test (sample is small)
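Two rows of the table can be sketched in a few lines. This is a minimal illustration with invented counts and salaries, using only the standard library; the function names are hypothetical, and the critical values (3.841 for chi-square with 1 degree of freedom, 2.776 for a two-tailed t with 4 degrees of freedom, both at the 5% level) are standard table values.

```python
import math
import statistics

def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def one_sample_t(sample, mu0):
    """t statistic for the null hypothesis that the population mean is mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error

# Nominal scale: 30 female and 70 male executives (hypothetical counts)
# tested against an equal split of 50 / 50.
chi2 = chi_square_gof([30, 70], [50, 50])
reject_equal_split = chi2 > 3.841      # df = 1, alpha = 0.05

# Interval or ratio scale: small sample of salaries in thousands (hypothetical)
# tested against a hypothesized population mean of 45.
salaries = [52.0, 48.0, 55.0, 50.0, 45.0]
t_stat = one_sample_t(salaries, mu0=45.0)
reject_mean_45 = abs(t_stat) > 2.776   # df = 4, two-tailed, alpha = 0.05

print(chi2, reject_equal_split, round(t_stat, 3), reject_mean_45)
```

With a large sample the same statistic would be compared against the normal distribution instead, which is the Z-test row of the table.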
30/10/02 12
Error in Survey Research
• Random Sampling Error (random error)
  • Error that results from chance variation.
  • Impact can be decreased by increasing sample size and through statistical estimation (confidence interval) or a "rule of thumb".
• Systematic Error (non-sampling error)
  • Error that results from the research design or execution.
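The effect of sample size on random sampling error can be sketched with the usual 95% margin of error for a proportion. The function name and the numbers are illustrative assumptions; the formula is the standard normal approximation, z * sqrt(p(1 - p) / n).

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

moe_100 = margin_of_error(0.5, 100)   # about 0.098, i.e. +/- 9.8 points
moe_400 = margin_of_error(0.5, 400)   # about 0.049, i.e. +/- 4.9 points

print(round(moe_100, 3), round(moe_400, 3))
```

Quadrupling the sample size halves the random error. Systematic error does not shrink this way: a biased design stays biased however many respondents are added.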
[Figure: classification of survey error]
Total Error
• Random Error
• Systematic Error
  • Administrative error: data processing; sample selection; interviewer error; interviewer cheating
  • Respondent error
    • Nonresponse error
    • Response bias
      • Deliberate falsification
      • Unconscious misrepresentation
      • Acquiescence bias; extremity bias; interviewer bias; auspices bias; social desirability bias
Types of Systematic Error
1. Administrative Error: Error that results from improper execution.
• Data Processing Error: Quality of data depends on quality of data entry. Use of verification procedures can minimize it.
• Sample Selection Error: Systematic error resulting from improper sampling techniques, either in design or execution.
• Interviewer Error: Data recorded incorrectly (error or selective perception).
• Interviewer Cheating: Mitigate by random checks.
2. Respondent Error: Humans interviewing humans...
• Non-response error: Statistical difference between a survey that includes only those who responded and a survey that also includes those who failed to respond.
  • Non-respondent: a person who is not contacted or who refuses to participate.
  • Self-selection bias: extreme positions are over-represented.
• Response bias: Errors that result from a tendency to answer in "a certain direction"; conscious or unconscious misrepresentation.
Types of response bias:
1. Deliberate falsification. Why would people deliberately falsify data?
  • To appear to be what they are not
  • They don't trust confidentiality
  • Protect
  • To end the interview quicker
  • "Average man effects"
2. Unconscious misrepresentation
Reasons for unconscious misrepresentation:
• Question format
• Question content
• Misunderstanding of the question, leading to a biased answer
• Lack of time to consider the answer fully
• Communication or semantic confusion
• Other
Types of response bias (continued):
• Acquiescence bias: individuals have a tendency to agree or disagree with all questions, or to indicate a positive/negative connotation.
• Extremity bias: results from response styles varying from person to person; some people tend to use extremes when responding to questions.
• Interviewer bias: bias in the responses of the subject due to the influence of the interviewer.
• Auspices bias: respondents are influenced by the organization conducting the study.
• Social desirability bias: caused by respondents' desire, either conscious or unconscious, to gain prestige or to appear in a different social role.
Correlation, Validity and Reliability
Why these three concepts?
• Reliability, Validity and Correlation are concepts that are easy to confuse because the numbers used to represent them are so similar.
• This is because Validity and Reliability are largely based on the Correlation statistic.
• Validity and Reliability are closely related.
What is Correlation?
• It is one way to measure the relationship between two variables.
• It answers questions like:
  • Is the relationship linear (straight-line)?
  • Does the value of y vary with the value of x, or vice versa?
  • How strong is the relationship? Do the points form a perfect line?
• To measure the relationship we calculate the Correlation Coefficient.
• Misconceptions:
  • A non-significant result does not mean there is no relationship; the relationship may simply not be linear.
  • The Correlation Coefficient does not measure the slope of the relationship.
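Both misconceptions can be checked numerically. The sketch below implements the Pearson product-moment formula by hand on invented data (the helper name pearson_r is an assumption for illustration): a perfect curved relationship gives r = 0, and two straight lines with very different slopes both give r = 1.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]

# y is completely determined by x, but the relationship is a curve, not a line.
r_curve = pearson_r(x, [v * v for v in x])

# Same strength of linear relationship, very different slopes.
r_shallow = pearson_r(x, [0.1 * v for v in x])
r_steep = pearson_r(x, [10 * v for v in x])

print(r_curve, round(r_shallow, 6), round(r_steep, 6))
```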
The Correlation Coefficient
• The Correlation Coefficient has the following attributes:
  • It can take a value in the range of -1 to +1.
  • It is dimensionless, i.e. its value is independent of the units of y and x.
  • Its value is independent of the measurement scales of x and y.
• Methods to measure the correlation are:
  • Spearman's rho (nonparametric, ordinal data)
  • Kendall's tau (nonparametric, ordinal data)
  • Pearson's (product-moment) correlation (parametric, interval or ratio data)
• Examples of values of the Correlation Coefficient (r): r = +1, r = 0, r ≈ -0.6 (scatter plots not reproduced).
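The difference between the parametric and rank-based coefficients can be sketched as follows. Spearman's rho is computed here as Pearson's r applied to ranks; the helper names and the data are illustrative assumptions, and the simple ranking function assumes there are no tied values.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotonic but not linear

r = pearson_r(x, y)              # below 1 because the points are not on a line
rho = spearman_rho(x, y)         # exactly 1 because the ordering is preserved

# Changing units (inches to centimetres) leaves r unchanged.
x_cm = [2.54 * v for v in x]
r_cm = pearson_r(x_cm, y)

print(round(r, 4), rho, round(r_cm, 4))
```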
What is Validity?
• Validity is concerned with whether we are measuring what we say we are measuring.
• A measure is valid when the differences in observed scores reflect true differences on the characteristic one is attempting to measure and nothing else: X_O = X_T (observed score equals true score).
• There are different kinds of validity.
• Most of these use the correlation coefficient as a measure.
What is Reliability?
• A measure is reliable to the extent that independent but comparable measures of the same trait or construct of a given object agree.
• In research, the term reliability means "repeatability" or "consistency".
• Reliability is a necessary but not sufficient condition for validity.
• A test is said to be reliable if it consistently yields the same results.
• Example: If the needle of the scale is five pounds away from zero, I always over-report my weight by five pounds. Is the measurement consistent? Yes, but it is consistently wrong! Is the measurement valid? No! (But if it under-reports my weight by five pounds, I will consider it a valid measurement.)
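The bathroom-scale example can be simulated. The true weight, the offset and the noise level below are invented for illustration; the sketch only shows that a small spread across repeated readings (reliability) says nothing about how far the readings sit from the true value (validity).

```python
import random
import statistics

random.seed(0)            # fixed seed so the run is repeatable

TRUE_WEIGHT = 150.0       # hypothetical true weight in pounds
OFFSET = 5.0              # needle starts five pounds away from zero
NOISE_SD = 0.2            # small random fluctuation per reading

readings = [TRUE_WEIGHT + OFFSET + random.gauss(0, NOISE_SD) for _ in range(1000)]

spread = statistics.stdev(readings)              # small: readings agree with each other
bias = statistics.mean(readings) - TRUE_WEIGHT   # close to 5: consistently wrong

print(round(spread, 2), round(bias, 2))
```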
Types of Validity
• Predictive Validity (criterion-related): Test scores should correlate with real-world outcomes. Example: GMAT scores predict university success.
• Convergent Validity: Test should correlate with other similar measures. Example: GMAT should correlate with other academic ability tests.
• Discriminant Validity: Test should not correlate with irrelevant tests. Example: GMAT should not correlate with political attitudes.
• Face Validity: Items look like they are covering the proper topics. Example: a math test should not have history items.
• Construct Validity: Can be measured by the correlation between the intended independent variable (construct) and the proxy independent variable (indicator, sign) that is actually used.
Validity vs. Reliability
• There are different conceptions of the relationship between Validity and Reliability, which developed over time.
• Churchill, for example, mentions in his article that if a measure is valid it is also reliable (pg. 65). This view has been contradicted in the more recent literature (Moss, 1994, who mentions that there can be validity without reliability).
• Illustrative example: Target Metaphor
Types of Reliability
• There are 4 types of Reliability:
  • Inter-Rater or Inter-Observer Reliability: Used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
  • Test-Retest Reliability: Used to assess the consistency of a measure from one time to another.
  • Parallel-Forms Reliability: Used to assess the consistency of the results of two tests constructed in the same way from the same content domain.
  • Internal Consistency Reliability: Used to assess the consistency of results across items within a test; sometimes referred to as homogeneity.
Ways to measure Internal Consistency Reliability
• There are different ways to measure Reliability:
  • Average Inter-item Correlation
  • Average Item-total Correlation
  • Split-Half Reliability
  • Cronbach's Alpha (α)
• Cronbach's alpha may be used to describe the reliability of factors extracted from dichotomous (that is, questions with two possible answers) and/or multi-point formatted questionnaires or scales (e.g., Likert scales).
• Cronbach's alpha measures how well a set of items (or variables) measures a single unidimensional construct.
• The theory behind it is that the observed score is equal to the true score plus the measurement error (Y = T + E).
• A reliable instrument should minimize the measurement error so that the error is not highly correlated with the true score.
Cronbach's Alpha Coefficient
• The Alpha Coefficient has the following attributes:
  • Alpha normally ranges in value from 0 to 1 (a negative value can occur when items are negatively correlated on average).
  • The higher the score, the more reliable the generated scale.
  • Nunnally (1978) has indicated 0.7 to be an acceptable reliability coefficient, but lower thresholds are sometimes used in the literature.
  • It is a common misconception that if the Alpha is low, it must be a bad test.
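As a minimal sketch of how the coefficient is obtained, the code below uses the standard variance form of Cronbach's alpha, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), on an invented 5-respondent, 3-item data set. The function name and the scores are assumptions for illustration only.

```python
import statistics

def cronbach_alpha(scores):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(scores[0])                                   # number of items
    items = list(zip(*scores))                           # one tuple per item
    item_variances = [statistics.variance(item) for item in items]
    total_variance = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical answers of five respondents to three 5-point Likert items.
scores = [
    [3, 4, 3],
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]

alpha = cronbach_alpha(scores)
print(round(alpha, 3))   # compare against the 0.7 rule of thumb
```

Items that rise and fall together across respondents push the variance of the total score well above the sum of the item variances, which is what drives alpha toward 1.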
Sorting
• Respondent indicates their attitudes or beliefs by arranging items.
• Example: Please sort the following cards with pictures of cookies into the following categories:
  • Like
  • Dislike
  • Neither like nor dislike
Decisions
• Ranking, sorting, rating or choice?
• How many categories or response positions?
• Balanced or unbalanced?
• Forced choice or nonforced choice?
• Single measure or index?