Questionnaire Validity
Transcript of Questionnaire Validity
7/29/2019 Questionnaire Validity
The validity of a questionnaire relies first and foremost on reliability. If the questionnaire cannot be shown to be reliable, there is no discussion of validity.
But there is good news. Demonstrating validity is easy, compared to reliability. If you have
reached this point and have a reliable instrument for measuring the issues or phenomena you are
after, demonstrating its validity will not be difficult.
Validity refers to whether the questionnaire or survey measures what it intends to
measure. While there are very detailed and technical ways of proving validity that are beyond
the level of this discussion, there are some concepts that are useful to keep in mind. The
overriding principle of validity is that it focuses on how a questionnaire or assessment process is
used. Reliability is a characteristic of the instrument itself, but validity comes from the way the
instrument is employed.
The following ideas support this principle:
As nearly as possible, the data gathering should match the decisions you need to make. This means if you need to make a priority-focused decision, such as allocating resources or eliminating programs, your assessment process should be a comparative one that ranks the programs or alternatives you will be considering.

Gather data from all the people who can contribute information, even if they are hard to contact. For example, if you are conducting a survey of customer service, try to get a sample of all the customers, not just those who are easy to reach, such as those who have complained or have made suggestions.
A perfect example of a questionnaire that may have high reliability, but poor validity is a standardized questionnaire that has been used in hundreds of companies. These instruments are marketed aggressively using promises of "industry norms" to compare your results with. Weigh carefully the value of such comparisons against the almost certain lack of fit with your culture, philosophy and way of managing. A good diagnosis of your organization is not likely to come from a generic instrument with lots of normative comparisons.
If you're going after sensitive information, protect your sources. It has been said
that in the Prussian army at the turn of the century, decisions were made twice, once
when officers were sober, again when they were drunk. This concept acknowledges the
power of the "socially acceptable response" to questions or requests. Don't assume that a
simple statement printed on the questionnaire that "all individual responses will be kept
confidential" will make everybody relax and provide candid answers. Give respondents the
freedom to decide which information about themselves they wish to withhold, and employ
other administrative procedures, such as handing out Login IDs and Passwords separately
from the e-mail inviting people to participate in the survey.
Even reliable instruments, however, may not be valid if they are employed for situations they were not
designed for. A good example of a questionnaire that may have high reliability, but poor validity is
a standardized questionnaire that is used over and over in hundreds of companies. These
instruments are marketed aggressively using promises of "industry norms" to compare your
results with. Validity is not a characteristic of a particular instrument, attached to it in a
way that ensures it will always produce accurate information no matter where or when
it is used. If you want validity, you have to be able to demonstrate validity in your situation; it is
not built into the instrument.
But there is good news. Demonstrating validity is relatively straightforward, compared to
reliability. If you have reached this point and have a reliable instrument for measuring the issues
or phenomena you are after, demonstrating its validity will not be difficult.
How Do We Measure Validity?
While there are detailed and technical ways of establishing validity that are beyond the level of this
discussion, the following are brief descriptions of the three basic approaches. All proofs of validity
employ one or more of these methods:
Content Validity. If the content of a test or instrument matches an actual job or situation that is being studied, then the test has content validity. For example, a Training Needs Assessment for middle managers should have content (such as skills, activities and abilities) relevant to the jobs of middle managers. Skills that pertain to landscaping workers would not be appropriate in a needs assessment instrument for managers.

Predictive Validity. This form of validity comes from an instrument's ability to predict an outcome or event in the future. If a questionnaire or instrument is developed to assess promotion potential of a group of newly hired workers, the results of the test should be able to predict which of the group will actually be promoted. The predictive validity of the instrument is shown in the correlation between the scores from the test and the persons promoted.
Construct Validity. This form of validity derives from the correlation between the test or questionnaire and another instrument or process that measures the same construct. The Myers-Briggs Type Indicator (MBTI) is a well-established test of personality types. A new instrument developed to assess the same characteristics would have construct validity if the scores from the new instrument correlated highly with the scores from the MBTI.
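As a rough illustration of how predictive or construct validity might be quantified, a correlation between the new instrument and a reference measure can be computed. The scores below are invented for demonstration, and a real validation study would also assess statistical significance:

```python
# Illustrative sketch: quantifying predictive/construct validity as a
# Pearson correlation between instrument scores and a reference measure.
# All score values are invented for demonstration purposes only.

from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores: new instrument vs. an established measure
new_instrument = [12, 15, 9, 20, 17, 11, 14]
established    = [14, 16, 10, 19, 18, 12, 13]

r = pearson_r(new_instrument, established)
print(f"correlation with established measure: r = {r:.2f}")
```

A high correlation (conventionally somewhere above 0.7, though the acceptable threshold depends on the field) would support the claim of construct validity.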
These are the methods for proving validity. Providers of questionnaires and surveys who are
unable or unwilling to talk about validity in those terms (content, predictive, or construct)
should be avoided. You will sometimes hear discussions of face validity. Despite the use of the
term, face validity is not a form of validity assessment. It is simply a subjective appraisal of
how an instrument appears to a person who examines it. There are numerous examples of valid
instruments without face validity and completely bogus instruments with loads of face validity.
Reliability and Validity in Questionnaire Design

In today's world organisations need strategic goals and targets, and clear measurements are needed to assess progress towards these goals. Some of these targets are easy to define and the measurements are clear cut, particularly certain financial goals, production and quality control targets. However some of the most vital aspects of a well-functioning organisation are more complex to measure. For example, the climate and culture of an organisation is known to be central to optimising employee wellbeing, productivity and innovation. Similarly, it is important to select executives or employees with certain character traits and dynamics for them to function effectively in their roles. Unlike annual income or production, which can be directly measured, many of the psychological aspects of an organisation are intangible constructs and can only be measured indirectly.
The classic example of an intangible construct is Intelligence Quotient (IQ). Most of us agree that there is such a thing as intelligence and that some people have more of it than others! But unlike height or weight it can't be measured with a tape-measure or a set of bathroom scales.
Figuring Out What You Want To Measure
Often the first step in measuring an intangible construct is coming up with an Operational Definition. This means defining what the construct is, what it's comprised of and what measures it. This stage tends to include a review of previous research on the topic to identify what is known about the subject and how people have tried to measure it in the past.
In this type of work, our clients usually have a model of what makes up their construct, or we can help them develop one. As a fictitious example, they might want to measure Organisational Effectiveness and they hypothesise that it is made up of four organisational traits: Morale, Innovation, Management and Teamwork. In this case, each of the four traits needs to be measured. Questionnaires are generally used to collect this type of information. For example, a good design might be a questionnaire with six questions about each of the traits. The responses from the six questions about each trait will later be aggregated to give a measurement of Morale, Innovation, Management and Teamwork.
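The aggregation step described above can be sketched as follows. The trait names come from the hypothetical model in the text; the Likert-style responses and the choice of the mean as the aggregate are assumptions made for illustration:

```python
# Sketch of aggregating per-question responses (assumed 1-5 Likert scale)
# into trait scores, following the hypothetical four-trait model.
# The responses below are invented for illustration.

from statistics import mean

traits = ["Morale", "Innovation", "Management", "Teamwork"]

# One respondent's answers: six questions per trait
responses = {
    "Morale":     [4, 5, 3, 4, 4, 5],
    "Innovation": [3, 3, 4, 2, 3, 3],
    "Management": [5, 4, 4, 5, 4, 4],
    "Teamwork":   [4, 4, 5, 4, 3, 4],
}

# Aggregate each trait's six answers into a single score (here, the mean)
trait_scores = {t: mean(responses[t]) for t in traits}
for t, score in trait_scores.items():
    print(f"{t}: {score:.2f}")
```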
After defining the construct and its components (traits), and producing questions to measure each of these, a testing stage is strongly recommended. The aim of testing is to ensure that the questions are measuring what they are intended to: that is, that they produce a reliable and valid measurement.
Reliability
Reliability means the consistency or repeatability of the measure. This is especially important if the measure is to be used on an on-going basis to detect change. There are several forms of reliability, including:

Test-retest reliability - whether repeating the test/questionnaire under the same conditions produces the same results; and

Reliability within a scale - that all the questions designed to measure a particular trait are indeed measuring the same trait.
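One common statistic for reliability within a scale is Cronbach's alpha (the text does not name a specific statistic, so treat this as one possible choice). A minimal sketch with invented response data:

```python
# Minimal sketch of Cronbach's alpha, a common measure of within-scale
# reliability. Rows of each inner list are respondents' answers to one item.
# The response matrix below is invented for illustration.

from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list of respondent scores per questionnaire item."""
    k = len(item_scores)
    item_vars = sum(pvariance(item) for item in item_scores)
    totals = [sum(row) for row in zip(*item_scores)]  # per-respondent totals
    return (k / (k - 1)) * (1 - item_vars / pvariance(totals))

# Four items answered by five respondents
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 4, 2, 4, 1],
    [4, 5, 3, 4, 2],
]
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```

Values near 1 indicate that the items move together and plausibly measure the same trait; a commonly cited rule of thumb treats roughly 0.7 and above as acceptable, though conventions vary.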
Validity
Validity means that we are measuring what we want to measure. There are a number of types of validity including:
Face Validity - whether at face value, the questions appear to be measuring the construct. This is largely a common-sense assessment, but also relies on knowledge of the way people respond to survey questions and common pitfalls in questionnaire design;

Content Validity - whether all important aspects of the construct are covered. Clear definitions of the construct and its components come in useful here;

Criterion Validity/Predictive Validity - whether scores on the questionnaire successfully predict a specific criterion. For example, does the questionnaire used in selecting executives predict the success of those executives once they have been appointed; and

Concurrent Validity - whether results of a new questionnaire are consistent with results of established measures.
Validating a Model
Going back to our hypothetical example, the client has a model of Organisational Effectiveness that is made up of four organisational traits: Morale, Innovation, Management and Teamwork. They also have a questionnaire with questions that are intended to measure each of these traits. However, as they are using the questionnaire to infer levels of Morale, Innovation, Management and Teamwork, it is important to assess whether the results are consistent with this model being accurate. There are a number of statistical methods available to test whether the data collected using the questionnaire supports the model, or whether either the questionnaire or the model needs revision or development. Principal components analysis and exploratory or confirmatory factor analysis are among the statistical techniques often used to assess a model.
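As a hedged sketch of how principal components analysis can probe whether questionnaire items group the way a model predicts, the following uses synthetic data with two built-in factors; a real analysis would run on actual survey responses and typically use dedicated factor-analysis tooling:

```python
# Sketch of a principal components analysis as one way to check whether
# items cluster as a model predicts. Data are randomly generated and
# purely illustrative, not real survey responses.

import numpy as np

rng = np.random.default_rng(0)

# Fake responses: 200 respondents x 8 items, where items 0-3 share one
# underlying factor and items 4-7 share another, plus item-level noise.
factor_a = rng.normal(size=(200, 1))
factor_b = rng.normal(size=(200, 1))
noise = rng.normal(scale=0.5, size=(200, 8))
data = np.hstack([factor_a.repeat(4, axis=1),
                  factor_b.repeat(4, axis=1)]) + noise

# Eigendecomposition of the correlation matrix yields the components.
corr = np.corrcoef(data, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
explained = eigvals / eigvals.sum()

print("proportion of variance per component:", np.round(explained, 2))
# With two underlying factors, the first two components should dominate;
# if they did not, the two-factor model (or the items) would need revision.
```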
These techniques can often provide a deeper understanding of the issues being surveyed, and can reveal that questions are measuring more or less than they were intended to. For example, many years ago Data Analysis Australia staff were assisting a client with survey data relating to occupational health and safety (OHS) issues. One of the questions might be paraphrased as "My supervisor puts my health and safety above productivity", which was created to measure OHS issues. However, analysis revealed responses to this question instead related mainly to the first words "my supervisor", and showed more about industrial relations than OHS.
Another benefit of using techniques such as factor analysis to assess a questionnaire is improved efficiency. We are often able to advise clients on ways in which they can reduce the length of their questionnaires while maintaining or increasing the information that can be obtained. Reducing the number of questions in an overly lengthy questionnaire makes it easier for respondents to complete, and increases response rates.
Generalisability and Confounding Issues
In testing the questionnaire, the test sample is also important. For example, IQ tests were used incorrectly in the US many years ago on migrants with limited English; in this case they received poor scores, but the test was inadvertently measuring their ability to read and respond to a test written in English rather than their actual IQ. There are two important lessons that can be taken from this example. The first is that other issues that alter our results can pop up in research if we don't give sufficient thought to what we are really measuring. As in the OHS example earlier, even a question that appears fine on the surface can be confounded by other issues in some cases.
The second lesson is to be cautious in generalising results to other groups. If a questionnaire is designed for a specific group it is important to test it on a representative group. A questionnaire that will be used for assessing Board members should be tested on current/prospective Board members if these are the people that the information is required for. If the questionnaire is to be used on many different groups of people, it's important to test it on the different groups it will be used for to ensure it is valid in all its intended usages.
Which of These Issues Do I Need to Consider For My Questionnaire?
The type of reliability and validity issues that need to be considered vary from one situation to the next, depending on what the questionnaire is measuring and its intended use. There are a range of statistical procedures designed to test reliability and validity. In addition, specific survey designs may be necessary to ensure that the required information is available to establish some of the more complex types of validity or reliability.
A number of Data Analysis Australia's clients work in specialist areas in which a small number of rigorously tested survey products form their core business. For these questionnaires in particular, attending to issues of reliability and validity is important to ensure their products are of a high quality. Ongoing research and development of the survey products allows clients to maintain an edge in the marketplace.
For simpler surveys where a questionnaire is gathering information that only needs to be used in a practical way rather than an inferential way, the reliability and validity requirements are more basic. However, even in these situations, it is important to make sure consideration is given to whether the survey is measuring what it should be.
Personality Questionnaire Validity and Reliability
Synopsis
December, 1998
Overall: We estimate that the Personality Questionnaire will indicate an English-speaking adult's personality type accurately 85% of the time in a non-controlled (i.e. over the internet) environment.
Resulting types are repeatable 75% of the time in a non-controlled environment, and 95% of the
time in a controlled environment. Unless otherwise noted, these statistics were generated from a
set of 100,000 subjects.
1st Validation Technique: Best Approach Method. This method incorporates Content and
Criterion-related validity assessment. Best Approach was used during the development of the
Personality Questionnaire to ensure that we were starting with a good basic indicator. We were
assured of content validity by ensuring that the creator of the Personality Questionnaire was an
expert on psychological type, and on determining which behaviors were attributed to which
personality functions and attitudes.
2nd Validation Technique: Comparison Method. This method was used during first-phase and
second-phase testing and validation of the Personality Questionnaire. It was used primarily to
validate the end-results of the Personality Questionnaire. We compared Personality Questionnaire
results against the results of other well-known instruments, namely the MBTI and Keirsey's Temperament Sorter. Our goal was to produce the same type as these comparable indicators at least 75% of the time in an uncontrolled environment. We released the questionnaire once we achieved this goal. Another revision after release brought us up to 85% matching.
3rd Validation Technique: Averages Method. We used this method to validate the individual
questions that make up the Personality Questionnaire. This was used periodically throughout the
implementation and revision of the questionnaire.
For this method, we checked that answers to specific questions fell within ten percentage points of expected norms. Expected norms of the general population are as follows: 60% Extraverted, 40% Introverted, 75% Sensing, 25% Intuitive, 50% Thinking, 50% Feeling, 50% Judging, 50% Perceiving. Using 5000 questionnaire results, we checked that each question rendered results within 10 percentage points
of these expected norms. If the question did not meet these standards, it was revised and re-tested
until it did.
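The ten-point norms check described above can be sketched as follows; the norms are those quoted in the text, while the per-question preference splits are invented for illustration:

```python
# Sketch of the "Averages Method" check: each question's observed response
# split should fall within ten percentage points of the expected population
# norms quoted in the text. Observed figures below are invented.

EXPECTED_NORMS = {
    "Extraverted": 60, "Introverted": 40,
    "Sensing": 75, "Intuitive": 25,
    "Thinking": 50, "Feeling": 50,
    "Judging": 50, "Perceiving": 50,
}

TOLERANCE = 10  # percentage points

def outside_norms(observed):
    """Return the preferences whose observed percentage deviates from the
    expected norm by more than TOLERANCE (i.e. the question needs revision)."""
    return [pref for pref, pct in observed.items()
            if abs(pct - EXPECTED_NORMS[pref]) > TOLERANCE]

# Hypothetical splits for two questions, each from 5000 responses
ok_question = {"Extraverted": 52, "Introverted": 48}
bad_question = {"Extraverted": 45, "Introverted": 55}

print("revise:", outside_norms(ok_question))   # both within 10 points -> []
print("revise:", outside_norms(bad_question))  # both deviate by 15 points
```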
Reliability Technique: All reliability data was determined via Repetition.