Reliability or Validity


Transcript of Reliability or Validity

Page 1: Reliability or Validity

Reliability or Validity

Reliability gets more attention:
  Easier to understand
  Easier to measure
  More formulas (like stats!)
  Base for validity

Page 2: Reliability or Validity

Need for validity

Does test measure what it claims?
Can test be used to make decisions?

Page 3: Reliability or Validity


Reliability is a necessary, but not a sufficient condition for validity.

Page 4: Reliability or Validity

Validity: a definition

“A test is valid to the extent that inferences made from it are appropriate, meaningful, and useful”

Standards for Educational and Psychological Testing, 1999

Page 5: Reliability or Validity

“Face Validity”

“looks good to me”!!!!!!!

Page 6: Reliability or Validity

Trinitarian view of Validity

Content (meaning)
Construct (meaning)
Criterion (use)

Page 7: Reliability or Validity

1) Content Validity

“How adequately a test samples behaviors representative of the universe of behaviors the test was designed to measure.”

Page 8: Reliability or Validity

Determining Content Validity

Describe the domain
Specify areas to be measured
Compare test to domain

Page 9: Reliability or Validity

Content Validity Ratio (CVR)

Agreement among raters on whether an item is:
  Essential
  Useful but not essential
  Not necessary
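
One common way to turn these ratings into a single number is Lawshe's content validity ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of raters who mark the item "Essential" and N is the total number of raters. The sketch below assumes that formula; the function name, the rating labels, and the sample panel are made up for illustration and do not come from the slides.

```python
def content_validity_ratio(ratings):
    """Lawshe's CVR for a single item.

    ratings: one string per rater, each "essential", "useful",
    or "not necessary" (illustrative labels, not a standard API).
    """
    n = len(ratings)
    if n == 0:
        raise ValueError("need at least one rating")
    n_essential = sum(1 for r in ratings if r == "essential")
    half = n / 2
    # +1 when every rater says essential, -1 when none do,
    # 0 when exactly half do.
    return (n_essential - half) / half


# Hypothetical panel of 10 raters judging one item.
panel = ["essential"] * 8 + ["useful"] + ["not necessary"]
print(content_validity_ratio(panel))  # 0.6
```

Values near +1 mean most raters consider the item essential; values at or below 0 mean half or fewer do.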

Page 10: Reliability or Validity

2) Construct validity

“A theoretical intangible”
“An informed, scientific idea”

-- how well the test measures that construct

Page 11: Reliability or Validity

Determining Construct validity

Behaviors related to constructs
Related/unrelated constructs
Identify relationships
Multitrait-multimethod

Page 12: Reliability or Validity

Multitrait-Multimethod Matrix

Correlate scores from 2 (or more) tests
Correlate scores obtained from 2 (or more) methods
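
As a rough sketch of how such a matrix is read, the code below correlates every pair of (trait, method) measures and labels each pair. Same trait measured by different methods should correlate highly (convergent evidence); different traits should correlate less (discriminant evidence). The traits, methods, and scores are hypothetical, and the helper names are assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pair_type(a, b):
    """Label a pair of distinct (trait, method) measures."""
    same_trait, same_method = a[0] == b[0], a[1] == b[1]
    if same_trait and not same_method:
        return "convergent (same trait, different method)"
    if same_method and not same_trait:
        return "discriminant (different trait, same method)"
    return "discriminant (different trait, different method)"


# Hypothetical scores for five people; keys are (trait, method).
scores = {
    ("depression", "self-report"): [10, 14, 9, 20, 17],
    ("depression", "clinician"): [11, 15, 8, 19, 18],
    ("anxiety", "self-report"): [12, 9, 15, 11, 14],
    ("anxiety", "clinician"): [13, 8, 16, 10, 15],
}

for a, b in combinations(scores, 2):
    r = pearson_r(scores[a], scores[b])
    print(f"{a} vs {b}: r = {r:.2f}  [{pair_type(a, b)}]")
```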


Page 13: Reliability or Validity

Evidence of Construct Validity

Upholds theoretical predictions
Changes (?) over time, gender, training
Homogeneity of questions (internal consistency, factor or item analysis)
Convergent/discriminant
Multitrait-multimethod matrix

Page 14: Reliability or Validity

Decision Making

How well the test can be used to help in decision making about a particular criterion.

Page 15: Reliability or Validity

Decision Theory

Base rate
Hit rate
Miss rate
False positive
False negative
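
These quantities can all be read off a 2x2 table of test decisions against actual status. The sketch below assumes one common textbook convention: base rate is the proportion of people who actually have the attribute, hit rate is the proportion of correct decisions, and miss rate is the proportion of incorrect ones (false positives plus false negatives). Other sources define some of these terms differently, and the counts are invented for illustration.

```python
def decision_rates(tp, fp, fn, tn):
    """Summarise a 2x2 table of test decisions vs. actual status.

    tp: test says yes, person has the attribute (true positive)
    fp: test says yes, person does not (false positive)
    fn: test says no, person does (false negative)
    tn: test says no, person does not (true negative)
    All returned values are proportions of the whole group.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("table is empty")
    return {
        "base_rate": (tp + fn) / total,
        "hit_rate": (tp + tn) / total,
        "miss_rate": (fp + fn) / total,
        "false_positives": fp / total,
        "false_negatives": fn / total,
    }


# Hypothetical counts for 100 applicants.
rates = decision_rates(tp=30, fp=10, fn=5, tn=55)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```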

Page 16: Reliability or Validity

3) Criterion Validity

“The relationship between performance on the test and on some other criterion.”

Page 17: Reliability or Validity

Validity coefficient

Correlation between test score and score on criterion measure.
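
A minimal sketch of that computation, assuming the coefficient is an ordinary Pearson correlation. The function name and the sample scores (an admissions test against later first-year GPA) are hypothetical.

```python
from math import sqrt


def validity_coefficient(test_scores, criterion_scores):
    """Pearson r between test scores and criterion scores."""
    if len(test_scores) != len(criterion_scores):
        raise ValueError("need one criterion score per test score")
    n = len(test_scores)
    mean_t = sum(test_scores) / n
    mean_c = sum(criterion_scores) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum(
        (t - mean_t) * (c - mean_c)
        for t, c in zip(test_scores, criterion_scores)
    )
    ss_t = sum((t - mean_t) ** 2 for t in test_scores)
    ss_c = sum((c - mean_c) ** 2 for c in criterion_scores)
    return cross / sqrt(ss_t * ss_c)


# Hypothetical data: test scores and a criterion measured later.
test = [52, 61, 47, 70, 66, 58]
gpa = [2.8, 3.1, 2.5, 3.7, 3.3, 3.0]
print(round(validity_coefficient(test, gpa), 2))
```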

Page 18: Reliability or Validity

Two ways to establish Criterion Validity

A) Concurrent validity
B) Predictive validity

Page 19: Reliability or Validity

Determining Concurrent validity

Assess individuals on construct
Administer test to those low/high on construct
Correlate test scores with prior identification
Use test later to make decisions

Page 20: Reliability or Validity

Determining Predictive validity

Give test to group of people
Follow up group
Assess later
Review test scores
If scores correlate with later behavior, the test can be used later to make decisions

Page 21: Reliability or Validity

Incremental validity

Value of including more than one predictor
Based on multiple regression
What is added to prediction not present with previous measures?
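
For the two-predictor case, what is added can be sketched as the gain in R^2 when the new predictor joins the existing one. The code assumes the standard two-predictor formula R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2); the function name and the correlations are hypothetical.

```python
def incremental_validity(r_y1, r_y2, r_12):
    """Gain in R^2 from adding predictor 2 to predictor 1.

    r_y1: correlation of the existing predictor with the criterion
    r_y2: correlation of the new predictor with the criterion
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors are perfectly correlated")
    r2_both = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    # With one predictor, R^2 is just the squared correlation.
    return r2_both - r_y1**2


# Hypothetical correlations, chosen only for illustration.
print(round(incremental_validity(r_y1=0.5, r_y2=0.4, r_12=0.0), 2))  # 0.16
print(round(incremental_validity(r_y1=0.5, r_y2=0.4, r_12=0.6), 4))
```

The gain is largest when the new predictor is unrelated to the existing one and shrinks as the two predictors overlap.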

Page 22: Reliability or Validity

Expectancy data

Taylor-Russell Table
Naylor-Shine Tables
Too vague, outdated, biased

Page 23: Reliability or Validity

Unified Validity - Messick

“Validity is not a property of the test, but rather the meaning of the scores.”

Value implications
Relevance and utility

Page 24: Reliability or Validity

Unitarian considerations

Content
Construct
Criterion
Consequences

Page 25: Reliability or Validity

Threats to validity

Construct underrepresentation (too narrow)
Construct-irrelevant variance (too broad)
  construct-irrelevant difficulty
  construct-irrelevant easiness

Page 26: Reliability or Validity

Example 1

Dr. Heidi considers using the Scranton Depression Inventory to help identify severity of depression and especially to distinguish depression from anxiety. What evidence should Dr. Heidi use to determine if the test does what she hopes it will do?

Page 27: Reliability or Validity

Example 2

The newly published Diagnostic Wonder Test promises to identify children with a mathematics learning disability. How will we know whether the test does so or is simply a slickly packaged general ability test?

Page 28: Reliability or Validity

Example 3

Ivy College uses the Western Admissions Test (WAT) to select applicants who should be successful in their studies. What type of evidence should we seek to determine if the WAT satisfies its purpose?

Page 29: Reliability or Validity

Example 4

Mike is reviewing a narrative report of his scores on the Nifty Personality Questionnaire (NPQ). The report says he is exceptionally introverted and unusually curious about the world around him. Can Mike have any confidence in these statements or should they be dismissed as equivalent to palm readings at the county fair?

Page 30: Reliability or Validity

Example 5

A school system wants to use an achievement battery that will measure the extent to which students are learning the curriculum specified by the school. How should the school system proceed in reviewing the available achievement tests?

Page 31: Reliability or Validity

Example 6

Super Sun Computers needs to hire three new employees. They have decided to administer the Computer Skills Assessment (CSA) to their applicants and use the results as the basis of their decision. How can they determine if that measure is a good fit for their hiring practice?

Page 32: Reliability or Validity

Project homework question

What content or construct is your measure assessing? (Explain your answer.)
What do you think convergent and discriminant constructs would be to the one in your measure?
How would you determine the content or construct validity of your measure?
How would you determine the criterion validity of your measure?
Why would you use those approaches?

Page 33: Reliability or Validity

Project homework question

Select a standardized instrument from MMY to use as a comparison for your measure.
Copy the relevant data.
Why did you select that instrument?
How would you use it to help standardize your measure?
