Validity
It is “the degree to which a certain inference from a test is appropriate or meaningful” (Drummond, 2000).
It is the extent to which a test does the job desired of it; the evidence may be either empirical or logical (Lyman, 1991).
It is the extent to which a test measures what it is supposed to measure (Murphy & Davidshofer, 1998).
Type: Content
Purpose: To determine whether the test items match the set of goals and objectives
Procedure: Compare the test blueprint with the school, course, or program objectives. Use a panel of experts in the content area (e.g., teachers, professors)
Types of tests: Survey achievement tests, criterion-referenced tests, examinations
Type: Criterion (predictive)
Purpose: To determine whether there is a relationship between a test and a criterion measure to be obtained in the future
Procedure: Correlate test scores with a criterion measure obtained after a period of time
Types of tests: Scholastic aptitude tests, general aptitude batteries, prognostic tests, readiness tests, personality tests
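The predictive procedure above (correlating test scores with a criterion measure obtained later) can be sketched in Python. This is only an illustration: the function name `pearson_r` and the score lists are hypothetical, and the Pearson product-moment correlation is assumed as the correlation index, which the slides do not specify.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    cross_products = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross_products / sqrt(ss_x * ss_y)

# Hypothetical data: aptitude test scores at admission, and the
# criterion measure (grade average) obtained one year later.
aptitude_scores = [52, 61, 70, 48, 66, 75]
later_grades = [2.4, 2.9, 3.3, 2.1, 3.0, 3.6]

validity_coefficient = pearson_r(aptitude_scores, later_grades)
print(round(validity_coefficient, 2))
```

A coefficient near 1 would suggest that the test ranks examinees much as the later criterion does; a coefficient near 0 would suggest little predictive value.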
Type: Construct
Purpose: To determine whether a construct exists and to understand the traits or concepts that make up the set of scores or items
Procedure: Conduct multivariate statistical analysis, discriminant analysis, multivariate analysis of variance
Types of tests: Intelligence tests, aptitude tests, personality tests
Reliability
It refers to the degree to which test scores are consistent, dependable, or repeatable; it is a function of the degree to which test scores are free from error (Drummond, 2000).
It refers to the consistency of test scores obtained by the same persons when reexamined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions (Anastasi and Urbina, 1997).
The concept of reliability underlies the computation of the error of measurement of a single score, whereby we can predict the range of fluctuation likely to occur in a single individual’s score as a result of irrelevant chance factors.
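As a rough illustration of how reliability feeds the error of measurement, the sketch below uses the conventional formula SEM = SD × √(1 − r). The function names, the score scale (SD of 15), and the reliability value are hypothetical choices for the example, not figures from the slides.

```python
from math import sqrt

def standard_error_of_measurement(score_sd, reliability):
    """SEM = SD * sqrt(1 - reliability), for a reliability between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * sqrt(1.0 - reliability)

def score_band(observed_score, score_sd, reliability, z=1.0):
    """Range of likely fluctuation around an observed score (z SEMs each way)."""
    sem = standard_error_of_measurement(score_sd, reliability)
    return observed_score - z * sem, observed_score + z * sem

# Hypothetical test: standard deviation of 15, reliability coefficient of .91
sem = standard_error_of_measurement(15, 0.91)
low, high = score_band(100, 15, 0.91)
print(round(sem, 2), round(low, 1), round(high, 1))
```

The higher the reliability coefficient, the narrower the band within which an individual's score is expected to fluctuate by chance.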
The other concept of reliability refers to the internal consistency of a test, which is estimated from the number of items in the test and the average intercorrelation among all the items.
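One common way to combine the number of items and the average inter-item correlation is the standardized alpha (the generalized Spearman-Brown form), k·r̄ / (1 + (k − 1)·r̄). The sketch below assumes that formula; the slides do not name a specific coefficient, and the function name and values are hypothetical.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Internal consistency from item count and average inter-item correlation.

    Uses k * r / (1 + (k - 1) * r), the standardized (Spearman-Brown) form.
    """
    if n_items < 2:
        raise ValueError("need at least two items")
    k, r = n_items, mean_inter_item_r
    return (k * r) / (1 + (k - 1) * r)

# Hypothetical tests with the same average inter-item correlation (.30)
# but different lengths: the longer test yields the higher estimate.
short_test = standardized_alpha(10, 0.30)
long_test = standardized_alpha(20, 0.30)
print(round(short_test, 3), round(long_test, 3))
```

The comparison shows why both ingredients matter: with the average intercorrelation held fixed, adding items raises the estimate.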
Method: Test-retest
Procedure: Same test given twice, with a time interval between testings
Coefficient: Stability
Problems: Memory effect; practice effect; change over time

Method: Alternate forms
Procedure: Equivalent tests given, with time between testings
Coefficient: Equivalence and stability
Problems: Hard to develop two equivalent tests; may reflect change in behavior over time
Method: Internal consistency
Procedure: One test given at one time only (the test is divided into parts in split-half)
Coefficient: Equivalence and internal consistency
Problems: Uses shortened forms (split-half); only good if traits are unitary or homogeneous; gives a high estimate on a speeded test; hard to compute by hand
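The split-half procedure can be sketched as follows. The odd-even split, the Spearman-Brown step-up (2r / (1 + r), a standard correction for the shortened half-tests), the function names, and the response data are all assumptions made for illustration; the slides do not prescribe a particular split or correction.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("half-test scores must vary across examinees")
    return sum(a * b for a, b in zip(dev_x, dev_y)) / sqrt(ss_x * ss_y)

def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per examinee."""
    # Odd-even split: items 1, 3, 5, ... versus items 2, 4, 6, ...
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown step-up from half-length to full-length reliability
    return 2 * r_half / (1 + r_half)

# Hypothetical responses (1 = correct, 0 = incorrect):
# five examinees, six items, one administration.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
reliability = split_half_reliability(responses)
print(round(reliability, 2))
```

Because each half is only half as long as the full test, the raw correlation between halves understates full-test reliability, which is why the step-up is applied here.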