Johnson Reading Mastery Tests-Revised (WRMT-R).doc

Generic Test Analysis Template

Hayward, Stewart, Phillips, Norris, & Lovell

Test Interpretation:

Scoring is facilitated by the ASSIST computer program on CD-ROM (available for Macintosh or Windows), which allows the examiner to enter raw scores that are then converted to a statistical profile for the student. In addition to Standard Scores, NCEs, age and grade equivalents, and percentile ranks, ASSIST provides Relative Performance Indexes, Confidence Bands at the 68% and 90% levels, Grade Equivalent and Standard Score/Percentile Rank profiles, Aptitude-Achievement Discrepancy Analysis, and a Narrative Report. Examples from the 1989 version are used to assist the examiner. A discrepancy analysis can be performed with the CD. Diagnostic profiles allow comparison of WRMT-R results with the Goldman-Fristoe-Woodcock Sound-Symbol Tests and the Woodcock-Johnson Psychoeducational Battery.
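The 68% and 90% confidence bands and the NCEs listed above can be illustrated with a short calculation. The following Python sketch is not ASSIST's actual procedure, which is not described in the sources reviewed here. It assumes a standard-score scale with mean 100 and SD 15, the textbook standard error of measurement formula SEM = SD * sqrt(1 - reliability), and the standard NCE definition 50 + 21.06 * z. The function names and the example score are hypothetical; the reliability of .91 is the median split-half value reported under Reliability.

```python
import math
from statistics import NormalDist

# z multipliers for two-sided bands under a normal error model (assumption)
Z_68 = 1.0      # about 68% of a normal distribution lies within +/- 1 SEM
Z_90 = 1.645    # about 90% lies within +/- 1.645 SEM


def sem(sd, reliability):
    """Textbook standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_band(score, sd, reliability, z):
    """Return (lower, upper) limits of the band score +/- z * SEM."""
    half_width = z * sem(sd, reliability)
    return (score - half_width, score + half_width)


def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile_rank / 100.0)
    return 50.0 + 21.06 * z


# Hypothetical example: standard score 100, assumed SD 15, reliability .91
low68, high68 = confidence_band(100, 15, 0.91, Z_68)
low90, high90 = confidence_band(100, 15, 0.91, Z_90)
print(round(low68, 1), round(high68, 1))
print(round(low90, 1), round(high90, 1))
print(round(percentile_to_nce(50), 1))
```

With these assumed values the SEM is 4.5 standard-score points, so the 68% band spans roughly 9 points and the 90% band roughly 15 points. A lower reliability, such as the .68 at the bottom of the reported range, would widen both bands considerably.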

No normative values are attached to the checklist scores but the author notes that the checklist is a diagnostic tool (see Form G test manual instruction section for the Supplemental Checklist). The author states that the checklist is intended to be used in instructional planning.

Comment: Buros reviewers cite several concerns regarding interpretation. The smaller norm sample means that we should be cautious, particularly in interpreting scores for children who were underrepresented in the sample (e.g., those from urban centres). The reviewers also note that the presence of new and old norms in the same manual and norm table is "misleading at best" (Crocker & Murray Ward, 2001, p. 1371).

Standardization: Age equivalent scores; Grade equivalent scores; Percentiles; Standard scores

Other (Please Specify): The available clusters are the Readiness Cluster (Form G only; consists of Visual-Auditory Learning and Letter Identification), Basic Skills Cluster (Word Identification and Word Attack), Reading Comprehension Cluster (Word Comprehension and Passage Comprehension), Total Reading Full Scale (Word Identification, Word Comprehension, Passage Comprehension), and Total Reading Short Scale (Word Identification and Passage Comprehension). NCEs, Relative Performance Indexes, and instructional ranges are also provided.

Reliability:

Important note: The reliability reported refers only to the 1989 revision. No updated reliability data for the new norm sample were provided.

Internal consistency of items: The split-half median was .91 (range .68-.98); coefficients were also reported for Clusters (median = .95, range .87-.98) and Total (median = .97, range .86-.99).
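Split-half coefficients such as these are conventionally obtained by correlating scores on two halves of a test and stepping the correlation up to full test length with the Spearman-Brown formula. The manual's exact procedure is not described here, so the following Python sketch is only an illustration under that assumption. The odd/even split, the function names, and the response data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full test."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per examinee, one 0/1 entry per item.

    Items 1, 3, 5, ... form one half and items 2, 4, 6, ... the other.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson(odd_half, even_half))


# Hypothetical responses: five examinees, six items (1 = correct)
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
print(round(split_half_reliability(responses), 2))
```

Under this formula, a half-test correlation of about .835 corresponds to a full-length estimate of about .91, the median reported above. The estimate describes the sample it was computed on, which is why coefficients from the 1989 sample cannot simply be assumed to hold for the new norm sample.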

Test-retest: No information found. The technical information on the Pearson site reports none.

Inter-rater: No information found. Pearson reports none.

Other (Please Specify):

Comment: I think that it is misleading to mix the 1989 reliability data in with the re-normed version. I'm not sure how doing so affects the psychometric basis.

Validity:

Important note: The validity information provided in the manual is from the previous 1989 revision and is unrelated to the 1997 norms. Also, because assorted tests and subtests were used in norming/equating, questions arise about the underlying trait being measured. No attempt was evident to make sure that the various authors were operationally defining their terms equivalently. The Buros reviewer states that this update still does not address a number of validity and interpretation problems cited in previous reviews (Crocker & Murray Ward, 2001, p. 1371). Specifically, the reviewer points out that the content domain remains undescribed. Also, as in the previous revision, the sources for the selection of items were not provided, nor was there any indication of a rationale for the words or skills provided.

The Buros reviews of the 1989 revised WRMT (Cooter & Jaeger, 1989) brought forth reliability and validity issues. I understand from the current reviews that these issues remain, as the WRMT-R NU is a re-normed version only.

Content: Content validity, as it refers to WRMT-R, was developed with contributions from outside experts, including experienced teachers and curriculum specialists (Woodcock, 1998, p. 97). However, unlike other manuals reviewed for the TELL Project, Woodcock does not provide references in the manual alongside his statement.

Criterion Prediction (concurrent) Validity: Correlations between the WRMT-R and WJ reading tests were reported for children in Grades 1, 3, 5, and 8 across subtests and total reading scores. Correlations range from a low of .39 (Passage Comprehension) to a high of .91 (Full Scale Total Reading). A 1978 study reported correlations between the 1973 WRMT and the Iowa Test of Basic Skills, Iowa Tests of Educational Development (total reading), PIAT Reading, WJ Reading Achievement, and WRAT Reading, ranging from .79 to .92. The author justifies including these data: "Although these results are based on the 1973 WRMT, they are reported in this revision because the psychometric characteristics of the original WRMT (1973) and the WRMT-R are so similar that many generalizations from one to the other can be validly made" (Woodcock, 1998, p. 100).

Construct Identification Validity: Test and Cluster Intercorrelations: Since tests were clustered to target readiness and skill areas, correlations are reported for subtests within clusters as well as for clusters overall. Subtests and clusters presented predictable correlations. The correlations range from low (.35 at Grade 3 for Letter Identification/Visual-Auditory Learning) to high (.98 for Total Reading Short Scale/Total Reading Full Scale).

Comment: Though all the data are presented in Table 5.6 (Woodcock, 1998, p. 97) for Grades 1, 3, 5, 8, 11 and College as well as Adult, I would have thought it appropriate that the author offer some comments or interpretation rather than leave that task to the reader.

Differential Item Functioning: Classical and Rasch models were used in item development and selection, though the details are unclear from the statement on page 97. Woodcock states, "both contributed to the stringent statistical criteria employed during the process of item selection in the WRMT-R" (Woodcock, 1998, p. 97).

Other (Please Specify): No studies investigating predictive validity with special education populations were reported, though children with special needs did participate in norming.

Comment: With no predictive validity studies, we have little if anything on which to base placement decisions. This is a serious concern, since this test is widely used by school districts for that very purpose and the author promotes this use. The Buros reviewer states: "Scores for special education students should be used cautiously. Although the author did include special education and gifted students in the norm sample, matching their prevalence in the general population, their actual numbers are quite small. In addition, there are still no predictive validity studies to validate the WRMT-R NU with this population; a serious omission because this test is frequently used in placement and re-evaluation" (Crocker & Murray Ward, 2001, p. 1372).

Summary/Conclusions/Observations:

The Buros reviewers make important comments:

"... three interpretation issues arise. First, the use of the norms generated from the smaller norm sample means that interpretations are limited ... Second, scores for special education students should be used cautiously ... [Third,] the author states that comparisons of old and new norm data clearly show a pattern of lower performances and higher standard and percentile scores of lower achievers. This effect could result in overestimation of students' reading levels. Thus, students might not receive appropriate services or services may be terminated prematurely. Interestingly, there are no cautions to examiners to readjust score referents to account for these changes" (Crocker & Murray Ward, 2001, p. 1372).

"It should also be remembered that no changes have been made in test skills or items, and there is no stipulation that the other measures clarify the meaning of the WRMT-R NU scores. In conclusion, the WRMT-R/NU is a limited norms update. The test still contains many test items and scores, but does not address problems identified by previous reviewers. Furthermore, the renorming has narrowed the utility of the test. Therefore, the WRMT-R/NU should be used in conjunction with other measures of reading. Results should not be overinterpreted. The examiner should also be very cautious in using the test with a wide range of age groups. If these cautions are observed, the test may be useful in helping estimate reading achievement" (Crocker & Murray Ward, 2001, p. 1372).

Clinical/Diagnostic Usefulness: Based on the critiques available, I think that this test has limited clinical utility and, if used, should only be used as an adjunct to more rigorous and contemporary reading tests. I would be very cautious about using this test's results to make important decisions about eligibility and intervention, though the author intends for the test to be used in this way. This test has a long history. It is probably widely used and likely to be well embedded in assessment protocols and funding structures. I wonder if educators would give much thought to its selection, or to the consideration of more recent tests as alternatives. Several generations of teachers would be familiar with it. A quote from the Pearson website: "I have used the Woodcock Reading Mastery Tests for almost 30 years now . . . I believe it has great value in diagnosing reading difficulties and providing a basis for me to write a prescription for remedying reading difficulties" (Dr. Dianne M. Haneke, Professor of Literacy Education, retired).

References

Cooter, R. B., & Jaeger, R. M. (1989). Test review of the Woodcock Reading Mastery Tests-Revised. In J. C. Conoley & J. J. Kramer (Eds.), The tenth mental measurements yearbook (pp. 909-916). Lincoln, NE: Buros Institute of Mental Measurements.

Crocker, L., & Murray Ward, M. (2001). Test review of the Woodcock Reading Mastery Tests-Revised 1998 Normative Update. In B. S. Plake & J. C. Impara (Eds.), The fourteenth mental measurements yearbook (pp. 1369-1373). Lincoln, NE: Buros Institute of Mental Measurements.

Current Population Survey, March, 1994 [Machine readable data file]. (1994). Washington, DC: Bureau of the Census (Producer and Distributor).

Pearson Assessments (2007). Speech and language forum. Retrieved May 31, 2008 from http://www.SpeechandLanguage.com

Woodcock, R. W. (1998). Woodcock reading mastery tests-Revised NU: Examiner's manual. Circle Pines, MN: American Guidance Service.

To cite this document:

Hayward, D. V., Stewart, G. E., Phillips, L. M., Norris, S. P., & Lovell, M. A. (2008). Test review: Woodcock reading mastery tests-revised (NU normative update) (WRMT-R). Language, Phonological Awareness, and Reading Test Directory (pp. 1-8). Edmonton, AB: Canadian Centre for Research on Literacy. Retrieved [insert date] from http://www.uofaweb.ualberta.ca/elementaryed/ccrl.cfm.
