Bayley Review

Test Review: Bayley, N. (2006). Bayley Scales of Infant and Toddler Development–Third Edition. San Antonio, TX: Harcourt Assessment
Craig A. Albers and Adam J. Grieve
Journal of Psychoeducational Assessment 2007; 25; 180
DOI: 10.1177/0734282906297199
The online version of this article can be found at: http://jpa.sagepub.com
Published by: http://www.sagepublications.com


Test Reviews

Bayley, N. (2006). Bayley Scales of Infant and Toddler Development–Third Edition. San Antonio, TX: Harcourt Assessment. DOI: 10.1177/0734282906297199

The Bayley Scales of Infant and Toddler Development–Third Edition (Bayley-III) is a revision of the frequently used and well-known Bayley Scales of Infant Development–Second Edition (BSID-II; Bayley, 1993). Like its prior editions, the Bayley-III is an individually administered instrument designed to measure the developmental functioning of infants and toddlers. Other specific purposes of the Bayley-III are to identify possible developmental delay, inform professionals about specific areas of strength or weakness when planning a comprehensive intervention, and provide a method of monitoring a child’s developmental progress. The Bayley-III is appropriate for administration to children between the ages of 1 month and 42 months (although norms extend downward to age 16 days). The revision of the Bayley was specifically driven by eight goals: (a) update the normative data, (b) develop additional scales to fulfill requirements by federal (i.e., the Individuals with Disabilities Education Improvement Act of 2004) and state laws regarding the five major areas of development for early childhood assessment from birth through 3 years of age, (c) strengthen the instrument’s psychometric properties, (d) improve the treatment utility of the instrument, (e) simplify administration procedures, (f) update item administration, (g) update administration materials, and (h) maintain the qualities of previous Bayley editions (Bayley, 2006b).

Description of the Bayley-III

Scales

The most significant revision to the Bayley-III is the development of five distinct scales (as compared to three scales in the BSID-II) to be consistent with areas of appropriate developmental assessment for children from birth to age 3. Whereas the BSID-II provided Mental, Motor, and Behavior scales, the Bayley-III revision includes Cognitive, Language, Motor, Social-Emotional, and Adaptive Behavior scales.

Cognitive. The Cognitive scale of the Bayley-III contains 72 out of the 178 items that were previously included in the Mental scale of the BSID-II. Additionally, 19 new items were added to the Cognitive scale, resulting in a total of 91 items. Forty-five of the items from the Mental scale were completely removed from the Bayley-III, whereas the remaining items either remained the same or were slightly modified and moved to a different scale (i.e., 27 items were moved to the Expressive Communication subtest of the Language scale, 23 items were moved to the Fine Motor subtest of the Motor scale, and 11 items were moved to the Receptive Communication subtest of the Language scale).
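The item counts reported for the Cognitive scale can be checked with simple arithmetic. The short Python tally below is only a bookkeeping sketch: the variable names are hypothetical, and every number is taken directly from the paragraph above.

```python
# Item counts as reported in the review for the 178-item BSID-II Mental scale.
MENTAL_SCALE_TOTAL = 178
retained_in_cognitive = 72   # items kept in the Bayley-III Cognitive scale
removed = 45                 # items dropped from the Bayley-III entirely
moved = {                    # items moved to other Bayley-III subtests
    "Expressive Communication": 27,
    "Fine Motor": 23,
    "Receptive Communication": 11,
}
new_cognitive_items = 19

# Every former Mental scale item should be retained, removed, or moved.
accounted_for = retained_in_cognitive + removed + sum(moved.values())
cognitive_total = retained_in_cognitive + new_cognitive_items

print(accounted_for)    # 178
print(cognitive_total)  # 91
```

The tally shows that the three reported categories account for all 178 Mental scale items and that the retained and new items sum to the 91-item Cognitive scale.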

Language. Recognizing the significance of assessing a child’s language development, the Bayley-III added a Language scale consisting of Receptive and Expressive Communication subtests. Items in the Receptive Communication subtest are designed to provide information regarding the child’s auditory acuity and ability to understand and respond to verbal stimuli. This subtest includes 11 items (some slightly modified) from the BSID-II Mental scale and an additional 38 new items. The Expressive Communication subtest assesses the individual’s ability to vocalize, name pictures and objects, and communicate with others. This subtest contains 27 items from the BSID-II (some slightly modified) and 21 new items.

Journal of Psychoeducational Assessment, Volume 25, Number 2, June 2007, 180-198. © 2007 Sage Publications. http://jpa.sagepub.com, hosted at http://online.sagepub.com

Motor. The Bayley-III Motor scale, consisting of Fine Motor and Gross Motor subtests, is similar to the Motor scale of the BSID-II. The Fine Motor subtest contains 66 items (18 items are new) and is purported to measure skills associated with eye movements, perceptual-motor integration, motor planning, and motor speed. The Gross Motor subtest contains 72 items (4 items are new) and is designed to measure movements of the limbs and torso.

Social-Emotional. The Behavior Rating scale in the BSID-II was replaced by the Greenspan Social-Emotional Growth Chart: A Screening Questionnaire for Infants and Young Children (Greenspan, 2004), which is intended to be completed by the child’s primary caregiver. For each of the 35 items, which measure emotional development and related behaviors, the respondent selects one of six ratings: 0 (can’t tell), 1 (none of the time), 2 (some of the time), 3 (half of the time), 4 (most of the time), or 5 (all of the time).

Adaptive Behavior. A significant addition to the Bayley-III is the inclusion of the Adaptive Behavior Assessment System–Second Edition (ABAS-II; Harrison & Oakland, 2003; see Burns, Meikamp, & Suppa, 2005, and Rust & Wallace, 2004, for reviews of the ABAS-II) as a measure of adaptive skills. By having the child’s primary caregiver complete the ABAS-II, estimates of the child’s functioning in the areas of Communication, Community Use, Health and Safety, Leisure, Self-Care, Self-Direction, Functional Pre-Academics, Home Living, Social, and Motor can be obtained. (Children younger than 1 year do not receive scores in the areas of Community Use, Functional Pre-Academics, or Home Living.) Within the ABAS-II, caregivers indicate the extent to which the child performs the adaptive skills when needed. Response options include 0 (is not able), 1 (never when needed), 2 (sometimes when needed), or 3 (always when needed). The inclusion of the ABAS-II facilitates a more comprehensive assessment as caregivers are more involved in completing the ABAS-II than they would be in completing the BSID-II.

Materials

Whereas many of the Bayley-III stimulus materials will look familiar to a user of the BSID-II, the current edition contains additional items such as a bank, a bear, a bracelet, a connecting block set, a lacing card, memory cards, a set of seven ducks, and a wider stepping path. Some notable items not contained in this edition include the map, sugar pellets, jumping rope, pull toy, and separate visual stimulus cards. Examiners must provide more materials than were required for a BSID-II administration, including facial tissue, five small coins, food pellets, several blank 3 × 5 in. index cards, safety scissors, and blank unlined white paper. The stimulus book of this edition is more user-friendly in that it contains a built-in easel that folds low to the table, which aids in assessing an examinee’s responses and allows for ease in switching between tasks. A wider stepping path is included and is considered an improvement over the previous edition, providing a more developmentally appropriate guide for assessing gross motor skills. As with the BSID-II, the stimulus materials of this edition are bright, colorful, and engaging for infants and toddlers. The test kit is less bulky and more portable than its predecessor, with all stimulus items and manuals fitting into a suitcase with wheels and a pull handle. However, the kit does not contain the plastic dividing sections for the stimulus materials, and some examiners may find it more difficult to navigate through the materials efficiently.

The Bayley-III includes the option of using Windows-based scoring software and a PDA administration product. This allows an examiner to administer and score the Bayley-III with an electronic handheld device and eliminates the need for a record form and manual for administration. Although this option may increase efficiency and decrease bulkiness, examiners should be very familiar with the operation of the software so that standardized administration is not violated.

Administration and Scoring Procedures

Examiners who administer the Bayley-III should be familiar with and have training in developmental assessment and interpretation. The age range for which the measure is designed requires that the examiner have the ability to establish and maintain rapport with infants, toddlers, and caregivers. Because of these factors, examiners should have completed relevant graduate training or professional experiences that include formal individual assessment preparation and supervision so that the measures can be administered consistent with the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999). To gain an accurate impression of an infant or toddler’s optimal performance and to avoid negative behavioral reactions to separation, a caregiver (generally a parent) is encouraged to remain in the testing room for the duration of the Bayley-III administration. However, caregivers should not encourage, influence, or interfere with item administration to the point that standardization procedures are violated.

Administration times range from approximately 50 min for children aged 12 months and younger to 90 min for children aged 13 months and older. Consistent with the BSID-II, the examinee’s chronological age (adjusted for prematurity if necessary) corresponds to a starting point, designated by a letter A through Q. This letter should be used to determine the starting item for the Cognitive, Language, and Motor scales. Each scale has an identical requirement for establishing basal and ceiling levels: The first three items administered must be correct (examinees receive credit for unadministered items below the basal), and scoring of the scale should discontinue when the examinee receives no credit on five consecutive items. In the event that a basal is not established with the first three items administered, the examiner must reverse to the previous starting point and continue administration until the ceiling criterion is met. Although it may be necessary to reverse to an earlier starting point on one scale, the examiner should use the original age-determined starting letter to determine the starting point for subsequent scales. No items should be readministered during the course of a testing session; however, if a correct response was not initially elicited but is observed later in a session (e.g., during the Gross Motor subtest or some items on the Expressive Communication subtest), some items may then be scored as correct. Scoring for every item is either 1 (credit) or 0 (no credit). The item scoring in this edition is more straightforward and manageable than in the previous edition, and it allows for a more efficient method of calculating a total raw score. Additionally, examiners should be aware of specific behaviors that are indicative of delayed or atypical development (referred to as developmental risk indicators) within the areas of social behavior, attention, motor and movement, hearing, and vision. Explained in detail in the technical manual (Bayley, 2006b), these indicators suggest the need for additional assessment.
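The basal and ceiling rules described above can be expressed as a short routine. The following Python sketch is illustrative only and is not taken from the Bayley-III manuals: the function name, the 0-based item indexing, and the example start points are hypothetical, and the sketch assumes that the three-item basal rule is reapplied at each earlier starting point, which the description above does not state explicitly.

```python
def raw_score(responses, start_points, start_idx):
    """Compute a total raw score under the basal and ceiling rules.

    responses    -- list of 1 (credit) / 0 (no credit), one entry per item,
                    representing what the child would earn if administered
    start_points -- ascending 0-based item indexes of the start points (A, B, ...)
    start_idx    -- position in start_points of the age-determined start point
    """
    # Basal: the first three items administered must all be credited.
    # If not, reverse to the previous start point (assumed to repeat as needed).
    k = start_idx
    while k > 0 and not all(responses[start_points[k]:start_points[k] + 3]):
        k -= 1
    start = start_points[k]

    score = start          # credit for unadministered items below the basal
    consecutive_misses = 0
    for credit in responses[start:]:
        if credit:
            score += 1
            consecutive_misses = 0
        else:
            consecutive_misses += 1
            if consecutive_misses == 5:   # ceiling: five consecutive 0s
                break
    return score


# Hypothetical protocol: the child fails the second item at the age-based
# start point (item index 10), so the examiner reverses to index 5.
responses = [1] * 10 + [1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1]
print(raw_score(responses, start_points=[0, 5, 10, 15], start_idx=2))  # 15
```

In this example the final credited item (index 22) falls beyond the ceiling and therefore does not contribute to the raw score.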

The examiner’s record form contains items for the Cognitive, Language, and Motor scales, with a separate questionnaire that contains items for both the Social-Emotional and Adaptive Behavior scales. This questionnaire is to be completed by the primary caregiver. The examiner’s record form provides item titles, materials needed for each item, scoring criteria, and space for noting additional comments about an examinee’s responses. Similar to the BSID-II, some items are part of a series that use the same materials, and the examinee can demonstrate varying levels of proficiency. For example, for the pegboard series, Item 47 requires the child to place at least one peg two or more times in the same or different hole or holes. The pegboard should also then be used to administer Item 55, in which the child receives credit for placing all six pegs in the pegboard within 70 s. Examiners should score such series items concurrently, so that it is not necessary to switch away from and back to a stimulus material. However, a child may pass a series item that falls beyond an established ceiling. In this case, the series item should not be included in the total raw score, but it should be qualitatively noted. Series items are specified as such, and the additional items contained in the series are specified on the far left side of each page of the record form. The record form is quite colorful, with each color corresponding to a specific scale. The colors not only are aesthetically pleasing but also function to separate the Cognitive, Language, and Motor scales for the examiner who may need to switch between scales while administering the complete test. Item materials and scoring criteria are specified in the record form, but examiners should closely reference the administration manual (Bayley, 2006a). The administration manual provides clear guidelines for item instructions, stimulus layout, child positioning, and so on. The format for referencing the administration manual in this edition is similar to that in the BSID-II, but pictures of the necessary item materials are not provided with the corresponding items in the administration section of the manual. Therefore, examiners must be able to differentiate the proper stimuli without the aid of a picture included with the administration procedures. The manual does, however, provide a page with pictures of the test items and their corresponding names, so that an examiner can familiarize himself or herself with the names of the stimulus materials prior to administering the test. One notable improvement in the Bayley-III administration manual is that it is ring bound, which allows examiners to remain on a desired page without concern that the manual might accidentally close. Although the length of the administration section may seem daunting, the instructions to the examiner are necessary for proper administration of this measure, and the apparent complexity reduces with increased familiarity with and experience in administering the Bayley-III to infants and toddlers of various ages.


A variety of scores across scales and subtests are available. Raw scores from the Cognitive scale, which does not contain separate subtests, can be converted to a scaled score (M = 10, SD = 3), which can then additionally be converted to a composite score equivalent (M = 100, SD = 15). Scaled scores are available for the Receptive and Expressive Communication subtests of the Language scale, which when combined form the Language scale composite score (M = 100, SD = 15). The same procedure holds for the Fine and Gross Motor subtests of the Motor scale. Across all three of these primary domains, the normative sample is divided into 10-day increments (e.g., 2 months 6 days through 2 months 15 days). Raw scores for Cognitive, Language, and Motor subtests translate to scaled scores based on 10-day increments up to age 5 months 16 days, at which point norms are based on 1-month intervals (e.g., 5 months 16 days to 6 months 15 days, 35 months 16 days to 36 months 15 days). The highest two age ranges are normed on the basis of 3-month intervals (36 months 16 days to 39 months 15 days). Thus, depending on the age of the child, normative scaled scores are derived on the basis of 10-day, 1-month, or 3-month intervals. Percentile ranks, confidence intervals (90% and 95% levels), growth score equivalents, and developmental age scores in months and days are available.
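Because scaled scores (M = 10, SD = 3) and composite score equivalents (M = 100, SD = 15) are both standard-score metrics, the relationship between the two can be illustrated with a linear rescaling through z scores. The Python sketch below is an assumption made for illustration: the function name is hypothetical, the actual conversions are read from the norm tables in the manual, and the Language and Motor composites are derived from combined subtest scores rather than from a single scaled score.

```python
def scaled_to_composite(scaled):
    """Rescale a score on the M = 10, SD = 3 metric to the M = 100, SD = 15 metric."""
    z = (scaled - 10) / 3      # distance from the mean in SD units
    return 100 + 15 * z


print(scaled_to_composite(10))  # 100.0
print(scaled_to_composite(7))   # 85.0
print(scaled_to_composite(1))   # 55.0
print(scaled_to_composite(19))  # 145.0
```

Under this rescaling, scaled scores of 1 and 19 map to 55 and 145, which coincides with the Cognitive composite range reported in the Test Construction section of this review.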

The scoring for the Social-Emotional Scale is straightforward. The summed raw score is converted to a scaled score (M = 10, SD = 3), which can additionally be converted to a composite score equivalent (M = 100, SD = 15). The normative sample for the Social-Emotional domain is divided into nine age categories (by months: 0-3, 4-5, 6-9, 10-14, 15-18, 19-24, 25-30, and 31-40). A Sensory Processing score can also be calculated. The administration manual provides additional guidance regarding conducting supplemental analyses within this scale.

The Adaptive Behavior scale follows the scoring criteria of the ABAS-II. Raw scores for each of the 10 skill areas are converted to scaled scores (M = 10, SD = 3). From these scaled scores, a General Adaptive Composite (GAC) score (M = 100, SD = 15) can be obtained. Additional composite scores are available for a Conceptual Adaptive domain (Communication, Functional Pre-Academics, and Self-Direction skill areas), Social Adaptive domain (Leisure and Social skill areas), and Practical Adaptive domain (Community Use, Home Living, Health and Safety, and Self-Care skill areas). The normative sample for the Adaptive Behavior scale is in 1-month increments for children aged 11 months and younger, 2-month increments for children aged 13 months to 23 months, and 3-month increments for children aged 24 months to 42 months. Percentile ranks and confidence intervals (90% and 95% levels) are available to assist in interpretation.

Technical Adequacy

Test Construction

Bayley (2006b) describes the construction of the Bayley-III as being informed by the body of research in child development conducted since the publication of the BSID-II in 1993. However, many concepts of early cognition have been retained in child development and are still very much applicable to the current revision; these include play (e.g., Bruner, 1972; Piaget, 1952; Singer, 1973; Vygotsky, 1978), information processing (e.g., Bornstein & Sigman, 1986; Fagan, 1970), and number concepts (e.g., Gelman & Tucker, 1975; Wynn, 1990). Consequently, the Bayley-III retains such activities as pretend play, novelty preference, habituation, and number ordering. Given that these older concepts are combined with concepts derived from relatively recent studies of information processing and preverbal intelligence (e.g., Colombo & Frick, 1999; Dougherty & Haith, 1997; Kail, 2000; Schatz, Kramer, Ablin, & Matthay, 2000), it is clear that the Bayley-III is based on an eclectic theoretical foundation.

As indicated earlier, a significant number of items from the BSID-II were either removed or modified for the current version. Bayley’s (2006b) justification for these changes includes a desire to remove items that were (a) difficult to administer or score, (b) unpleasant for the child, (c) redundant with other items, (d) potentially biased toward a racial or ethnic group, and (e) lacking in value. To create new and suitable items, a comprehensive process was undertaken that included development and feedback by content experts to ensure appropriateness of the items, tryout phases, and numerous input points from experts in their respective areas. This process, which is outlined in the technical manual, included (a) a conceptual development phase, which included reviews by an advisory panel, measurement consultants, and international experts, followed by focus groups and surveys of experts and examiners; (b) a pilot phase; (c) a national tryout phase; (d) a minipilot; (e) a standardization phase; and (f) an assembly and evaluation phase.

Considering that the primary intent of the Bayley-III is to identify children experiencing developmental delay and not to specifically diagnose a disorder, the floor and ceiling of the subtest and total test appear to be adequate. As would be expected from an adaptive behavior measure (i.e., ABAS-II) that was developed independently of the Bayley-III, the floor for the Adaptive Behavior scale extends downward to a composite score of 40 (extending upwards to a score of 160), whereas the remaining Bayley-III floor composite scores are relatively higher (Cognitive, 55-145; Language, 47-153; Motor, 46-154; Social-Emotional, 55-145). One area that was not improved, however, is the subtest floor scores for the youngest children in the sample (i.e., those aged 16 to 25 days). Bell and Allen (2000) indicated that in the BSID-II, a 1-month-old child would need to receive only one raw score point to earn a standard score of 60 on the Mental scale. This same child would have earned a standard score of 65 on the Bayley-III Cognitive scale.

Standardization Sample

The standardization sample for the Cognitive, Language, and Motor scales included 1,700 children aged 1 month to 42 months, divided into 17 separate age groups, with 100 individuals in each group. This sample was reported to be representative of the October 2000 U.S. Bureau of the Census population survey data in terms of parent education level, race or ethnicity, and geographic region. Only children who were born at 36 to 42 weeks gestation and who were considered to be typically developing were included in the standardization sample, although children with mental, physical, or behavioral difficulties were later added to constitute approximately 10% of the total sample. The standardization sample for the Social-Emotional scale was collected during an earlier tryout phase of the Bayley-III and included 456 children. Although a relatively small sample, it appears to be sufficiently representative of the U.S. population. Finally, the standardization of the Adaptive Behavior scale (i.e., ABAS-II) occurred during the development and standardization of the ABAS-II, independently of the Bayley-III standardization process. According to information included in the Bayley-III technical manual, the standardization sample of the ABAS-II included 1,350 children aged 0 months through 71 months. To account for the extended age range, norms were truncated to reflect the 42-month age limit of the Bayley-III.

Reliability

Cognitive, Language, and Motor. Evidence for internal consistency reliability for the Cognitive, Language, and Motor composites and subtest scales was obtained on the normative sample using the split-half method corrected by the Spearman-Brown formula. It was not specified how test halves were divided. The average reliability coefficients were calculated using Fisher’s z transformation. Scale composite average reliability coefficients ranged from .91 (Cognitive) to .93 (Language), whereas subtest average reliability coefficients ranged from .86 (Fine Motor subtest) to .91 (Expressive Communication and Gross Motor subtests). Within specific subtests, the lowest reliability coefficients (e.g., .71) were obtained in the younger age groups (e.g., 1-5 months) within the Receptive and Expressive Communication subtests. The average reliability coefficients for the special groups included in the sample were all greater than .94. Test-retest stability was determined by readministering the Bayley-III to 197 children, who were tested on two occasions separated by anywhere from 2 to 15 days, with a mean retest interval of 6 days. Corrected correlation coefficients ranged from .67 (Fine Motor subtest) to .80 (Expressive Communication subtest) with the group aged 2 to 4 months and from .83 (Gross Motor subtest) to .94 (Expressive Communication subtest and Language composite) for the group aged 33 to 42 months. Across all ages, average stability coefficients were .80 or higher.
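The two statistical procedures named above, the Spearman-Brown correction of a split-half correlation and the averaging of coefficients through Fisher’s z transformation, can be sketched briefly. The Python functions below are a generic illustration under stated assumptions and are not the publisher’s computation: the function names are hypothetical, the input values are invented for the example, and the averaging is unweighted, whereas the technical manual may weight age groups differently.

```python
import math


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate of full-length reliability."""
    return 2 * r_half / (1 + r_half)


def fisher_z_mean(coefficients):
    """Average correlations by transforming to Fisher's z and back."""
    zs = [math.atanh(r) for r in coefficients]   # r -> z
    return math.tanh(sum(zs) / len(zs))          # mean z -> r


# Hypothetical half-test correlation of .80.
print(round(spearman_brown(0.80), 3))

# Hypothetical reliabilities for two age groups; the Fisher z average lies
# above the simple arithmetic mean of the two coefficients.
print(round(fisher_z_mean([0.71, 0.95]), 3))
```

Averaging in the z metric is commonly used because correlation coefficients are not on an interval scale, so a simple arithmetic mean tends to understate the typical coefficient when values are high.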

Social-Emotional. Reliability indices from the Greenspan Social-Emotional Growth Chart standardization process are included in the Bayley-III technical manual. Internal consistency was estimated using coefficient alpha, with coefficients ranging from .83 to .94 for social-emotional items and .76 to .91 for the sensory processing items. No stability or interrater reliability indices for the Greenspan Social-Emotional Growth Chart were provided.
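Coefficient alpha, the internal consistency index cited above, is computed from the item variances and the variance of the total score. The Python sketch below shows the standard formula on invented data; the function name and the ratings are hypothetical and do not come from the Greenspan standardization sample.

```python
from statistics import pvariance


def coefficient_alpha(ratings):
    """Coefficient alpha for a list of respondents, each a list of item ratings."""
    k = len(ratings[0])                                   # number of items
    item_variances = [pvariance([r[i] for r in ratings]) for i in range(k)]
    total_variance = pvariance([sum(r) for r in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical caregiver ratings (rows = children, columns = items).
ratings = [
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [2, 1, 2, 2],
    [1, 2, 1, 1],
]
print(round(coefficient_alpha(ratings), 2))
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances; it reaches exactly 1 when every respondent gives the same rating to all items.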

Adaptive Behavior. Evidence for internal consistency reliability for the Adaptive Behavior scale was obtained during the ABAS-II standardization process. Internal consistency was estimated using coefficient alpha, with average reliability coefficients being calculated using Fisher’s z transformation. Average reliability coefficients across each of the skill areas, adaptive domains, and the GAC ranged from .79 to .98. Test-retest stability was estimated using a sample of 207 children, with intervals ranging from 2 days to 5 weeks (M = 12 days). The mean stability coefficients for the GAC and Adaptive Behavior domains generally were .80 or higher, whereas coefficients were slightly lower for specific skill areas. Overall, stability increased as the age of the child increased. Interrater reliability was estimated with a sample of 56 children who were each rated by their two parents. The GAC interrater reliability coefficient was .82, adaptive domain coefficients averaged .79, and the adaptive skill area coefficients averaged .73.


Validity

Confirmatory factor analysis of the subtests of the Cognitive, Language, and Motor scales supported a three-factor model across all ages of the 1,700-child standardization sample, except for the youngest age group (0-6 months), in which a two-factor model was also supported. The technical manual suggests that the applicability of both a two-factor and a three-factor model to the youngest group is likely an indicator that language and cognition are undifferentiated at that age. No factor analysis data regarding the Social-Emotional or Adaptive Behavior scales are provided in the Bayley-III technical manual.

The technical manual also describes a series of validity-related studies conducted with other cognitive, intellectual, language, motor, social-emotional, and adaptive behavior measures. The correlation between the Bayley-III Cognitive composite and BSID-II Mental Index score was .60, which was also the correlation between the Motor composite scores on both measures. The correlation between the BSID-II Behavior Rating Scale and the Bayley-III Social-Emotional composite was only .38, which was attributed to the new format and items that were added. The relatively moderate correlations between other composite and subtest scores were attributed to new scoring criteria, clarifications of previous scoring ambiguities, and changes to the floor and ceiling.

Relatively high correlations were obtained between the Wechsler Preschool and Primary Scale of Intelligence–Third Edition (Wechsler, 2002) Verbal, Performance, and Full-Scale scores and the Bayley-III Cognitive (.72-.79) and Language composites (.71-.83). The Preschool Language Scale–Fourth Edition (Zimmerman, Steiner, & Pond, 2002) Auditory Comprehension and Expressive Communication subscales were moderately correlated with the Bayley-III Language composite (.51-.71). Moderate correlations were obtained between the Bayley-III Motor composite and the Peabody Developmental Motor Scales–Second Edition (Folio & Fewell, 2000) Motor quotients (.49-.57) and between the ABAS-II and the Vineland Adaptive Behavior Scales–Interview Edition (Sparrow, Balla, & Cicchetti, 1984) domain scores and composite score (.58-.70). The technical manual also details numerous special group studies, including studies examining the Bayley-III with children with Down syndrome, pervasive developmental disorders, cerebral palsy, specific language impairment, developmental delay, asphyxiation at birth, and prenatal alcohol exposure; children small for gestational age; and children born premature or with low birth weight.

Commentary and Recommendations

As part of a comprehensive evaluation, the Bayley-III appears to continue setting the standard for early childhood assessment, as the majority of the stated goals of the revision process appear to have been attained. The first goal was to update the normative data, which was accomplished with a representative standardization sample of 1,700 children. The second goal was to develop additional scales so that the areas of cognitive, communication, physical, social-emotional, and adaptive behavior development are examined. In this regard, the Bayley-III contains some necessary and needed improvements over the BSID-II. Specifically, the inclusion of a separate Language scale provides helpful information to a professional in assessing the development of a child. Although it does not completely remove the demand for language in the Cognitive scale, this edition does a commendable job of separating cognitive ability from receptive and expressive language development. The addition of the ABAS-II enhances the quality of information provided by the Bayley-III and also increases the role of the primary caregiver in the assessment process. Finally, specification of fine and gross motor assessment can be more useful to a professional in formulating intervention strategies than was the inclusive Motor scale of the BSID-II.

The third goal was to strengthen the instrument’s psychometric properties. Although the reliability of scores could be improved for younger children (i.e., those aged 0 to 6 months), this is typical of all instruments intended to be used with a population in which developmental scores tend to be highly variable. Even with this relative weakness, all of the psychometric properties meet minimal criteria, with the majority of scores being strong. Questions remain, however, regarding floor appropriateness, particularly for lower performing and extremely young children. This presents difficulties if classification based on specific cutoff scores is the ultimate outcome of the child’s performance on the Bayley-III; however, if the instrument is being used as one component of a multifaceted evaluation or as an indicator to determine whether additional evaluation is warranted, the potential floor inadequacy is not as problematic.

Whether the treatment utility of the Bayley-III is enhanced, which is the fourth goal, has yet to be determined. Within an early intervention model, the Bayley-III would appear to have utility for identifying individuals in need of additional assessment and, likely, intervention; however, no evidence is presented to show predictive validity and accuracy or how intervention provision is improved as a result of a Bayley-III administration. This is clearly an area in need of additional research, relating not only to the Bayley-III but also across all areas connected to assessment (e.g., Nelson-Gray, 2003).

The next three goals related to the simplification of administration procedures, updated item administration, and updated administration materials; all appear to have been met as the test materials are as engaging as, if not slightly more appealing than, the materials in the BSID-II, with some more developmentally appropriate items (i.e., wider walking tape, different-sized balls) and more realistic picture stimuli. Whereas the administration and technical materials were in one manual and described as a weakness in the BSID-II (e.g., Nellis & Gridley, 1994), the Bayley-III divides these into two separate manuals. The test kit is no longer a hindrance to transport, and scoring procedures (i.e., 1 vs. 0) are much more user-friendly and less ambiguous than in the BSID-II. Although materials are not as clearly divided within the kit and pictures are not provided alongside administration procedures in the manual, the overall construction and usability of the Bayley-III is an improvement over the previous edition.

The final goal was to maintain the qualities of previous Bayley editions. By maintaining a variety of age-appropriate tasks, establishing the psychometric properties of the revision, and incorporating new developmental realms that were in need of improvement in prior editions, the Bayley-III will likely maintain its status as the most frequently used individually administered measure of infant and toddler development.

Craig A. Albers
Adam J. Grieve

University of Wisconsin–Madison

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: Authors.

Bayley, N. (1993). Bayley Scales of Infant Development–Second Edition. San Antonio, TX: The Psychological Corporation.

Bayley, N. (2006a). Bayley Scales of Infant and Toddler Development–Third Edition: Administration manual. San Antonio, TX: Harcourt Assessment.

Bayley, N. (2006b). Bayley Scales of Infant and Toddler Development–Third Edition: Technical manual. San Antonio, TX: Harcourt Assessment.

Bell, S., & Allen, B. (2000). Test review of the Bayley Scales of Infant Development, Second Edition. Journal of Psychoeducational Assessment, 18, 185-195.

Bornstein, M. H., & Sigman, M. D. (1986). Continuity in mental development from infancy. Child Development, 57, 251-274.

Bruner, J. S. (1972). Nature and uses of immaturity. American Psychologist, 27(8), 687-708.

Burns, M. K., Meikamp, J., & Suppa, C. H. (2005). Review of the Adaptive Behavior Assessment System: Second edition. In R. A. Spies & B. S. Plake (Eds.), The sixteenth mental measurements yearbook. Lincoln: University of Nebraska.

Burns, M. K., Meikamp, J., & Suppa, C. H. (2005). Review of the Adaptive Behavior Assessment System: Second edition [Electronic version]. Available from the Buros Institute of Mental Measurements’ Web site: http://www.unl.edu/buros (Original work published in R. A. Spies & B. S. Plake, Eds., The sixteenth mental measurements yearbook, 2005).

Colombo, J., & Frick, J. (1999). Recent advances and issues in the study of preverbal intelligence. In M. Anderson (Ed.), The development of intelligence (pp. 43-71). Hove, UK: Psychology Press.

Dougherty, T. M., & Haith, M. M. (1997). Infant expectations and reaction times as predictors of childhood speed of processing and IQ. Developmental Psychology, 33(1), 146-155.

Fagan, J. F. (1970). Memory in the infant. Journal of Experimental Child Psychology, 9, 217-226.

Folio, R. M., & Fewell, R. R. (2000). Peabody Developmental Motor Scales–Second Edition. Austin, TX: Pro-Ed.

Gelman, R., & Tucker, M. F. (1975). Further investigations of the young child’s conception of number. Child Development, 46, 167-175.

Greenspan, S. I. (2004). Greenspan Social-Emotional Growth Chart: A screening questionnaire for infants and young children. San Antonio, TX: Harcourt Assessment.

Harrison, P. L., & Oakland, T. (2003). Adaptive Behavior Assessment System–Second Edition. San Antonio, TX: The Psychological Corporation.

Kail, R. (2000). Speed of information processing: Developmental change and links to intelligence. Journal of School Psychology, 38(1), 51-61.

Nellis, L., & Gridley, B. E. (1994). Review of the Bayley Scales of Infant Development–Second Edition. Journal of School Psychology, 32, 201-209.

Nelson-Gray, R. O. (2003). Treatment utility of psychological assessment. Psychological Assessment, 15, 521-531.

Piaget, J. (1952). The origins of intelligence in children. New York: International Universities Press.

Rust, J. O., & Wallace, M. A. (2004). Test review of the Adaptive Behavior Assessment System–Second Edition. Journal of Psychoeducational Assessment, 22, 367-373.

Schatz, J., Kramer, J. H., Ablin, A., & Matthay, K. K. (2000). Processing speed, working memory and IQ: A developmental model of cognitive deficits following cranial radiation therapy. Neuropsychology, 14(2), 189-200.

Singer, J. L. (1973). The child’s world of make-believe: Experimental studies of imaginative play. New York: Academic Press.

Sparrow, S. S., Balla, D. A., & Cicchetti, D. V. (1984). Vineland Adaptive Behavior Scale–Interview edition. Circle Pines, MN: American Guidance Service.

Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.

Wechsler, D. (2002). Wechsler Preschool and Primary Scale of Intelligence–Third Edition. San Antonio, TX: The Psychological Corporation.

Wynn, K. (1990). Children’s understanding of counting. Cognition, 36, 155-193.

Zimmerman, I. L., Steiner, V. G., & Pond, R. E. (2002). Preschool Language Scale–Fourth Edition. San Antonio, TX: The Psychological Corporation.

Swerdlik, M. E., Swerdlik, P., Kahn, J. H., & Thomas, T. (2003). Psychological Processing Checklist. North Tonawanda, NY: Multi-Health Systems.
DOI: 10.1177/0734282906295403

The Psychological Processing Checklist (PPC) is a teacher-completed rating scale published by Multi-Health Systems in North Tonawanda, New York. The checklist was published in 2003 along with a technical manual (Swerdlik, Swerdlik, & Kahn, 2003). The 35-item PPC purports to measure difficulties with psychological processing among children in kindergarten through fifth grade. More specifically, the scale represents the authors’ attempt to provide a norm-referenced measure of behaviors associated with psychological processing deficits that is consistent with the Individuals with Disabilities Education Act Amendments of 1997 definition of a learning disability.

Thus, the scale attempts to provide information useful in distinguishing learning disabilities from other conditions and to provide ideas relevant to the design of interventions and modifications to address difficulties associated with processing and learning. PPC items are based on information processing theory and neuropsychological theories, which propose that learning difficulties and disabilities are associated with deficits in psychological processing.

The PPC provides scores that are based on the teacher’s observations of the student’s processing abilities. The PPC can be completed by the student’s general education teacher, special education teacher, and other qualified professionals (e.g., reading specialist, speech-language pathologist, educational diagnostician) who spend time with the student on a regular basis. The authors encourage the use of multiple raters to gather more than one perspective. The authors also suggest that appropriate raters are those professionals who have known the student for at least 6 weeks and who have had sufficient opportunities to observe the student in the classroom. As is typical with normative rating scales, greater familiarity with the student will likely result in more meaningful and reliable ratings.

The PPC includes 35 items, each describing a student behavior. The teacher responds to the items by choosing one of four labels (never, seldom, sometimes, often) based on the frequency with which the teacher observes the behavior. The administration and scoring format is similar to that of the familiar Conners’ Rating Scales–Revised (Conners, 1997). Raters write directly on the form, and then the examiner opens the carbon copy form to reveal the scoring grid, onto which the examiner transfers numeric responses to the different columns representing the PPC scales. The numbers in each column are then summed to obtain raw scores for each of the six scales, and these six raw scores are summed to obtain the total score. Raw scores are transferred to a profile form, which provides the corresponding T-scores (mean of 50, standard deviation of 10) and percentiles for the six scales and total score. Separate profiles and score conversions are provided for male and female students because of
