A Guide to Child Nonverbal IQ Measures · 2005-01-18 · bal IQ measures and by including a...
Transcript of A Guide to Child Nonverbal IQ Measures · 2005-01-18 · bal IQ measures and by including a...
DeThorne & Schaefer: Nonverbal IQ 275American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004 • © American Speech-Language-Hearing Association
1058-0360/04/1304-0275
Clinical Focus
Laura S. DeThorneBarbara A. SchaeferPennsylvania State University, University Park
A Guide to Child Nonverbal IQ Measures
This guide provides a basic overview of 16child nonverbal IQ measures and uses a set ofspecified criteria to evaluate them in terms oftheir psychometric properties. In doing so, thegoal is neither to validate nor to criticize currentuses of IQ but to (a) familiarize clinicians andinvestigators with the variety of nonverbal IQ
measures currently available, (b) highlight someof the important distinctions among them, and(c) provide recommendations for the selectionand interpretation of nonverbal IQ measures.
Key Words: assessment, cognition, language
The use of IQ scores has become relatively common-place within the clinical and research practices ofspeech-language pathology. From a clinical stand-
point, speech-language pathologists in the public schoolsare often encouraged to consider IQ when making deci-sions regarding treatment eligibility (Casby, 1992;Whitmire, 2000). From a research perspective, IQ measuresare often administered to define the population of interest, asin the case of specific language impairment (Plante, 1998;Stark & Tallal, 1981). One distinction across IQ measuresthat appears particularly relevant to the field of speech-language pathology is whether the measure of interest isconsidered verbal or nonverbal in nature. Nonverbalintelligence tests were designed to measure general cogni-tion without the confound of language ability. The majorimpetus for the development of nonverbal IQ measuresappeared to come from the U.S. military during World WarI. As summarized in McCallum, Bracken, and Wasserman(2001), the military used group-administered IQ measures toplace incoming recruits and consequently needed a means toassess individuals that either were illiterate or demonstratedlimited English proficiency. Once nonverbal assessmenttools were developed, their use expanded to include addi-tional populations, such as individuals with hearing loss,neurological damage, psychiatric conditions, learningdisabilities, and speech-language impairments. Because thepractice of speech-language pathology centers on suchspecial populations, the present article focuses exclusivelyon measures of nonverbal IQ.
The distinction between verbal and nonverbal IQ wasnot originally based on empirically driven theory; rather,Wechsler’s original IQ scales differentiated verbal fromperformance (nonverbal) primarily for practical reasons(McGrew & Flanagan, 1998, p. 25). Subsequent theoristshave applied factor analytic methods to a variety of
cognitive measures to derive empirically based models ofintelligence. For example, Carroll (1993) conducted factoranalyses of 477 data sets to develop his three-stratumtheory of intelligence. Stratum III includes a general factor,often referred to as g, which is thought to contribute invarying degrees to all intellectual activities. The secondstratum contains 8 broad abilities, such as fluid intelligenceor visual perception. These broad abilities are then furtherdivided into 70 narrow cognitive abilities that contribute toStratum I. McGrew and Flanagan (1998) have integrated thethree-stratum theory with the theory of fluid and crystallizedintelligences, associated with Horn and Cattell (1966), tocreate the Cattell–Horn–Carroll theory of cognitive abilities(McCallum, 2003, p. 64). Based on this theory, McGrewand Flanagan (1998) have suggested that verbal IQrepresents the construct of crystallized intelligence but that“nonverbal IQ” is not a valid construct in and of itself:“There is no such thing as ‘nonverbal’ ability—onlyabilities that are expressed nonverbally” (p. 25).
Although questions regarding the nature of intelligenceand how it is best measured are beyond the scope of thisarticle, the Cattell–Horn–Carroll theory of cognitiveabilities (McGrew & Flanagan, 1998) is offered as atheoretical framework within which to speculate about thenature of individual IQ measures. Previous published workhas provided thorough discussion of the psychometricproperties and general issues related to standardizedassessment (e.g., McCauley, 2001; McCauley & Swisher,1984a, 1984b). The present review intends to expand onsuch work by applying the information to specific nonver-bal IQ measures and by including a discussion of howindividual tests may relate to broad cognitive abilities. Thespecific aims of the present article are to (a) familiarizeclinicians and investigators with the variety of nonverbalIQ measures currently available, (b) highlight some of the
276 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
important distinctions among them, and (c) providerecommendations for the selection and interpretation ofnonverbal IQ measures.
OverviewWe undertook a review of all mainstream IQ measures
currently on the market and selected for this review thosethat met five specific criteria. First, each measure had to bemarketed or commonly used as a measure of generalcognitive functioning. Second, each included measure hadto provide a standardized score of nonverbal ability. Bynonverbal, we mean that the tests rely heavily on visuo-spatial skills and do not require examinees to provideverbal responses. Some IQ measures, such as the CognitiveAssessment System (CAS; Naglieri & Das, 1997) and theWoodcock–Johnson Tests of Cognitive Abilities—III(Woodcock, McGrew, & Mather, 2001), include nonverbalcomponents but do not organize them into a composite orscaled score. Such measures were not included given thatinterpretation at the subscale level is generally not adviseddue to psychometric concerns (McDermott & Glutting,1997).
Third, in addition to providing a nonverbal measure ofgeneral cognitive functioning, each included IQ measurewas required to be relatively recent, meaning each wasdeveloped or revised within the last 15 years. One excep-tion was made in the case of the Columbia Mental MaturityScale—Third Edition (CMMS–3; Burgemeister, Hollander,& Lorge, 1972). Although this measure was published over30 years ago, it was included here given its frequent usewithin the field of speech-language pathology. The fourthcriterion for test inclusion was suitability of the measurefor preschool or school-age children. Although a numberof the included measures can be administered well intoadulthood, IQ measures that focused exclusively on adultswere omitted. Fifth, each measure had to be developed forthe population at large as opposed to specific groups, suchas individuals with visual impairment or deafness. Table 1contains a list of 16 IQ measures that met all five specifiedcriteria. The remainder of the article is devoted to (a)elaborating on the test information presented in Table 1,(b) providing an evaluation of each test’s psychometricproperties, and (c) offering recommendations for theselection and interpretation of nonverbal IQ measures.
Test InformationAge Range
Column 1 lists the ages at which normative data werecollected for the test as a whole or for the nonverbalsection specifically if it differed from the whole.
Verbal SubscoreColumn 2 in Table 1 denotes whether each measure
provides a verbal subscore in addition to the nonverbalcomponents. Although the focus of the present article is onthe nonverbal components of each measure, we thought itmight be useful for readers to know whether the same test
provided a verbal subscore as well. Many verbal subscoresare composed of subtests that are analogous to languagemeasures. For example, the Wechsler Intelligence Scale forChildren—Fourth Edition (WISC–IV; Wechsler, 2003)includes a Similarities subtest that requires children toexplain how two items are alike, and the DifferentialAbility Scales (DAS; C. D. Elliott, 1990a) include aNaming Vocabulary subtest that requires children to nameobjects/pictures. To the extent they reflect languageabilities specifically, verbal subscores may help validateand/or identify language concerns in particular children.Verbal subscores can be derived from 5 of the 16 measureslisted in Table 1. Separate from the verbal subscores notedin column 2, a number of measures presented in Table 1offer additional batteries, supplemental subtests, oralternative forms. Such additional components are noted incolumn 6, with the subtest descriptions.
Form of InstructionsColumn 5 in Table 1 refers to the form of instructions,
either verbal or nonverbal, that is used during test adminis-tration. It is not uncommon for measures that are promotedas nonverbal to rely to varying degrees on verbal instruc-tion. In fact, the nonverbal components of almost all testslisted in Table 1 include verbal instructions. For example,even though the child is allowed to respond nonverbally bypointing, administration of the Picture Completion subtestfrom the WISC–IV (Wechsler, 2003) is accompanied bythe instruction, “I am going to show you some pictures. Ineach picture there is a part missing. Look at each picturecarefully and tell me what is missing.” Given the use ofverbal instructions, McCallum et al. (2001) have arguedthat most “nonverbal tests” are better described as “lan-guage-reduced instruments” (p. 8).
Some measures, such as the Kaufman AssessmentBattery for Children (K–ABC; Kaufman & Kaufman,1983) and the Comprehensive Test of Nonverbal Intelli-gence (CTONI; Hammill, Pearson, & Wiederholt, 1997),provide verbal instructions within the administrationprocedures, but note that nonverbal instructions could beused. Such measures are identified by V/NV in column 5of Table 1. In contrast to measures that simply offernonverbal instructions as an administration option, threemeasures in Table 1 were specifically normed withnonverbal instructions: the Leiter International Perfor-mance Scale—Revised (Leiter–R; Roid & Miller, 1997),the Universal Nonverbal Intelligence Test (UNIT; Bracken& McCallum, 1998), and the Test of Nonverbal Intelli-gence—Third Edition (TONI–3; Brown, Sherbenou, &Johnsen, 1997). In such cases, instruction and feedback areprovided nonverbally via gestures, facial expressions,modeling, and so on. Of course, the use of nonverbalinstructions does not guarantee that language ability willnot influence test performance. Verbal problem-solvingstrategies can be used to solve nonverbal problems. Inother words, children can talk themselves through prob-lems that do not require a verbal response. For example, onrecent administration (by the second author) of the UNIT, a7-year-old girl labeled the figures girl, man, baby, boy on
DeThorne & Schaefer: Nonverbal IQ 277
TA
BL
E 1
(P
age
1 o
f 3)
. Su
mm
ary
of
no
nve
rbal
IQ m
easu
res.
Age
ran
geV
erba
lIQ
test
(yea
rs;m
onth
s)su
bsco
reT
imea
Man
ipul
ativ
esb
Inst
ruct
ions
Des
crip
tion
of n
onve
rbal
sub
test
s
C
ostc
Col
umbi
a M
enta
l Mat
urity
3;6–
9;11
No
15–2
0N
oV
Com
plet
e a
serie
s by
sel
ectin
g th
e pi
ctur
e or
figu
re th
at d
oes
not
$725
Sca
le—
Thi
rd E
ditio
nbe
long
with
the
othe
rs(C
MM
S–3
; Bur
gem
eist
er,
Hol
land
er, &
Lor
ge, 1
972)
Com
preh
ensi
ve T
est o
f6;
0–90
;11
No
40–6
0N
oV
/NV
Pic
toria
l Ana
logi
es—
iden
tify
rela
tions
hips
in a
2 x
2 p
ictu
re m
atrix
and
$378
Non
verb
al In
telli
genc
ese
lect
a r
espo
nse
to c
ompl
ete
a si
mila
r se
t of r
elat
ions
hip
pict
ures
(CT
ON
I; H
amm
ill, P
ears
on,
Geo
met
ric A
nalo
gies
—sa
me
as P
icto
rial A
nalo
gies
usi
ng g
eom
etric
& W
iede
rhol
t, 19
97)
desi
gns
Pic
toria
l Cat
egor
ies—
sele
ct 1
out
of 5
pic
ture
figu
re(s
) th
at s
hare
s a
rela
tions
hip
with
2 s
timul
us p
ictu
res
Geo
met
ric C
ateg
orie
s—sa
me
as P
icto
rial C
ateg
orie
s us
ing
geom
etric
desi
gns
Pic
toria
l Seq
uenc
es—
iden
tify
the
rule
in a
pro
gres
sion
of p
ictu
re s
timul
ian
d se
lect
a p
ictu
re to
com
plet
e th
e se
quen
ceG
eom
etric
Seq
uenc
es—
sam
e as
Pic
toria
l Seq
uenc
es u
sing
geo
met
ricde
sign
s
Diff
eren
tial A
bilit
y S
cale
s2;
6–17
;11
Yes
:15
–30
Yes
:V
/NV
Blo
ck B
uild
ing—
repl
icat
e co
nstr
uctio
ns m
ade
by th
e ex
amin
er u
sing
$799
(DA
S; C
. D. E
lliot
t, 19
90a)
Ver
bal
woo
den
woo
den
bloc
ks (
age
2;6–
3;5)
Clu
ster
bloc
ks,
Pic
ture
Sim
ilarit
ies—
plac
e in
divi
dual
pic
ture
car
ds n
ext t
o si
mila
r or
at a
gefo
amre
late
d ite
ms
with
in il
lust
rate
d ar
rays
(ag
e 2;
6–5;
11)
3;6–
17;1
1sq
uare
s,P
atte
rn C
onst
ruct
ion—
(tim
ed, w
ith u
ntim
ed a
ltern
ativ
e) r
epro
duce
plas
ticpi
ctur
ed d
esig
ns u
sing
foam
squ
ares
or
plas
tic c
ubes
(ag
e 3;
6–17
;11)
cube
s,C
opyi
ng—
copy
line
dra
win
gs m
ade
by th
e ex
amin
er o
r sh
own
in a
pict
ure
pict
ure
(age
3;6
–5;1
1)ca
rds,
Mat
rices
—se
lect
pic
ture
d ite
ms
to c
ompl
ete
rule
-bas
ed p
atte
rns
and
(age
s 6–
17;1
1)pe
ncil
Seq
uent
ial/Q
uant
itativ
e R
easo
ning
—vi
ew p
airs
of f
igur
es/n
umbe
rs a
ndde
term
ine
the
mis
sing
item
s (a
ge 6
–17:
11)
Rec
all o
f Des
igns
—dr
aw in
divi
dual
abs
trac
t des
igns
afte
r vi
ewin
g ea
chbr
iefly
(ag
e 6–
17;1
1)
Kau
fman
Ass
essm
ent
4;0–
12;6
No
35–5
0Y
es:
V/N
VF
ace
Rec
ogni
tion—
sele
ct m
atch
ing
face
s af
ter
seei
ng e
ach
for
5 s
$470
Bat
tery
for
Chi
ldre
ndfo
am(a
ge 4
)(K
–AB
C; K
aufm
an &
tria
ngle
s,H
and
Mov
emen
t—im
itate
han
d m
ovem
ents
mad
e by
the
exam
iner
Kau
fman
, 198
3)ph
oto
Tria
ngle
s—(t
imed
) re
plic
ate
pict
ured
geo
met
ric c
onfig
urat
ions
usi
ngca
rds,
&fo
am tr
iang
les
colo
r fo
rmM
atrix
Ana
logi
es—
sele
ct p
ictu
res
or c
olor
form
squ
ares
that
com
plet
esq
uare
sth
e ill
ustr
ated
ana
logi
es (
age
5–12
)S
patia
l Mem
ory—
iden
tify
the
posi
tions
of p
ictu
red
item
s af
ter
seei
ng th
emon
a ta
rget
pag
e fo
r 5
s (a
ge 5
–12)
Pho
to S
erie
s—pu
t pho
to c
ards
of a
sto
ry in
chr
onol
ogic
al o
rder
(age
6–1
2)N
ote.
Add
ition
al s
ubte
sts
com
bine
to fo
rm th
e S
eque
ntia
l and
Sim
ulta
neou
s P
roce
ssin
g S
cale
s, w
hich
toge
ther
form
the
Men
tal
Pro
cess
ing
Com
posi
te; A
chie
vem
ent s
ubte
sts
are
also
ava
ilabl
e.A
Spa
nish
ver
sion
of t
he te
st is
als
o av
aila
ble.
Leite
r In
tern
atio
nal
2;0–
20;1
1N
o40
Yes
:N
VF
igur
e G
roun
d—fin
d a
varie
ty o
f pic
ture
d ob
ject
s w
ithin
incr
easi
ngly
$939
Per
form
ance
Sca
le—
foam
com
plex
illu
stra
tions
Rev
ised
(Le
iter–
R;
shap
es,
Des
ign
Ana
logi
es—
(bon
us p
oint
s on
late
r ite
ms)
use
pic
ture
car
ds to
Roi
d &
Mill
er, 1
997)
com
plet
e vi
sual
ana
logi
es (
age
6–20
)
278 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
TA
BL
E 1
(P
age
2 o
f 3)
. Su
mm
ary
of
no
nve
rbal
IQ m
easu
res.
Age
ran
geV
erba
lIQ
test
(yea
rs;m
onth
s)su
bsco
reT
imea
Man
ipul
ativ
esb
Inst
ruct
ions
Des
crip
tion
of n
onve
rbal
sub
test
s
C
ostc
Leite
r In
tern
atio
nal
pict
ure
For
m C
ompl
etio
n—as
sem
ble
foam
sha
pes
to m
atch
an
illus
trat
ion
or p
air
Per
form
ance
Sca
le—
card
spi
ctur
es o
f “di
ssec
ted”
obj
ects
with
pic
ture
s of
them
in w
hole
form
Rev
ised
(co
ntin
ued)
Mat
chin
g—m
atch
foam
sha
pes
or p
ictu
re c
ards
to a
n ar
ray
of o
bjec
tspi
ctur
ed o
n an
eas
el (
age
2–10
)S
eque
ntia
l Ord
er—
sele
ct th
e pi
ctur
e ca
rd th
at b
est c
ompl
etes
the
illus
trat
ed a
rray
Rep
eate
d P
atte
rns—
dete
rmin
e w
hich
foam
sha
pe o
r pi
ctur
e ca
rd b
est
com
plet
es a
n ill
ustr
ated
pat
tern
of s
hape
s or
obj
ects
Cla
ssifi
catio
n—as
soci
ate
foam
sha
pes
or p
ictu
re c
ards
with
illu
stra
ted
item
s ac
cord
ing
to s
imila
ritie
s in
col
or, s
ize,
func
tion,
or
shap
e (a
ge 2
–5)
Pap
er F
oldi
ng—
(bon
us p
oint
s on
late
r ite
ms)
look
at a
n ob
ject
pic
ture
d on
a ca
rd a
nd s
elec
t the
illu
stra
tion
of w
hat t
hat o
bjec
t wou
ld lo
ok li
ke if
fold
ed (
age
6–20
)N
ote.
Als
o of
fers
an
Atte
ntio
n an
d M
emor
y B
atte
ry w
ith 1
0 di
ffere
ntno
nver
bal s
ubte
sts.
A S
pani
sh v
ersi
on o
f the
test
is a
lso
avai
labl
e.
Nag
lieri
Non
verb
al5;
0–17
;11
No
25–3
0N
oV
Sol
ve fi
gura
l mat
rix it
ems
by id
entif
ying
a p
atte
rn a
nd s
elec
ting
an a
nsw
er$2
33A
bilit
y T
est (
NN
AT
;to
com
plet
e th
e pa
ttern
.N
aglie
ri, 2
003)
Not
e. P
aral
lel f
orm
s ar
e av
aila
ble.
Pic
toria
l Tes
t of
3;0–
8;11
No
15–3
0N
oV
Ver
bal A
bstr
actio
ns—
Fin
d th
e pi
ctur
ed it
em th
at is
bei
ng la
bele
d or
$143
Inte
llige
nce—
Sec
ond
desc
ribed
Edi
tion
(PT
I–2;
Fre
nch,
For
m D
iscr
imin
atio
n—lo
ok a
t the
pic
ture
d ta
rget
item
and
iden
tify
the
2001
)as
soci
ated
item
from
a p
ictu
red
disp
lay
Qua
ntita
tive
Con
cept
s—fin
d th
e pi
ctur
ed it
em th
at is
ass
ocia
ted
with
the
verb
al d
escr
iptio
nN
ote.
Tes
t was
des
igne
d to
acc
omm
odat
e ey
e ga
ze r
espo
nses
.
Rav
en’s
Sta
ndar
d6–
adul
tN
o45
Yes
: pen
cils
V/N
VC
ompl
ete
5 se
ts o
f 12
illus
trat
ed p
uzzl
es b
y se
lect
ing
from
an
arra
y of
£129
Pro
gres
sive
Mat
rices
—po
tent
ial a
nsw
ers
(e.g
., se
lect
the
piec
e th
at c
ompl
etes
the
illus
trat
edP
aral
lel f
orm
(S
PM
–P;
patte
rn)
Rav
en, R
aven
, &N
ote.
Add
ition
al fo
rms
of th
e S
tand
ard
Pro
gres
sive
Mat
rices
com
pris
eC
ourt
, 199
8)th
e C
olou
red
Mat
rices
and
the
Adv
ance
d P
rogr
essi
ve M
atric
es.
Rey
nold
s In
telle
ctua
l3–
94Y
es10
–15
No
VO
dd It
em O
ut—
(tim
ed)
Sel
ect p
ictu
re fr
om a
rray
of 5
–7 p
ictu
res
that
$319
Ass
essm
ent S
cale
sm
akes
it d
iffer
ent t
han
the
othe
rs(R
IAS
; Rey
nold
s &
Wha
t’s M
issi
ng—
(tim
ed)
Iden
tify
impo
rtan
t par
t of p
ictu
red
obje
ctK
amph
aus,
200
3)th
at is
mis
sing
Sta
nfor
d–B
inet
Inte
llige
nce
2–89
;11
No
30Y
es: p
last
icV
Obj
ect S
erie
s/M
atric
es—
solv
e no
vel f
igur
al p
robl
ems,
seq
uenc
es o
f$8
58S
cale
—F
ifth
Edi
tion
form
boa
rdpi
ctur
ed o
bjec
ts, o
r m
atrix
-typ
e pa
ttern
s by
poi
ntin
g to
pic
ture
d ob
ject
s(S
B5;
Roi
d, 2
003)
shap
es,
or p
icki
ng u
p ta
rget
obj
ects
, suc
h as
form
boa
rd s
hape
, cou
ntin
g ro
d, o
rco
untin
gbl
ock
(ser
ves
as th
e ro
utin
g su
btes
t)ro
ds, 1
-in.
Pro
cedu
ral K
now
ledg
e—id
entif
y co
mm
on s
igna
ls a
nd a
ctio
nspl
astic
blo
cks,
Pic
ture
Abs
urdi
ties—
sele
ct m
issi
ng o
r ab
surd
det
ails
in p
ictu
res
penc
ilQ
uant
itativ
e R
easo
ning
—so
lve
prem
ath,
arit
hmet
ic, a
lgeb
raic
or
func
tiona
l con
cept
s de
pict
ed b
y se
lect
ing
bloc
ks, c
ount
ing
rods
, or
choo
sing
from
illu
stra
tions
For
m B
oard
—vi
sual
ize
and
solv
e sp
atia
l and
figu
ral p
uzzl
elik
e pr
oble
ms
usin
g fo
rm b
oard
sha
pes
For
m P
atte
rns—
(tim
ed)
com
plet
e pa
ttern
s by
mov
ing
plas
tic fo
rm b
oard
shap
es in
to p
lace
DeThorne & Schaefer: Nonverbal IQ 279
TA
BL
E 1
(P
age
3 o
f 3)
. Su
mm
ary
of
no
nve
rbal
IQ m
easu
res.
Age
ran
geV
erba
lIQ
test
(yea
rs;m
onth
s)su
bsco
reT
imea
Man
ipul
ativ
esb
Inst
ruct
ions
Des
crip
tion
of n
onve
rbal
sub
test
s
C
ostc
Sta
nfor
d–B
inet
Inte
llige
nce
Del
ayed
Res
pons
e—fin
d an
obj
ect t
hat h
as b
een
plac
ed b
y th
e ex
amin
erS
cale
—F
ifth
Edi
tion
unde
r 1
of 3
cup
s(c
ontin
ued)
Blo
ck S
pan—
tap
sequ
ence
s of
blo
cks
in th
e sa
me
orde
r as
mod
eled
by
the
exam
iner
Not
e. D
epen
ding
on
exam
inee
’s s
kill
leve
l (as
det
erm
ined
via
the
Obj
ect S
erie
s/M
atric
es r
outin
g su
btes
t), n
ot a
ll su
btes
ts a
re a
dmin
iste
red.
Tes
t of N
onve
rbal
6;0–
89;1
1N
o15
–20
No
NV
Abs
trac
t Fig
ural
Pro
blem
-Sol
ving
—se
lect
from
an
arra
y of
pic
ture
d ite
ms
$299
Inte
llige
nce—
Thi
rdto
com
plet
e th
e pi
ctur
ed r
ule-
base
d m
atrix
or
figur
e se
ries
Edi
tion
(TO
NI–
3;N
ote.
Tw
o pa
ralle
l for
ms
are
avai
labl
e.B
row
n, S
herb
enou
,&
Joh
nsen
, 199
7)
Uni
vers
al N
onve
rbal
5;0–
17;1
1N
o30
Yes
: pla
stic
NV
Sym
bolic
Mem
ory—
mod
el a
pic
ture
d se
ries
of s
ymbo
ls u
sing
pla
stic
chi
ps$5
79In
telli
genc
e T
est
chip
s, 1
-in.
Spa
tial M
emor
y—us
e ci
rcul
ar r
espo
nse
chip
s to
rep
licat
e a
patte
rn o
f(U
NIT
; Bra
cken
&cu
bes,
dots
afte
r a
brie
f exp
osur
e to
a p
ictu
red
targ
et p
age
McC
allu
m, 1
998)
circ
ular
Cub
e D
esig
n—(t
imed
) us
e cu
bes
to c
onst
ruct
a th
ree-
dim
ensi
onal
des
ign
resp
onse
that
mat
ches
a p
ictu
red
targ
etch
ips,
pen
cil
Ana
logi
c R
easo
ning
—co
mpl
ete
a m
atrix
ana
logy
by
sele
ctin
g fr
om fo
urpo
ssib
le p
ictu
red
resp
onse
sN
ote.
Tw
o ad
ditio
nal s
ubte
sts
are
avai
labl
e fo
r th
e te
st’s
ext
ende
d ba
ttery
.
Wec
hsle
r A
bbre
viat
ed6–
89Y
es15
Yes
: 1-in
.V
Blo
ck D
esig
n—(t
imed
+ b
onus
poi
nts)
con
stru
ct a
des
ign
with
blo
cks
to$2
35S
cale
of I
ntel
ligen
cebl
ocks
mat
ch a
giv
en ta
rget
stim
ulus
(WA
SI;
Psy
chol
ogic
alM
atrix
Rea
soni
ng—
(tim
ed)
choo
se th
e st
imul
us to
com
plet
e th
e ta
rget
Cor
pora
tion,
199
9)m
atrix
or
patte
rn s
eque
nce
Wec
hsle
r In
telli
genc
e6–
16;1
1Y
es25
Yes
: 1-in
.V
Blo
ck D
esig
n—(t
imed
+ b
onus
pts
.) c
onst
ruct
a d
esig
n w
ith b
lock
s to
$799
Sca
le fo
r C
hild
ren—
bloc
ksm
atch
a g
iven
targ
et s
timul
usF
ourt
h E
ditio
nP
ictu
re C
once
pts—
iden
tify
two
to th
ree
pict
ures
for
diffe
rent
pic
ture
arr
ays
(WIS
C–I
V;
that
bes
t go
toge
ther
Wec
hsle
r, 2
003)
Mat
rix R
easo
ning
—se
lect
stim
ulus
to c
ompl
ete
targ
et m
atrix
Not
e. S
uppl
emen
tal n
onve
rbal
sub
test
s ar
e av
aila
ble.
A S
pani
sh v
ersi
onof
the
test
is d
ue fo
r re
leas
e in
the
fall
of 2
004.
Wec
hsle
r P
resc
hool
2;6–
7;3
Yes
25Y
es: 1
-in.
VO
bjec
t Ass
embl
y—(t
imed
) co
nstr
uct a
figu
re g
iven
car
dboa
rd p
uzzl
e$7
99an
d P
rimar
ybl
ocks
,pi
eces
(ag
e 2;
6–3;
11)
Sca
le o
f Int
ellig
ence
—ca
rdbo
ard
Blo
ck D
esig
n—(t
imed
) co
nstr
uct a
des
ign
with
blo
cks
to m
atch
a m
odel
Thi
rd E
ditio
npu
zzle
pres
ente
d in
the
stim
ulus
boo
k(W
PP
SI–
III;
piec
esM
atrix
Rea
soni
ng—
sele
ct fr
om a
n ar
ray
of p
ictu
red
item
s to
com
plet
e th
eW
echs
ler,
200
2)pi
ctur
ed r
ule-
base
d m
atrix
(ag
e 4–
7;3)
Pic
ture
Con
cept
s—se
lect
the
two
pict
ures
that
go
best
toge
ther
from
2ro
ws
of p
ictu
red
item
s (a
ge 4
–7;3
)
Wid
e R
ange
Inte
llige
nce
4–85
Yes
20–3
0Y
es:
V/N
VM
atric
es—
(tim
ed)
sele
ct p
ictu
re th
at c
ompl
etes
rul
e-ba
sed
desi
gn s
erie
s$2
50T
est (
WR
IT; G
lutti
ng,
diam
ond
Dia
mon
ds—
(tim
ed)
cons
truc
t spe
cific
des
igns
usi
ng s
ingl
e or
mul
tiple
Ada
ms,
& S
hesl
ow,
chip
sdi
amon
d-sh
aped
chi
ps20
00)
Not
e. V
= v
erba
l; N
V =
non
verb
al.
a Tim
e es
timat
e fo
r no
nver
bal c
ompo
nent
s on
ly. b R
efer
s to
the
obje
cts
that
the
exam
inee
is r
equi
red
to m
anip
ulat
e du
ring
test
adm
inis
trat
ion.
c Cos
t as
of J
anua
ry 6
, 200
4, ta
ken
from
Web
site
s or
test
pub
lishe
r ca
talo
gs in
Jan
uary
200
4. d C
urre
ntly
und
ergo
ing
revi
sion
.
280 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
the Symbolic Memory subtest to help herself remember theorder of presentation. Another example comes from theLeiter–R Figure Ground subtest (administered by the firstauthor) as a 6-year-old boy was scanning an illustration ofclowns in search of a small plus-like design when heremarked, “I can’t see no band-aid.” Labeling the plus-likedesign as a band-aid likely served as a heuristic of sorts(albeit an unsuccessful one in this case), which could serveto focus his visual scan on areas within the illustrationwhere a bandage might occur (i.e., on one of the clown’sbodies). As these examples illustrate, linguistic ability islikely to influence all nonverbal IQ measures to someextent. However, the influence is minimized when both theinstructions and the responses are provided nonverbally.
Description of Nonverbal ComponentsColumn 6 within Table 1 contains a brief description of
the nonverbal tasks or subtests included within each IQmeasure. For most IQ measures, column 6 does notprovide an exhaustive list of all the available subtests. Aspreviously mentioned, many measures in Table 1 offer averbal scale as well. Additional components, such asassociated batteries and subtests that do not qualify as acore component of either the verbal or nonverbal subscore,have been noted in column 6.
Embedded within the subtest descriptions in column 6is information regarding (a) applicable ages and (b) timeconstraints. If the age range for a specific subtest does notdiffer from the age range provided in column 1, then noadditional age information is provided for that subtest incolumn 6. For example, the Leiter–R Visualization andReasoning Battery (Roid & Miller, 1997) includes a totalof 10 nonverbal subtests that combine to form the IQmeasure. However, only 7 are administered to anyparticular child depending on that child’s age. Note thatthe age ranges provided in column 6 for each subtest referto the use of that subtest as a core component of the IQscore, not for its use as a supplemental subtest. The age atwhich individual subtests can be administered as supple-mental subtests may vary from the information presentedin Table 1.
In addition to information regarding age, it is notedwithin the subtest description if the examinee’s response issubject to time constraints. For example, the Trianglessubtest of the K–ABC (Kaufman & Kaufman, 1983) allots2 min per item. If the examinee replicates the desiredmodel but exceeds this time allotment, the item is scored asincorrect. In addition, some subtests assign bonus points,depending on the speed with which the task is completed.Bonus points can exist regardless of whether the subtest issubject to time constraints. For example, the Paper Foldingsubtest of the Leiter–R (Roid & Miller, 1997) does notrequire examinees to select the correct response within aset time frame to receive credit. However, on later itemsthe examinees can receive additional points for timelyperformance. When the examinee’s response has to becompleted within a set time period, the word timed appearsin parentheses at the beginning of the subtest description inTable 1. Similarly, any subtest that awards bonus points for
quick responses is marked by bonus points at the beginningof the subtest description.
Psychometric PropertiesAny overview of nonverbal IQ measures would be
incomplete without consideration of the tests’ normativeconstruction and psychometric properties. The AmericanEducational Research Association, American Psychologi-cal Association, and National Council on Measurement inEducation worked together to revise their standards foreducational and psychological testing in 1999. Theresulting standards summarize both the general principlesand specific standards underlying sound test developmentand use. Test developers hold the responsibility for soundinstrument development, including (a) collection of anappropriate normative sample and (b) documentation ofadequate reliability and validity. Based in large part onthese professional standards, we have evaluated eachnonverbal measure in terms of its normative sample,reliability, and forms of validity evidence. A summary ofour evaluation is provided in Table 2 with related informa-tion offered in the accompanying text. Our review ofpsychometrics is admittedly cursory, and readers arereferred to additional publications for more in-depthinformation (e.g., Athanasiou, 2000; McCauley, 2001).
Normative SampleDevelopers of norm-referenced tests must identify and
select appropriate normative samples. Normative samplesconsist of the participants to whom individual examinees’performances will be compared. The normative sample ofeach measure listed in Table 2 was evaluated in terms of its(a) size, (b) representativeness, and (c) recency. We ratedeach aspect of the normative sample as positive (+) ornegative (–) when information was available. Whenadequate information was not provided in the test manualto evaluate a particular aspect of the normative sample, 0appears in the relevant column of Table 2.
Size. Reasonably large numbers of children are pre-ferred to minimize error and achieve a representativesample. To take an extreme example, imagine trying tofind ten 6-year-olds that are representative of all 6-year-olds currently living within the United States! To evaluateeach measure in terms of sample size, we referred toSattler’s (2001) recommendation of a 100-participantminimum for each child age group included in the norma-tive sample. An age group was defined by a 1-yearinterval. Of the 16 tests listed in Table 2, 10 successfullymet the criteria for subsample size. Of the 5 measures thatfailed, the Leiter–R (Roid & Miller, 1997) included lessthan 100 children in groups age 8, 10, and 11–20, withnumbers reaching as low as 45 for some of the 12–17-year-old age groups. Within the normative sample for theNaglieri Nonverbal Ability Test (NNAT; Naglieri, 2003),the 16- and 17-year-olds were combined for a subsamplesize of 100. Similarly, the 16- and 17-year-olds werecombined in the normative data from the UNIT (Bracken& McCallum, 1998) for a subsample total of 175, and the
DeThorne & Schaefer: Nonverbal IQ 281
TA
BL
E 2
(P
age
1 o
f 2)
. Eva
luat
ion
of
each
mea
sure
bas
ed o
n s
pec
ific
psy
cho
met
ric
crit
eria
.
Nor
mat
ive
sam
ple
R
elia
bilit
yV
alid
ity
IQ te
stS
ize
Rep
Rec
Int
T-R
Inte
rS
EM
Dev
Tes
t com
pF
AG
roup
com
pP
red
evA
dditi
onal
rev
iew
s
Col
umbi
a M
enta
l Mat
urity
+–
––
–np
+np
√np
npnp
Ege
land
(19
78),
Kau
fman
(19
78)
Sca
le—
Thi
rd E
ditio
n(C
MM
S–3
;B
urge
mei
ster
,H
olla
nder
, & L
orge
,19
72)
Com
preh
ensi
ve T
est o
f+
++
+–
++
√√
√√
npA
than
asio
u (2
000)
, Ayl
war
d (1
998)
,N
onve
rbal
Inte
llige
nce
Bra
cken
& N
aglie
ri (2
003)
,(C
TO
NI;
Ham
mill
,B
radl
ey-J
ohns
on (
1997
), D
ross
man
&P
ears
on, &
Mal
ler
(200
0), M
cCal
lum
(20
03),
Van
Wie
derh
olt,
1997
)Li
ngen
(19
98)
Diff
eren
tial A
bilit
y+
++
––
np–
np√
√√
npA
ylw
ard
(199
2), C
. D. E
lliot
t (19
90b)
, S. N
.S
cale
s (D
AS
;E
lliot
t (19
90),
Fla
naga
n &
Alfo
nso
(199
5),
C. D
. Elli
ott,
1990
a)P
latt,
Kam
phau
s, K
eltg
en, &
Gill
iland
(199
1), R
eine
hr (
1992
)
Kau
fman
Ass
essm
ent
++
––
–np
+√
√√
np√
Ana
stas
i (19
85),
Bra
cken
(19
85),
Bat
tery
for
Chi
ldre
nC
aste
llano
s, K
line,
& S
nyde
r (1
996)
,(K
–AB
C; K
aufm
an &
Cof
fman
(19
85),
Con
oley
(19
90),
Hes
sler
Kau
fman
, 198
3a )(1
985)
, Hop
kins
& H
odge
(19
84),
Kam
phau
s (1
990)
, Kam
phau
s &
Rey
nold
s(1
984)
, Kei
th (
1985
), M
ehre
ns (
1984
),P
age
(198
5), S
tern
berg
(19
83, 1
984)
Leite
r In
tern
atio
nal
–0
+0
0np
–√
√√
√np
Ath
anas
iou
(200
0), B
rack
en &
Nag
lieri
Per
form
ance
Sca
le—
(200
3), F
arre
ll &
Phe
lps
(200
0), M
arco
Rev
ised
(Le
iter–
R;
(200
1), M
cCal
lum
(20
03),
Stin
nett
(200
1)R
oid
& M
iller
, 199
7)
Nag
lieri
Non
verb
al A
bilit
y–
++
––
np+
np√
np√
npM
cCal
lum
(20
03)
Tes
t (N
NA
T; N
aglie
ri,20
03)
Pic
toria
l Tes
t of
+0
++
0+
+√
√√
√np
Ath
anas
iou
(200
3), F
lana
gan
&In
telli
genc
e—S
econ
dC
alta
bian
o (2
003)
Edi
tion
(PT
I–2;
Fre
nch,
200
1)
Rav
en’s
Sta
ndar
d–
––
––
np–
np√
√np
√B
ortn
er (
1965
), B
rack
en &
Nag
lieri
(200
3),
Pro
gres
sive
Mat
rices
—M
cCal
lum
(20
03)
Par
alle
l for
m (
SP
M–P
;R
aven
, Rav
en, &
Cou
rt, 1
998)
Rey
nold
s In
telle
ctua
l–
++
0–
+–
√√
√√
npA
sses
smen
t Sca
les
(RIA
S; R
eyno
lds
&K
amph
aus,
200
3)
Sta
nfor
d–B
inet
+–
++
–np
+√
√√
√np
Inte
llige
nce
Sca
le—
Fift
h E
ditio
n (S
B5;
Roi
d, 2
003)
282 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
TA
BL
E 2
(P
age
2 o
f 2)
. Eva
luat
ion
of
each
mea
sure
bas
ed o
n s
pec
ific
psy
cho
met
ric
crit
eria
.
Nor
mat
ive
sam
ple
R
elia
bilit
yV
alid
ity
IQ te
stS
ize
Rep
Rec
Int
T-R
Inte
rS
EM
Dev
Tes
t com
pF
AG
roup
com
pP
red
evA
dditi
onal
rev
iew
s
Tes
t of N
onve
rbal
++
+–
–+
+√
√√
√np
Ath
anas
iou
(200
0), A
tlas
(200
1), D
eMau
roIn
telli
genc
e—T
hird
(200
1), M
cCal
lum
(20
03)
Edi
tion
(TO
NI–
3;B
row
n, S
herb
enou
,&
Joh
nsen
, 199
7)
Uni
vers
al N
onve
rbal
–+
+–
–np
+np
√√
√np
Ath
anas
iou
(200
0), B
anda
los
(200
1),
Inte
llige
nce
Tes
tB
rack
en &
Nag
lieri
(200
3), F
arre
ll &
(UN
IT; B
rack
en &
Phe
lps
(200
0), F
ives
& F
lana
gan
(200
2),
McC
allu
m, 1
998)
McC
allu
m (
2003
), Y
oung
& A
ssin
g (2
000)
Wec
hsle
r A
bbre
viat
ed+
++
+–
0+
np√
√√
npK
eith
(20
01),
Lin
ksko
g &
Sm
ith (
2001
)S
cale
of I
ntel
ligen
ce(W
AS
I; P
sych
olog
ical
Cor
pora
tion,
199
9)
Wec
hsle
r In
telli
genc
e+
++
+–
++
np√
√√
npS
cale
for
Chi
ldre
n—F
ourt
h E
ditio
n(W
ISC
–IV
;W
echs
ler,
200
3)
Wec
hsle
r P
resc
hool
and
++
+–
–+
+np
√√
√np
Ham
ilton
& B
urns
(20
03)
Prim
ary
Sca
le o
fin
telli
genc
e—T
hird
Edi
tion
(WP
PS
I–III
;W
echs
ler,
200
2)
Wid
e R
ange
Inte
llige
nce
–0
+–
0np
–√
√√
npnp
Stin
nett
(200
3), W
idam
an (
2003
)T
est (
WR
IT; G
lutti
ng,
Ada
ms,
& S
hesl
ow,
2000
)
Not
e. R
ep =
rep
rese
ntat
iven
ess;
Rec
= r
ecen
cy; I
nt =
inte
rnal
; T-R
= te
st–r
etes
t; In
ter
= in
terr
ater
; Dev
= d
evel
opm
enta
l; T
est c
omp
= te
st c
ompa
rison
; FA
= fa
ctor
ana
lysi
s; G
roup
com
p=
gro
up c
ompa
rison
; Pre
d ev
= p
redi
ctiv
e ev
iden
ce; +
= s
peci
fied
crite
ria m
et; –
= s
peci
fied
crite
ria n
ot m
et; n
p =
suc
h ev
iden
ce n
ot p
rovi
ded
with
in th
e te
xt m
anua
l; √
= p
rese
nt; 0
=in
adeq
uate
info
rmat
ion
prov
ided
by
the
test
man
ual.
DeThorne & Schaefer: Nonverbal IQ 283
17- and 18-year-olds were combined within the WideRange Intelligence Test (WRIT; Glutting, Adams, &Sheslow, 2000) for a subsample size of 125. Within thenormative data from the Reynolds Intellectual AssessmentScales (RIAS; Reynolds & Kamphaus, 2003), children age11 to 19 years have been combined to form subsamplesranging in size from 156 to 167 per 2–3-year grouping.The Raven’s Standard Progressive Matrices—ParallelForm (SPM–P; Raven, Raven, & Court, 1998) also failedto meet the sample-size criteria given that its normativesample of American children was derived by combiningsmaller local norms for which individual sample sizes arenot reported.
Representativeness. Related but distinct from the issueof sample size is the property of representativeness. Therepresentativeness of a normative sample is critical inevaluating the usefulness of an instrument for any specificpopulation. Since these tests were not designed for anyparticular group, normative samples should be collectedand stratified to approximate the population percentagesfrom the U.S. Bureau of the Census (2001). To receive apositive rating for representativeness in Table 2, a measurehad to meet two criteria. First, the measure’s normativesample needed to be expressly stratified on the followingvariables: race/ethnicity, geographic region, parenteducational level/socioeconomic status, and sex. Second,the test’s normative sample needed to include children withdisabilities as appropriate given the test’s administrationprocedures. For example, it made sense to exclude childrenwhose visual or motor impairments would have limitedtheir performance. However, a sample that excludeschildren with disabilities as a general rule was not consid-ered representative within our review (see McFadden,1996, for rationale).
All reviewed measures except for the SPM–P (Raven etal., 1998) attempted to derive a nationally representativesample in regard to race/ethnicity, geographic region, parenteducational level/socioeconomic status, and sex. Themeasures in Table 2 were more variable in regard to whetherchildren with disabilities were included. Two measuresreceived a negative rating for representativeness becausethey excluded children with disabilities altogether: theCMMS–3 (Burgemeister et al., 1972) and the Stanford–Binet Intelligence Scale (5th ed.; SB5; Roid, 2003). Simi-larly, the Leiter–R (Roid & Miller, 1997) and the WRIT(Glutting et al., 2000) received an uncertain rating, identifiedas 0 in Table 2, because they did not specify within themanual whether children with disabilities were included inthe normative sample. Although the second edition of thePictorial Test of Intelligence (PTI–2; French, 2001) includedchildren with disabilities, it received an uncertain rating aswell due to the relatively sparse information it provided onits normative sample and concerns regarding the extent towhich the sample characteristics differed from U.S. Bureauof the Census data (see reviews by Athanasiou, 2003;Flanagan & Caltabiano, 2003).
Recency. In addition to being large and representative,normative samples should also be relatively recent. TheFlynn (1999) effect has revealed trends in IQ scores acrosstime such that IQs have increased by 3 points per decade.
Consequently, reliance on instruments with outdated normsmay overestimate youths’ IQ (see Neisser et al., 1996, foradditional discussion). In keeping with Bracken (2000), weconsidered the normative data to be acceptably recent if ithad been collected within the last 15 years. Given thiscriterion, normative samples from the CMMS–3(Burgemeister et al., 1972) and the K–ABC (Kaufman &Kaufman, 1983) were considered out-of-date. Note,however, that a revised version of the K–ABC is currentlyunder way.
ReliabilityIn addition to identifying and selecting appropriate
normative samples, test developers must demonstrate thattheir tests have adequate reliability and validity to justifytheir proposed use. In classical test theory, reliability, asmeasured via correlation coefficients, represents the degreeto which test scores are consistent across items (i.e.,internal consistency), across time (i.e., test–retest), andacross examiners (i.e., interrater). Accordingly, weevaluated each measure in terms of internal reliability,test–retest reliability, and interrater reliability. A fourthfactor, the availability of standard error of measurement foreach summary score, was also considered. Given the focusof the present article on nonverbal summary scores ratherthan subtests, reliability evidence on subtest scores was notevaluated. Identical to our rating system for the normativesample, each form of reliability evidence was evaluated aspositive or negative in Table 2. When the test manual didnot provide adequate information for us to determinewhether a specific criterion was met, 0 appears in therelevant column of Table 2. The reader is referred toMcCauley (2001) for additional information regarding theconstructs of reliability and validity.
Internal reliability. Internal reliability was ratedpositively within Table 2 if uncorrected internal reliabilitycoefficients reached .90 or above for the nonverbalsummary score within each child age group. This criterionfor reliability coefficients is consistent with generalrecommendations for test construction (see Bracken, 1988;McCauley & Swisher, 1984a; Salvia & Ysseldyke, 1998).Only 5 of the 16 measures met the established criterion forinternal reliability. Of those that did not, the DAS (C. D.Elliott, 1990a); Leiter–R (Roid & Miller, 1997); RIAS(Reynolds & Kamphaus, 2003); and WRIT (Glutting et al.,2000) received uncertain marks because they failed toprovide separate reliability coefficients for each age group.Although these measures reported reliability coefficients oncombined age groups, such coefficients can mask substantialvariability. Of the measures that provided coefficients ateach age group, 7 failed to consistently meet .90, althoughnone fell too far from the mark.
Test–retest reliability. The second criterion for testreliability, in keeping with Bracken (1987), was a test–retest coefficient of .90 or greater for the nonverbalsummary score at each child age group. In addition, theretesting had to occur at least 2 weeks, on average, fromthe initial test administration. Not a single nonverbal IQmeasure from Table 2 met this criterion. Three measures,
284 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
the WRIT (Glutting et al., 2000), Leiter–R (Roid & Miller,1997), and PTI-2 (French, 2001), received uncertain marksfor test–retest reliability because they reported coefficientsfor collapsed age groups, thereby making reliabilityestimates for individual age groups impossible to deter-mine. Most measures from Table 2 reported coefficientsfrom collapsed or select age groups but received a failedmark (as opposed to uncertain) because the reportedcoefficients did not reach .90. Only three measurespresented individual test–retest coefficients for each agegroup: the CTONI (Hammill et al., 1997), which reportedcoefficients of .79 to .94; the UNIT (Bracken &McCallum, 1998), which reported coefficients of .83 to.90; and the WISC–IV (Wechsler, 2002), which reportedcoefficients of .85 to .93 across age groups.
Interrater reliability. The third criterion for evaluationof test reliability was an interrater reliability of .90 orgreater for the nonverbal summary score. We want to beclear that this form of reliability refers only to the consis-tency of scoring across examiners and does not evaluatethe consistency of administration across examiners. Thelatter is a source of error that should be assessed but wasnot reported by any of the reviewed test manuals. Interraterreliability of scoring procedures for the nonverbal compos-ite was reported for 7 of the 16 measures: the CTONI(Hammill et al., 1997); PTI–2 (French, 2001); RIAS(Reynolds & Kamphaus, 2003); TONI–3 (Brown et al.,1997); Wechsler Abbreviated Scale of Intelligence (WASI;Psychological Corporation, 1999); WISC–IV (Wechsler,2003); and Wechsler Preschool and Primary IntelligenceScale—Third Edition (WPPSI–III; Wechsler, 2002). All ofthese received a positive rating for meeting the proposedcriterion except the WASI, which received an uncertainmark due to its relatively vague statement that interscoreragreement for the nonverbal subtests “tends to be in thehigh .90s” (Psychological Corporation, 1999, p. 129). TheDAS (C. D. Elliott, 1990a) presented interrater reliabilityinformation, but only for particular subtests.
Standard error of measurement. The fourth criterion forevaluating test reliability refers to the test manual’spresentation of standard error of measurement (SEM).Whereas the reliability coefficients can inform us aboutgroup-level consistency, such correlations are not directlyapplicable to individual scores. Instead, the SEM is needed.SEM is a derivative of the reliability coefficient that takesinto account the scale of the measurement used, thestandard deviation, and the internal consistency of thescore. As such, the SEM provides some sense of how mucherror is likely encompassed within the obtained score (seeMcCauley & Swisher, 1984b, for additional detail). Forexample, on the UNIT (Bracken & McCallum, 1998), theStandard Battery Full-Scale score SEM is approximately 4points. Consequently if a child received a score of 87, onecould feel reasonably confident that the child’s true scorewould be captured within the range of 87 ±4 points (i.e.,83–91 is the 68% confidence interval). Given that thederivation of confidence intervals relies on SEM, the finalcriterion for test reliability was that the SEM be presentedfor the nonverbal summary score at each child age group.As glimpsed from Table 2, 11 of the 16 tests presented
SEM by age for the nonverbal summary score. Of the 5measures that failed this criterion, the Leiter–R (Roid &Miller, 1997); RIAS (Reynolds & Kamphaus, 2003); andWRIT (Glutting et al., 2000) provided SEMs for combinedage groups only. The DAS (C. D. Elliott, 1990a) presentedSEMs for individual age groups, but only for children age3;6–7;11. The SPM–P (Raven et al., 1998) did not provideSEM values at all.
ValidityAlthough evidence for test validity can take many forms,
validity as an overall concept represents the utility of a testscore in measuring the construct it was designed to measure(McCauley, 2001; Messick, 1989). For IQ tests, developersshould clearly articulate the theoretical grounding of theirinstrument and document evidence for validity from avariety of sources. Table 2 illustrates the validity evidencepresented within each test manual. The presence of eachform of validity evidence within the test manual is indicatedby a √ in Table 2. The letters np appear in the relevantcolumns of Table 2 to indicate which forms of validityevidence were not provided for any particular measure.
Table 2 reviews each measure according to whether itprovided the following most common forms of constructvalidity evidence: (a) developmental trends, (b) correlationwith similar tests, (c) factor analyses, (d) group compari-sons, and (e) predictive ability. Although demonstration ofinternal consistency is another commonly used form ofvalidity evidence, this evidence was already covered underthe discussion and evaluation of each test’s internalreliability.
Developmental trends. Demonstrating that a test’s rawscores increase with age provides necessary, but by nomeans sufficient, evidence of a test’s validity. Althoughcognitive abilities increase with age, so do many otherqualities—attention, language use, and height, to name justa few. Perhaps in part because it represents a relativelyweak form of validity evidence, only 8 of the 16 measuresexplicitly mention evidence of developmental change intheir discussions of validity (see Table 2).
Correlation with similar tests. In contrast to therelatively sparse evidence of developmental trends, themost common form of validity evidence appeared to beinformation regarding correlations with other measures.All 16 tests from Table 2 presented positive correlationswith other measures of general cognition. In addition, therewas a general tendency to report positive correlations withachievement tests as a form of validity evidence. Ofcourse, the strength of such intertest correlations indemonstrating test validity is dependent on the validity ofthe test being used for comparison. For example, two testsdeveloped to assess general cognition might be highlycorrelated if both reflect the same confounding variable(e.g., language skill or test-taking abilities). Given that thesame tests tend to benchmark with each other, intertestcorrelation evidence is limited by an air of circularity.
Factor analyses. Another commonly used form ofvalidity evidence is the presentation of factor analyses,either confirmatory, exploratory, or both. Without getting
DeThorne & Schaefer: Nonverbal IQ 285
into much detail, factor analysis is a statistical procedurebased on item or subscale intercorrelations that is used todetermine whether the test components load on the abilitiesof interest (see McCauley, 2001, for additional information).Such evidence was presented by 14 of the 16 measures. Ofcourse, the limitation of such evidence is that just because atest taps a relatively unitary construct or constructs does notnecessarily mean it’s the right construct.
Group comparisons. Given the limitations of the formsof evidence previously mentioned, additional validitysupport is contributed through comparison studies ofspecific subgroups—specifically, subgroups with clearassociations with IQ (e.g., children identified as gifted orwith diagnosed cognitive deficits). In other words, themeasure of interest is administered to subgroups ofchildren with previously established diagnoses to deter-mine how the descriptive data from such subgroupscompare with the test’s normative data. Children withpreviously identified cognitive deficits are expected toscore below the normative mean, whereas childrenpreviously identified as gifted would be expected to scoreabove the normative mean. Of course, the extent to whichsubsample means are expected to deviate from the norma-tive mean is less clear, and even clear mean differences canmask a substantial amount of individual variability.Nonetheless, 12 of 16 measures provided validity evidencebased on such subgroup comparisons (see Table 2). Forexample, the manual from the SB5 (Roid, 2003) reportedthat children previously diagnosed with mental retardationscored 2–3 SD below the normative mean on the SB5,whereas children previously classified as gifted scored 1–2SD above the normative mean.
Predictive ability. Predictive evidence, perhaps one of thestrongest forms of validity evidence, is reported least often.Predictive validity evidence encompasses the effectivenessof a test in predicting the performance of individuals inrelated activities. For example, a measure might report theextent to which its scores predict children’s educationalplacement or later academic achievement. Obviously,collecting such evidence requires laborious longitudinalinvestigation with potentially unclear results. As such,predictive evidence is provided by only 2 of the 16 mea-sures, the K–ABC (Kaufman & Kaufman, 1983) and theSPM–P (Raven et al., 1998). The K–ABC presents resultsfrom six studies that used the K–ABC to predict scores onachievement tests administered 6–12 months later. Similarlythe SPM–P reports correlations “ranging up to about .70”between scores from the SPM–P and scores from achieve-ment tests administered “some time” later (Raven et al.,1998, p. 29). Additional measures, such as the SB5 (Roid,2003) and UNIT (Bracken & McCallum, 1998), presentedcorrelation data with concurrently administered achievementtests as evidence of predictive validity. However, theconcurrent administration of the tests disqualified it aspredictive evidence under our criteria.
Additional ResourcesGiven that for many measures our review represents but
one of many, we have provided the final column in Table 2
as a gateway to the research literature on these tests.Although our list is not exhaustive, we have included allreviews that we were able to access via PsycINFO as wellas those that were available through the Buros’s MentalMeasurements Yearbook series and the Tests in Printseries. In addition, we have included reviews fromAthanasiou (2000), Bracken and Naglieri (2003), andMcCallum (2003), as well as those cited within McGrewand Flanagan (1998). Note that a limited number ofpublished reviews are available for the recently developedor revised measures. Readers are referred to the BurosInstitute Mental Measurements Yearbook Test ReviewsOn-Line at http://buros.unl.edu/buros/jsp/search.jsp and theEducational Testing Service’s Test Link database availableat http://www.ets.org/testcoll/index.html for currentinformation.
RecommendationsSelection of an Appropriate Measure
Despite the responsibility incumbent on test developers,“the ultimate responsibility for appropriate test use andinterpretation lies predominately with the test user”(American Educational Research Association et al., 1999,p. 111). We recognize that the administration of IQmeasures is primarily the responsibility of psychologistsrather than speech-language pathologists. However, weprovide suggestions for test selection for two primaryreasons. First, both clinicians and investigators sometimesselect and administer IQ measures within the scope of theirassessment protocols. Second, it is our hope that speech-language professionals will use the information presentedwithin this article to engage in meaningful dialogue withcollaborating psychologists regarding how IQ scoresshould (and should not) be interpreted and which measuresare most appropriate for children with communicationdifficulties. To this end, we offer three considerations toguide the selection of a nonverbal IQ measure. First, themeasure should be psychometrically sound, as previouslydiscussed. Although none of the measures presented inTable 2 are psychometrically ideal, four emerge asparticularly strong: the TONI–3 (Brown et al., 1997),UNIT (Bracken & McCallum, 1998), WASI (Psychologi-cal Corporation, 1999), and WISC–IV (Wechsler, 2003).Although all four measures received negative marks interms of test–retest reliability, all fell close to the .90criterion (although note that the WASI and TONI–3represented information on collapsed age groups only).Similarly, the TONI–3 and UNIT received negative marksunder internal reliability, but their values fell just below the.90 criterion, at .89 for certain age groups. In addition, theUNIT received a negative mark under sample size, but thatwas due specifically to the 16–17-year-old age grouping.
The second consideration for test selection is potentialspecial needs of the population or individual being evalu-ated. Given our particular interest in children with lan-guage difficulties, it is important to consider the linguisticdemands of individual IQ tests. Standard 7.7 (AmericanEducational Research Association et al., 1999) specificallyindicates, “in testing applications where the level of
286 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
linguistic or reading ability is not part of the construct ofinterest, the linguistic or reading demands of the testshould be kept to the minimum necessary for the validassessment of the intended construct” (p. 82). Clearly, thenonverbal instruments listed in Table 1 all attempt toaddress this standard but with differing levels of success. Ifconcerns exist regarding language abilities—particularlyreceptive language skills—test users should select a testthat involves nonverbal instructions only, such as theUNIT (Bracken & McCallum, 1998) or the TONI–3(Brown et al., 1997). Although the Leiter–R (Roid &Miller, 1997) also relies on nonverbal instruction, itspsychometric properties do not appear as favorable as theUNIT and the TONI–3. In terms of special needs, note alsothat the color scheme of the UNIT (green and black) wasspecifically designed to avoid most problems associatedwith color blindness. For children with limited motorcontrol, examiners might want to give special consider-ation to the TONI–3, because it is untimed and does notdepend heavily on manipulatives.
The third factor to consider in selecting a nonverbal IQmeasure is the purpose of assessment. High-stakes assess-ments that will result in diagnoses, treatment eligibility,and/or educational placement should use multidimensionalbatteries, such as the UNIT (Bracken & McCallum, 1998)or the WISC–IV (Wechsler, 2003). In contrast, if one isconducting a low-stakes assessment, such as a screeningfor research purposes, a one-dimensional battery maysuffice, such as the TONI–3 (Brown et al., 1997) or theWASI (Psychological Corporation, 1999). McCallum et al.(2001) describe one-dimensional measures as assessing “anarrow aspect of intelligence through the use of progres-sive matrices,” whereas multidimensional measures “assessmultiple facets of examinees’ intelligence,” such asmemory, reasoning, and attention (pp. 9–10). In sum, givenall three aforementioned considerations, if one is selectinga nonverbal IQ measure for a child with language impair-ment, we recommend the UNIT for cases of high-stakesassessment and the TONI–3 for cases of low-stakesassessment.
Interpretation of Nonverbal IQ MeasuresOnce an IQ measure has been selected and adminis-
tered, one’s focus turns naturally to interpretation. IQscores as a whole are interpreted as estimates of anindividual’s current level of general cognitive functioning.IQ scores tend to correlate positively with academicachievement (r = .50–.75; Anastasi, 1988; Neisser et al.,1996) and have proved to be one of the best predictors ofacademic and job-related outcomes (Neisser et al., 1996;Sattler, 2001). However, given our focus on nonverbal IQmeasures, it seems important to note that nonverbal IQ hasproven to be less effective in predicting academic out-comes than verbal IQ (Sattler, 2001). This difference inpredictive power is potentially due to two explanations,neither of which precludes the other: (a) Verbal subtestshave a higher percentage of variance attributable to generalintelligence, or g, than do nonverbal subtests, and (b)language plays an important role in the standard academic
learning environment. In regard to the latter point, it isdifficult for children with poor language skills in general tosucceed in school regardless of their level of generalcognitive functioning (Records, Tomblin, & Freese, 1992;Tomblin, Zhang, Buckwalter, & Catts, 2000).
Regardless of predictive power, the most accurateinterpretation of an individual’s performance on any givenmeasure comes from an understanding of the specificability or abilities reflected by that measure. As previouslymentioned, factor analytic techniques have given shape to anumber of broad cognitive abilities (see Carroll, 1993;Horn & Cattell, 1966). Each broad cognitive ability can befurther dissected into numerous narrow cognitive abilities,a discussion of which is beyond the scope of this article(see Carroll, 1993; McCallum, 2003; McGrew & Flanagan,1998). McGrew and Flanagan have proposed the Cattell–Horn–Carroll theory, which includes 10 broad cognitiveabilities: crystallized intelligence, fluid intelligence,quantitative knowledge, reading and writing ability, short-term memory, visual processing, auditory processing, long-term storage and retrieval, processing speed, and decision/reaction time.
McGrew and Flanagan (1998) used the broad intelli-gences from the Cattell–Horn–Carroll theory as a frame-work for reviewing a number of IQ measures. Theyconcluded that nonverbal IQ measures as a whole drawheavily on visual processing abilities and fluid intelligence(see also analyses by Johnston, 1982, and Kamhi, Minor,& Mauer, 1990). Fluid intelligence, typified by inductiveand deductive reasoning, refers to “mental operations thatan individual may use when faced with a relatively noveltask that cannot be performed automatically” (McGrew &Flanagan, 1998, p. 14). Select examples of such tasksinclude the Picture Similarities subtest from the DAS (C.D. Elliott, 1990a); the Photo Series subtest from the K–ABC (Kaufman & Kaufman, 1983); the Analogic Reason-ing subtest from the UNIT (Bracken & McCallum, 1998);and the matrices tasks that essentially define the NNAT(Naglieri, 2003); CMMS–3 (Burgemeister et al., 1972);CTONI (Hammill et al., 1997); SPM–P (Raven et al.,1998); and TONI–3 (Brown et al., 1997).
Visual processing is defined by McGrew and Flanagan(1998) as “the ability to generate, perceive, analyze,synthesize, manipulate, transform, and think with visualpatterns and stimuli” (p. 23). Example measures of visualprocessing taken from McGrew and Flanagan include allthe subsets from the K–ABC (Kaufman & Kaufman, 1983)as well as the Block Building, Pattern Construction, andCopying subtests from the DAS (C. D. Elliott, 1990a). Onewould expect similar subtests across nonverbal IQ mea-sures to be associated with visual processing ability aswell, such as the Design Analogies subtest from the Leiter–R (Roid & Miller, 1997) and the Block Design subtestfrom the WASI (Psychological Corporation, 1999). Notethat any single subtest can and likely does reflect morethan one broad cognitive ability. For example, in additionto tapping visual processing skills, it seems likely that theCube Design subtest from the UNIT (Bracken &McCallum, 1998) would be influenced by decision/reaction time. Similarly, the Symbolic Memory and Spatial
DeThorne & Schaefer: Nonverbal IQ 287
Memory subtests from the UNIT are likely to be influ-enced by both visual processing and short-term memory.
Table 3 contains proposed links between broad cogni-tive abilities from the Cattell–Horn–Carroll theory andsubtests from the four measures that emerged from ourreview as psychometrically strong: the TONI–3 (Brown etal., 1997); UNIT (Bracken & McCallum, 1998); WASI(Psychological Corporation, 1999); and WISC–IV(Wechsler, 2003). Such proposed links are based largely onthe work of McCallum (2003) and McGrew and Flanagan(1998), with some additional speculation on our part. Notethat although individual subtests may tap into differentnonverbal abilities, subtests’ often inferior psychometricproperties make interpretation in isolation or in comparisonwith other subtests largely speculative (see McDermott &Glutting, 1997). Consequently, reliability coefficients forthe individual subtests are provided in Table 3 for refer-ence purposes.
The published test reviews cited in Table 2 revealedlittle additional information on the broad cognitiverequirements of individual IQ measures, although wesummarize here what we found for our two recommendedmeasures: the TONI–3 and the UNIT. Both have beencriticized for providing a rather limited assessment of morespecific cognitive abilities. For example, the TONI–3 hasbeen criticized for its one-dimensional nature (Athanasiou,2000; see also Kamhi et al., 1990). Along similar lines,Atlas (2001, p. 1260) noted that the TONI–3’s correlationwith the widely accepted and multidimensional WISC–IIIwas “at best moderate” (r = .53–.63) and based on a smallsample (i.e., 34 students). In contrast, however, DeMauro(2001) attributed this attenuated correlation to the higher
verbal demands of the Weschler scales and, as such,presented it as rather positive evidence of the TONI–3’svalidity. In regard to the UNIT (Bracken & McCallum,1998), Young and Assing (2000) credited its manual with athorough description of the test’s theoretical foundation,but Fives and Flanagan (2002) criticized the measure fornot providing a clear and comprehensive association withmore specific cognitive abilities. As an example, Fives andFlanagan (2002) suggested that an examinee’s low scoreson the Symbolic Memory and Spatial Memory subtestscould be attributed to “limitations in memory, reasoning, orboth of these constructs” (p. 430). In closing, regardless ofwhether one is interpreting individual subtest or compositescores, it is important to bear in mind that valid assessmentof cognition, as with any developmental domain, mustcombine information from multiple sources, includingobservation and interview.
SummaryIn sum, the present article presented information
regarding 16 commonly used child nonverbal IQ measures,highlighted important distinctions among them, andprovided recommendations for test selection and interpre-tation. When selecting a nonverbal IQ measure for childrenwith language difficulties, we currently recommend theUNIT (Bracken & McCallum, 1998) for cases of high-stakes assessment and the TONI–3 (Brown et al., 1997) forcases of low-stakes assessment. Regardless of whichspecific test is administered, professionals are encouragedto be cognizant of each test’s strengths and limitations,including its psychometric properties and its potential
TABLE 3. Broad cognitive abilities and reliability coefficients associated with individual subtests from the recommended IQmeasures.
IQ test Nonverbal subtest Cognitive abilitya r b
Test of Nonverbal Intelligence—Third Abstract Figural Problem-Solving Fluid intelligence .89–94c
Edition (TONI–3; Brown, Sherbenou,& Johnsen, 1997)
Universal Nonverbal Intelligence Test Symbolic Memory Visual processing .80–87(UNIT; Bracken & McCallum, 1998) Short-term memory
Spatial Memory Visual processing .74–.84Short-term memory
Cube Design Visual processing .81–.95Fluid intelligenceDecision/reaction time
Analogic Reasoning Fluid intelligence .70–.83
Wechsler Abbreviated Scale of Intelligence Block Design Visual processing .84–.93(WASI; Psychological Corporation, 1999) Fluid intelligence
Decision/reaction timeMatrix Reasoning Fluid intelligence .86–.96
Decision/reaction time
Wechsler Intelligence Scale for Children— Block Design Visual processing .83–.88Fourth Edition (WISC–IV; Fluid intelligenceWechsler, 2003) Decision/reaction time
Picture Concepts Fluid intelligence .76–.85Matrix Reasoning Visual processing .86–.92
aBroad cognitive abilities from the Cattell–Horn–Carroll theory, as proposed by McGrew and Flanagan (1998). bRange of subtest reliabilitycoefficients across ages for that particular subtest. cSubtest reliability for the TONI–3 is the same as for the entire test.
288 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
relation to different cognitive abilities. We hope that thisguide of nonverbal intelligence measures will help clini-cians and investigators to better interpret available IQscores and, when applicable, select the most appropriateinstrument for their specific needs.
AcknowledgmentsWe wish to thank John McCarthy, Joanna Burton, Patricia
Ruby, and Colleen Curran for their assistance with this project.
ReferencesAmerican Educational Research Association, American
Psychological Association, and National Council onMeasurement in Education. (1999). Standards for educa-tional and psychological testing. Washington, DC: AmericanPsychological Association.
Anastasi, A. (1985). Review of the Kaufman Assessment Batteryfor Children. In J. V. Mitchell (Ed.), Mental measurementsyearbook (9th ed., pp. 769–771). Lincoln, NE: Buros Instituteof Mental Measurements.
Anastasi, A. (1988). Psychological testing (6th ed.). New York:Macmillan.
Athanasiou, M. S. (2000). Current nonverbal assessmentinstruments: A comparison of psychometric integrity and testfairness. Journal of Psychoeducational Assessment, 18,211–229.
Athanasiou, M. (2003). Review of the Pictorial Test ofIntelligence, Second edition. In B. S. Plake, J. C. Impara, &R. A. Spies (Eds.), Mental measurements yearbook (15th ed.,pp. 676–678). Lincoln, NE: Buros Institute of MentalMeasurements.
Atlas, J. A. (2001). Review of the Test of Nonverbal Intelligence,Third edition. In B. S. Plake, & J. C. Impara (Eds.), Mentalmeasurements yearbook (14th ed., pp. 1259–1260). Lincoln,NE: Buros Institute of Mental Measurements.
Aylward, G. P. (1992). Review of the Differential Ability Scales.In J. J. Kramer & J. C. Conoley (Eds.), Mental measurementsyearbook (11th ed., pp. 281–282). Lincoln, NE: BurosInstitute of Mental Measurements.
Aylward, G. P. (1998). Review of the Comprehensive Test ofNonverbal Intelligence. In J. C. Impara & B. S. Plake (Eds.),The mental measurements yearbook (13th ed., pp. 310–312).Lincoln, NE: Buros Institute of Mental Measurements.
Bandalos, D. L. (2001). Review of the Universal NonverbalIntelligence Test. In B. S. Plake & J. C. Impara (Eds.), Mentalmeasurements yearbook (14th ed., pp. 1296–1298). Lincoln,NE: Buros Institute of Mental Measurements.
Bortner, M. (1965). Review of the Raven Standard ProgressiveMatrices. In O. K. Buros (Ed.), Mental measurementsyearbook (6th ed., pp. 489–491). Highland Park, NJ: GryphonPress.
Bracken, B. A. (1985). A critical review of the Kaufman Batteryfor Children (K–ABC). School Psychology Review, 14, 21–36.
Bracken, B. A. (1987). Limitations of preschool instruments andstandards for minimal levels of technical adequacy. Journal ofPsychoeducational Assessment, 4, 313–326.
Bracken, B. A. (1988). Ten psychometric reasons why similartests produce dissimilar results. Journal of School Psychology,26, 155–166.
Bracken, B. A. (Ed.). (2000). The psychoeducational assessmentof preschool children (3rd ed.). Boston: Allyn & Bacon.
Bracken, B. A., & McCallum, R. S. (1998). Universal Nonver-bal Intelligence Test. Itasca, IL: Riverside Publishing.
Bracken, B. A., & Naglieri, J. A. (2003). Assessing diversepopulations with nonverbal tests of general intelligence. In C.R. Reynolds & R. W. Kamphaus (Eds.), Handbook ofpsychological and educational assessment of children:Intelligence, aptitude, and achievement (2nd ed., pp. 243–274). New York: Guilford Press.
Bradley-Johnson, S. (1997). Review of the Comprehensive Testof Nonverbal Intelligence. Psychology in the Schools, 34,289–292.
Brown, L., Sherbenou, R. J., & Johnsen, S. K. (1997). Test ofNonverbal Intelligence—Third Edition. Austin, TX: Pro-Ed.
Burgemeister, B. B., Hollander, L. H., & Lorge, I. (1972).Columbia Mental Maturity Scale—Third Edition. SanAntonio, TX: Psychological Corporation.
Carroll, J. B. (1993). Human cognitive abilities: A survey offactor-analytic studies. Cambridge, England: CambridgeUniversity Press.
Casby, M. W. (1992). The cognitive hypothesis and its influenceon speech–language services in schools. Language, Speech,and Hearing Services in Schools, 23, 198–202.
Castellanos, M., Kline, R. B., & Snyder, J. (1996). Lessonsfrom the Kaufman Assessment Battery for Children (K–ABC): Toward a new cognitive assessment model. Psycho-logical Assessment, 8, 7–17.
Coffman, W. E. (1985). Review of the Kaufman AssessmentBattery for Children. In J. V. Mitchell (Ed.), Mental measure-ments yearbook (9th ed., pp. 771–773). Lincoln, NE: BurosInstitute of Mental Measurements.
Conoley, J. C. (1990). Review of the K–ABC: Reflecting theunobservable. Journal of Psychoeducational Assessment, 8,369–375.
DeMauro, G. E. (2001). Review of the Test of NonverbalIntelligence, Third edition. In B. S. Plake & J. C. Impara(Eds.), Mental measurements yearbook (14th ed., pp. 1260–1262). Lincoln, NE: Buros Institute of Mental Measurements.
Drossman, E. R., & Maller, S. J. (2000). Review of theComprehensive Test of Nonverbal Intelligence. Journal ofPsychoeducational Assessment, 18, 293–301.
Egeland, B. R. (1978). Review of the Columbia Mental MaturityScale. In O. K. Buros (Ed.), Mental measurements yearbook(8th ed., pp. 298–299). Highland Park, NJ: Gryphon Press.
Elliott, C. D. (1990a). Differential Ability Scales. San Antonio,TX: Psychological Corporation.
Elliott, C. D. (1990b). The nature and structure of children’sabilities: Evidence from the Differential Ability Scales.Journal of Psychoeducational Assessment, 8, 376–390.
Elliott, S. N. (1990). The nature and structure of the DAS:Questioning the test’s organizing model and use. Journal ofPsychoeducational Assessment, 8, 406–411.
Farrell, M., & Phelps, L. (2000). A comparison of the Leiter–Rand the Universal Nonverbal Intelligence Test (UNIT) withchildren classified as language impaired. Journal of Psycho-educational Assessment, 18, 268–274.
Fives, C. J., & Flanagan, R. (2002). A review of the UniversalNonverbal Intelligence Test (UNIT): An advance for evaluat-ing youngsters with diverse needs. School PsychologyInternational, 23, 425–449.
Flanagan, D., & Alfonso, V. (1995). A critical review of thetechnical characteristics of new and recently revised intelli-gence tests for preschool children. Journal of Pyscho-educational Assessment, 13, 66–90.
Flanagan, D. P., & Caltabiano, L. F. (2003). Review of thePictorial Test of Intelligence, Second edition. In B. S. Plake, J.C. Impara, & R. A. Spies (Eds.), Mental measurementsyearbook (15th ed., pp. 678–681). Lincoln, NE: BurosInstitute of Mental Measurements.
DeThorne & Schaefer: Nonverbal IQ 289
Flynn, J. R. (1999). Searching for justice: The discovery of IQgains over time. American Psychologist, 54, 5–20.
French, J. L. (2001). Pictorial Test of Intelligence—SecondEdition. Austin, TX: Pro-Ed.
Glutting, J., Adams, W., & Sheslow, D. (2000). Wide RangeIntelligence Test. Wilmington, DE: Wide Range.
Hamilton, W., & Burns, T. G. (2003). WPPSI–III: WechslerPreschool and Primary Scale for Intelligence (3rd ed.).Applied Neuropsychology, 10, 188–190.
Hammill, D. D., Pearson, N. A., & Wiederholt, J. E. (1997).Comprehensive Test of Nonverbal Intelligence. Austin, TX:Pro-Ed.
Hessler, G. (1985). Review of the Kaufman Assessment Batteryfor Children: Implications for assessment of the gifted.Journal for the Education of the Gifted, 8, 133–147.
Hopkins, K. D., & Hodge, S. E. (1984). Review of the KaufmanAssessment Battery (K–ABC) for Children. Journal ofCounseling and Development, 63, 105–107.
Horn, J. L., & Cattell, R. B. (1966). Refinement and test of thetheory of fluid and crystallized general intelligences. Journalof Educational Psychology, 57, 253–270.
Johnston, J. R. (1982). Interpreting the Leiter IQ: Performanceprofiles of young normal and language-disordered children.Journal of Speech and Hearing Research, 25, 291–296.
Kamhi, A. G., Minor, J. S., & Mauer, D. (1990). Contentanalysis and intratest performance profiles on the Columbiaand the TONI. Journal of Speech and Hearing Research, 33,375–379.
Kamphaus, R. W. (1990). K–ABC theory in historical andcurrent contexts. Journal of Pyschoeducational Assessment, 8,356–368.
Kamphaus, R. W., & Reynolds, C. R. (1984). Development andstructure of the Kaufman Assessment Battery for Children (K–ABC). Journal of Special Education, 18, 213–228.
Kaufman, A. S. (1978). Review of the Columbia MentalMaturity Scale. In O. K. Buros (Ed.), Mental measurementsyearbook (8th ed., pp. 299–301). Highland Park, NJ: GryphonPress.
Kaufman, A. S., & Kaufman, N. L. (1983). K–ABC: KaufmanAssessment Battery for Children. Circle Pines, MN: AmericanGuidance Service.
Keith, T. Z. (1985). Questioning the K–ABC: What does itmeasure? School Psychology Review, 14, 9–20.
Keith, T. Z. (2001). Review of the Wechsler Abbreviated Scaleof Intelligence. In B. S. Plake & J. C. Impara (Eds.), Mentalmeasurements yearbook (14th ed., pp. 1329–1331). Lincoln,NE: Buros Institute of Mental Measurements.
Linkskog, C. O., & Smith, J. V. (2001). Review of the WechslerAbbreviated Scale of Intelligence. In B. S. Plake & J. C.Impara (Eds.), Mental measurements yearbook (14th ed., pp.1331–1332). Lincoln, NE: Buros Institute of MentalMeasurements.
Marco, G. L. (2001). Review of the Leiter InternationalPerformance Scale—Revised. In B. S. Plake & J. C. Impara(Eds.), Mental measurements yearbook (14th ed., pp. 683–687). Lincoln, NE: Buros Institute of Mental Measurements.
McCallum, R. S. (2003). Handbook of nonverbal assessment.New York: Kluwer Academic/Plenum.
McCallum, R. S., Bracken, B., & Wasserman, J. (2001).Essentials of nonverbal assessment. New York: Wiley.
McCauley, R. J. (2001). Assessment of language disorders inchildren. Mahwah, NJ: Erlbaum.
McCauley, R. J., & Swisher, L. (1984a). Psychometric reviewof language and articulation tests for preschool children.Journal of Speech and Hearing Disorders, 49, 34–42.
McCauley, R. J., & Swisher, L. (1984b) Use and misuse of
norm-referenced tests in clinical assessment: A hypotheticalcase. Journal of Speech and Hearing Disorders, 49, 338–348.
McDermott, P. A., & Glutting, J. J. (1997). Informing stylisticlearning behavior, disposition, and achievement throughability subtests—Or, more illusions of meaning? SchoolPsychology Review, 26, 163–175.
McFadden, T. U. (1996). Creating language impairments intypically achieving children: The pitfalls of “normal”normative sampling. Language, Speech, and Hearing Servicesin Schools, 27, 3–9.
McGrew, K. S., & Flanagan, D. P. (1998). The intelligence testdesk reference: Gf–Gc cross-battery assessment. Boston:Allyn & Bacon.
Mehrens, W. A. (1984). A critical analysis of the psychometricproperties of the K–ABC. Journal of Special Education, 18,297–310.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educationalmeasurement (3rd ed., pp. 13–109). New York: AmericanCouncil on Education and Macmillan.
Naglieri, J. A. (2003). Naglieri Nonverbal Ability Test. SanAntonio, TX: Psychological Corporation.
Naglieri, J. A., & Das, J. P. (1997). Cognitive AssessmentSystem. Itasca, IL: Riverside.
Neisser, U., Boodoo, G., Bouchard, T. J., Boykin, A. W.,Brody, N., Ceci, S. J., et al. (1996). Intelligence: Knowns andunknowns. American Psychologist, 51, 77–101.
Page, E. B. (1985). Review of the Kaufman Assessment Batteryfor Children. In J. V. Mitchell (Ed.), Mental measurementsyearbook (9th ed., pp. 773–777). Lincoln, NE: Buros Instituteof Mental Measurements.
Plante, E. (1998). Criteria for SLI: The Stark and Tallal legacyand beyond. Journal of Speech, Language, and HearingResearch, 41, 951–957.
Platt, L. O., Kamphaus, R. W., Keltgen, J., & Gilliland, F.(1991). An overview and review of the Differential AbilityScales: Initial and current research findings. Journal of SchoolPsychology, 29, 271–277.
Psychological Corporation. (1999). Wechsler Abbreviated Scaleof Intelligence. San Antonio, TX: Author.
Raven, J., Raven, J. C., & Court, J. H. (1998). StandardProgressive Matrices–Parallel form. Oxford, England: OxfordPsychologists Press.
Records, N. L., Tomblin, J. B., & Freese, P. R. (1992). Thequality of life of young adults with histories of specificlanguage impairment. American Journal of Speech-LanguagePathology, 1, 44–53.
Reinehr, R. C. (1992). Review of the Differential Ability Scales.In J. J. Kramer & J. C. Conoley (Eds.), Mental measurementsyearbook (11th ed., pp. 282–283). Lincoln, NE: BurosInstitute of Mental Measurements.
Reynolds, C. R., & Kamphaus, R. W. (2003). ReynoldsIntellectual Assessment Scales. Lutz, FL: PsychologicalAssessment Resources.
Roid, G. H. (2003). Stanford–Binet Intelligence Scale—FifthEdition. Itasca, IL: Riverside.
Roid, G. H., & Miller, L. J. (1997). Leiter InternationalPerformance Scale—Revised. Wood Dale, IL: Stoelting.
Salvia, J., & Ysseldyke, J. E. (1998). Assessment (7th ed.).Boston: Houghton Mifflin.
Sattler, J. M. (2001). Assessment of children: Cognitiveapplications (4th ed.). San Diego, CA: Author.
Stark, R. E., & Tallal, P. (1981). Selection of children withspecific language deficits. Journal of Speech and HearingDisorders, 46, 114–122.
Sternberg, R. J. (1983). Should K come before A, B, and C? Areview of the Kaufman Assessment Battery for Children
290 American Journal of Speech-Language Pathology • Vol. 13 • 275–290 • November 2004
(K–ABC). Contemporary Education Review, 2, 199–207.Sternberg, R. J. (1984). The Kaufman Assessment Battery for
Children: An information-processing analysis and critique.Journal of Special Education, 18, 269–279.
Stinnett, T. A. (2001). Review of the Leiter InternationalPerformance Scale–Revised. In B. S. Plake & J. C. Impara(Eds.), Mental measurements yearbook (14th ed., pp. 687–692). Lincoln, NE: Buros Institute of Mental Measurements.
Stinnett, T. A. (2003). Review of the Wide Range IntelligenceTest. In B. S. Plake, J. C. Impara, & R. A. Spies (Eds.),Mental measurements yearbook (15th ed., pp. 1011–1015).Lincoln, NE: Buros Institute of Mental Measurements.
Tomblin, J. B., Zhang, X., Buckwalter, P., & Catts, H. (2000).The association of reading disability, behavioral disorders, andlanguage impairment among second-grade children. Journalof Child Psychology and Psychiatry and Allied Disciplines,41, 473–482.
U.S. Bureau of the Census. (2001). Census 2000 Summary File1—United States [Machine-readable computer file]. Washing-ton, DC: Author.
Van Lingen, G. (1998). Review of the Comprehensive Test ofNonverbal Intelligence. In J. C. Impara & B. S. Plake (Eds.),The mental measurements yearbook (13th ed., pp. 312–314).Lincoln, NE: Buros Institute of Mental Measurements.
Wechsler, D. (2002). Wechsler Preschool and Primary Intelli-gence Scale—Third Edition. San Antonio, TX: PsychologicalCorporation.
Wechsler, D. (2003). Wechsler Intelligence Scale for Children—Fourth Edition. San Antonio, TX: Psychological Corporation.
Whitmire, K. (2000). Cognitive referencing and discrepancyformulae: Comments from ASHA resources. ASHA SpecialInterest Division 1, Language Learning and Education, 7,13–16.
Widaman, K. F. (2003). Review of the Wide Range IntelligenceTest. In B. S. Plake, J. C. Impara, & R. A. Spies (Eds.),Mental measurements yearbook (15th ed., pp. 1015–1017).Lincoln, NE: Buros Institute of Mental Measurements.
Woodcock, R. W., McGrew, K. S., & Mather, N. (2001).Woodcock–Johnson Tests of Cognitive Abilities—III. Itasca,IL: Riverside.
Young, E. L., & Assing, R. (2000). Review of the UniversalNonverbal Intelligence Test. Journal of PsychoeducationalAssessment, 18, 280–288.
Received April 5, 2004Revision received June 30, 2004Accepted October 7, 2004DOI: 10.1044/1058-0360(2004/029)
Contact author: Laura S. DeThorne, PhD, Pennsylvania StateUniversity, 110 Moore Building, University Park, PA 16802.E-mail: [email protected]