Interpreting Intelligence Tests from Contemporary Gf-Gc Theory

32
Journal of School Psychology, Vol. 36, No. 2, pp. 151–182, 1998 Copyright 1998 Society for the Study of School Psychology Printed in the USA 0022-4405/98 $19.00 1 .00 PII S0022-4405(98)00003-X Interpreting Intelligence Tests from Contemporary Gf-Gc Theory: Joint Confirmatory Factor Analysis of the WJ-R and KAIT in a Non-White Sample Dawn P. Flanagan St. John’s University Kevin S. McGrew St. Cloud State University In the present study, the correlations of test scores between the Woodcock-Johnson- Revised (WJ-R) and the Kaufman Adolescent and Adult Intelligence Test (KAIT) were factor analyzed in order to test the replicability of the contemporary Horn- Cattell Gf-Gc model in a non-White sample and to gain a more complete under- standing of the factorial structure of the KAIT. The empirically supported Gf-Gc theoretical model underlying the WJ-R was used as the criterion against which to evaluate the cognitive abilities that are measured by the KAIT. Participants were 114 6th-, 7th-, and 8th-grade students ranging in age from 10 years, 11 months to 15 years, 11 months. Confirmatory factor analyses were used to evaluate and com- pare eight a priori factor models and one post-hoc factor model. A Gf-Gc nine-factor model was the most plausible a priori model fit of the WJ-R/KAIT data, a finding that extends the replicability of the Gf-Gc model to a non-White sample. The facto- rial structure of the KAIT put forward by its authors (i.e., a two-factor Gf-Gc model) was not supported. It appears that the KAIT measures Glr or long-term retrieval (associative memory) and Gsm or short-term memory (memory span) in addition to fluid and crystallized abilities. These results provide support for use of the Gf- Gc theory in a non-White sample and interpreting the KAIT from contemporary Gf-Gc theory rather than a two-factor model. 1998 Society for the Study of School Psychology. Published by Elsevier Science Ltd Keywords: Gf-Gc theory, Intelligence, Multicultural, KAIT, WJ-R. This past decade has seen a marked increase in instruments and techniques for assessing intellectual functioning, such as the Kaufman Adolescent and Adult Intelligence Test (KAIT; Kaufman & Kaufman, 1993) and the Wood- cock-Johnson-Revised, Tests of Cognitive Ability (WJ-R COG; Woodcock & Johnson, 1989), as well as new and revised theories of intelligence (see Flanagan, Genshaft, & Harrison, 1997, for an overview). Despite this recent Received May 8, 1995; accepted September 19, 1996. Address correspondence and reprint requests to Dawn P. Flanagan, St. John’s University, Department of Psychology, 8000 Utopia Parkway, Jamaica, NY 11439. E-mail: flanagad@ stjohns.edu 151

Transcript of Interpreting Intelligence Tests from Contemporary Gf-Gc Theory

Journal of School Psychology, Vol. 36, No. 2, pp. 151–182, 1998Copyright 1998 Society for the Study of School Psychology

Printed in the USA0022-4405/98 $19.00 1 .00

PII S0022-4405(98)00003-X

Interpreting Intelligence Tests from ContemporaryGf-Gc Theory: Joint Confirmatory Factor Analysis

of the WJ-R and KAIT in a Non-White Sample

Dawn P. FlanaganSt. John’s University

Kevin S. McGrewSt. Cloud State University

In the present study, the correlations of test scores between the Woodcock-Johnson-Revised (WJ-R) and the Kaufman Adolescent and Adult Intelligence Test (KAIT)were factor analyzed in order to test the replicability of the contemporary Horn-Cattell Gf-Gc model in a non-White sample and to gain a more complete under-standing of the factorial structure of the KAIT. The empirically supported Gf-Gctheoretical model underlying the WJ-R was used as the criterion against which toevaluate the cognitive abilities that are measured by the KAIT. Participants were114 6th-, 7th-, and 8th-grade students ranging in age from 10 years, 11 months to15 years, 11 months. Confirmatory factor analyses were used to evaluate and com-pare eight a priori factor models and one post-hoc factor model. A Gf-Gc nine-factormodel was the most plausible a priori model fit of the WJ-R/KAIT data, a findingthat extends the replicability of the Gf-Gc model to a non-White sample. The facto-rial structure of the KAIT put forward by its authors (i.e., a two-factor Gf-Gc model)was not supported. It appears that the KAIT measures Glr or long-term retrieval(associative memory) and Gsm or short-term memory (memory span) in additionto fluid and crystallized abilities. These results provide support for use of the Gf-Gc theory in a non-White sample and interpreting the KAIT from contemporaryGf-Gc theory rather than a two-factor model. 1998 Society for the Study of SchoolPsychology. Published by Elsevier Science Ltd

Keywords: Gf-Gc theory, Intelligence, Multicultural, KAIT, WJ-R.

This past decade has seen a marked increase in instruments and techniquesfor assessing intellectual functioning, such as the Kaufman Adolescent andAdult Intelligence Test (KAIT; Kaufman & Kaufman, 1993) and the Wood-cock-Johnson-Revised, Tests of Cognitive Ability (WJ-R COG; Woodcock &Johnson, 1989), as well as new and revised theories of intelligence (seeFlanagan, Genshaft, & Harrison, 1997, for an overview). Despite this recent

Received May 8, 1995; accepted September 19, 1996.Address correspondence and reprint requests to Dawn P. Flanagan, St. John’s University,

Department of Psychology, 8000 Utopia Parkway, Jamaica, NY 11439. E-mail: [email protected]

151

152 Journal of School Psychology

surge of intelligence batteries, practitioners rely predominantly on theWechsler Scales (Wechsler, 1981, 1989, 1991) to assess cognitive function-ing (Harrison, Kaufman, Hickman, & Kaufman, 1988; Stinnett, Havey, &Oehler-Stinnett, 1994; Wilson & Reschly, 1996). Therefore, most prac-titioners use intelligence batteries that conceptualize intelligence as a gen-eral ability (i.e., a full scale IQ ) under which either dichotomous abilities(i.e., verbal and performance) or four different abilities (i.e., verbal com-prehension, perceptual organization, freedom from distractibility, and pro-cessing speed) are subsumed.

Although much progress has occurred in the development and revisionof theories of intelligence in recent years, the psychometric approach is themost widely used and empirically supported (Neisser et al., 1996). Withinthe psychometric approach, differences of opinion exist on the conceptual-ization of intelligence as a single (i.e., g) or multidimensional construct.For example, using the norm data of the Wechsler Intelligence Scale forChildren-Third Edition (WISC-III; Wechsler, 1991), Macmann and Bar-nett’s (1994) analyses supported a single g model while Keith’s (1994) anal-yses supported a hierarchical model (i.e., four first-order factors and a sec-ond-order g factor). The differences between the Macmann-Barnett andKeith WISC-III models reflects a lack of consensus on the structure of intel-ligence, even when the same data are factor analyzed by different research-ers. In addition, when the Wechsler subtests were joint factor analyzed witha broad range of cognitive ability tests, more than four factors were found tounderlie this instrument (Woodcock, 1990), suggesting that psychometricconceptualizations of intelligence need to be based on studies that includea broader range of tests than those comprising a single test battery (seealso Carroll, 1993a). These type of studies are best conceived from existingpsychometric theories of intelligence that are supported by structural evi-dence as well as nonfactor analytic evidence.

Reviews of the extant factor analytic research over the past 50 to 60 yearson intelligence (e.g., Carroll, 1983, 1989, 1993a, 1997; Gustafson, 1984,1988; Horn, 1988, 1991, 1994; Lohman, 1989; Snow, 1986) suggest that thescientific evidence does not support either a single general intelligencemodel or dichotomous models (e.g., verbal/nonverbal, fluid/crystallized,simultaneous/sequential). These reviews have tended to converge on amodel of multiple human cognitive abilities. In particular, Horn and Cat-tell’s Gf-Gc theory (Cattell, 1941; Horn, 1985, 1988, 1991, 1994) and Car-roll’s (1993a, 1997) hierarchical three-stratum theory are two of the mostprominent empirically derived theories of multiple cognitive abilities to date.

Building on the work of Cattell, Horn’s program of Gf-Gc research hasidentified nine broad abilities including Fluid Reasoning (Gf ), Accultura-tion Knowledge or Crystallized Intelligence (Gc), Short-Term Apprehen-sion-Retention (Gsm), Visual Processing (Gv), Auditory Processing (Ga),Fluency of Retrieval from Long-Term Storage (Glr), Processing Speed

Flanagan and McGrew 153

(Gs), Correct Decision Speed (CDS ), and Quantitative Knowledge (Gq)(Horn, 1994). In addition to these abilities, Woodcock (1993, in press) hasprovided preliminary support for a broad Reading-Writing factor (Grw).Similarly, Carroll’s (1993a) seminal work has identified eight broad cogni-tive ability factors (i.e., Fluid Intelligence, Crystallized Intelligence, GeneralMemory and Learning, Broad Visual Perception, Broad Auditory Percep-tion, Broad Retrieval Ability, Broad Cognitive Speediness, ProcessingSpeed/Reaction Time Decision Speed). The most salient differences be-tween the theoretical frameworks of Carroll and Horn are that Carroll’smodel includes a higher-order general intelligence factor (g) (referred toas a stratum III ability) and does not specify a quantitative ability factor (Gq).Both Horn’s and Carroll’s models include a set of correlated second-orderbroad cognitive factors that Carroll refers to as stratum II abilities whichsubsume a set of correlated first-order factors that are referred to as narrow,stratum I abilities. For example, the broad Gc factor represents an individu-al’s range and depth of knowledge of a culture, including the ability tocommunicate and reason with previously learned procedures. The narrowabilities that comprise this factor may include tests of language develop-ment, lexical knowledge, listening ability and general information to namea few. Despite the differences between Horn’s and Carroll’s models, theGf-Gc factors identified by these scholars are very similar. Ten Gf-Gc broadcognitive abilities are described in Table 1.

In addition to this factor analytic (structural) evidence for the Gf-Gc con-ception of intelligence, support for this model of multiple abilities is foundalso in the form of (a) developmental evidence—changes in abilities acrossage; (b) neurocognitive evidence—relationship to indicators of physiologi-cal and neurological functioning; (c) achievement evidence—predictionsof academic performance and occupational achievement; and (d) heritabil-ity evidence—relationships among persons who are related biologically todiffering degrees (see Carroll, 1993a; Horn, 1994; Horn & Noll, 1997, fora review). Moreover, studies have shown that the Gf-Gc structure of in-telligence is invariant across the lifespan (e.g., Bickley, Keith, & Wolfle,1995) and across ethnic and cultural groups (e.g., Carroll, 1993a), al-though a limited number of studies have been conducted with regard tothe latter.

In general, Gf-Gc models are based on a more thorough network of valid-ity evidence than other contemporary multidimensional ability models ofintelligence. In a review and comparison of Gf-Gc theory, Gardner’s Theoryof Multiple Intelligences and Sternberg’s Triarchic Theory, Messick (1992)criticized the latter two theories for their selective attention to certain formsof validity evidence and the offering of new sets of rules for evaluating thisevidence. Messick concluded that of the contemporary multiple humancognitive ability theories, Gf-Gc theory has the strongest network of validityevidence, and, should be used as a framework from which to evaluate the

154 Journal of School Psychology

Table 1Ten Gf-Gc Broad Cognitive Factor Definitions

Gf-Gc Factor Gf-Gc Symbol Definition

Fluid Reasoning Gf Ability to reason, form concepts, and problemsolve, using novel information and/or pro-cedures

Crystallized Intelligence Gc Measures an individual’s breadth and depth ofgeneral knowledge and knowledge of a cul-ture, including verbal communication andreasoning with previously learned proce-dures

Visual Processing Gv Ability to analyze and synthesize visual infor-mation

Auditory Processing Ga Ability to analyze and synthesize auditory infor-mation

Processing Speed Gs Ability to quickly perform automatic cognitivetasks, particularly when under pressure tomaintain focused concentration

Short-Term Memory Gsm Ability to temporarily hold information inimmediate awareness and then use it withina few seconds

Long-Term Retrieval Glr Ability to store information and retrieve itlater through association

Quantitative Knowledge Gq Ability to comprehend quantitative conceptsand relationships and to manipulate numeri-cal symbols

Correct Decision Speed CDS Quickness in providing correct answers to avariety of moderately difficult problems incomprehension, reasoning, and problemsolving

Reading-Writing Grw Ability to read and write, including facility inbasic reading and writing skills and theskills necessary for comprehension/expres-sion (currently not well defined in the litera-ture)

theories of Gardner and Sternberg. Given the breadth of empirical supportfor the Gf-Gc structure of intelligence, models based on Gf-Gc theory (i.e.,the Horn-Cattell and Carroll models) appear to provide some of the mostuseful frameworks for designing and evaluating intelligence tests (Carroll,1997; Flanagan & McGrew, 1995a, 1997; McGrew, 1997; Woodcock, 1990).

IMPORTANCE OF JOINT FACTOR ANALYSES

The construct validity of several major intelligence test batteries has beenquestioned recently by researchers (see Flanagan & McGrew, 1995b;McGrew & Flanagan, 1998 for a summary). One ‘‘problem’’ with the con-struct validity evidence typically reported in intelligence test manuals isthat the factor analytic evidence usually is based on only the subtests thatare included in the test battery (e.g., the factor analyses reported in the

Flanagan and McGrew 155

WISC-III manual are restricted to the 13 WISC-III subtests). The inclusionof only the subtests within a test battery in factor analytic studies likely re-sults in an insufficient breadth of cognitive measures (or markers) to allowfor a complete understanding of the general, broad, and narrow abilitiesthat are measured by the battery (Carroll, 1993b; Keith & Witta, 1997;Woodcock, 1990). If factor analytic studies are to be used to identify thecognitive constructs measured by a test battery, then studies should be de-signed to represent the full range of known human cognitive abilities (e.g.,eight or nine broad Gf-Gc abilities identified by Carroll and Horn). In addi-tion, these studies should include at least two or three relatively pure mark-ers (i.e., strong or moderate measures but not mixed measures) for eachof the abilities represented so that the abilities measured by the individualcognitive tests can be identified clearly (Woodcock, 1990).

Among our major intelligence test batteries, the Woodcock-JohnsonPsycho-Educational Battery-Revised (WJ-R; Woodcock & Johnson, 1989)appears to measure the greatest range of human cognitive abilities. Aneight-factor Gf-Gc model provided the best overall fit of the WJ-R data whencompared to a single factor (first-order g) model, a Verbal/Nonverbalmodel, and a hierarchical g model (McGrew, Werder, & Woodcock, 1991).The results of these analyses revealed that 16 of the WJ-R tests can be usedas relatively pure markers for eight cognitive abilities (i.e., Gf, Gc, Gq, Gsm,Glr, Ga, Gv, Gs) in joint factor analyses with other intelligence test batteriesto infer common Gf-Gc factor concepts across test batteries.

Woodcock (1990) and McGhee (1993) used the Gf-Gc model underlyingthe WJ-R as a validated framework against which to evaluate the cognitiveabilities measured by most of the major intelligence test batteries. Usingthe 16 WJ-R tests as markers for eight Gf-Gc factors, Woodcock (1990) con-ducted a series of joint confirmatory factor analyses of the Kaufman As-sessment Battery for Children (K-ABC; Kaufman & Kaufman, 1983), theStanford-Binet Intelligence Scale: Fourth Edition (SB:FE; Thorndike, Ha-gen, & Sattler, 1986) and the Wechsler Scales, while McGhee (1993) con-ducted confirmatory factor analyses of the DAS and Detroit Tests of Learn-ing Aptitude-3 (DTLA-3; Hammill & Bryant, 1991). When a criterion oftwo or three markers is used to identify Gf-Gc factors, it is evident fromthese analyses that most intelligence batteries include only 3 or 4 Gf-Gcfactors that can be interpreted confidently (see Table 2). Also, some intelli-gence test batteries contain only one relatively pure measure of one ormore Gf-Gc factors (e.g., Digit Span is the only indicator of Gsm in theWechsler Scales), a situation that limits generalizations about an individu-al’s performance in a Gf-Gc cognitive domain without additional support-ing data.

Inspection of Table 2 also shows that, with the exception of the WJ-R,intelligence test batteries have few or no tests that measure Long-TermRetrieval (Glr), Processing Speed (Gs), Auditory Processing (Ga), and

Tab

le2

Gf-G

cFa

ctor

Cla

ssifi

cati

ons

ofFi

veM

ajor

Inte

llige

nce

Tes

tB

atte

ries

Gf-G

cFa

ctor

WJ-R

Wec

hsl

er’s

SB:I

VD

ASb

K-A

BC

b

Lon

g-T

erm

Ret

riev

al(G

lr)

Mem

ory

for

Nam

es—

——

—V

is-A

udL

earn

ing

Del

ayed

Rec

all-M

ND

elay

edR

ecal

l-VA

L

Shor

t-Ter

mM

emor

y(G

sm)

Mem

ory

for

Wor

dsD

igit

Span

Mem

ory

for

Dig

its

Rec

all

ofD

esig

nsN

umbe

rR

ecal

lM

emor

yfo

rSe

nt.

Mem

ory

for

Obj

ects

Wor

dO

rder

(Gca )

Mem

.fo

rSe

nt.

(Gca )

Har

dM

ove

(Gq

a )N

umbe

rsR

ever

sed

Bea

dM

emor

y(G

va )(G

fa )

Proc

essi

ng

Spee

d(G

s)V

isua

lM

atch

ing

Cod

ing/

Dig

it—

——

Cro

ssO

utSy

mbo

lSy

mbo

lSe

arch

Aud

itor

yPr

oces

sin

g(G

a)In

com

plet

eW

ords

——

——

Soun

dB

lend

ing

Soun

dPa

tter

ns

(Gfa )

Vis

ual

Proc

essi

ng

(Gv)

Vis

ual

Clo

sure

Blo

ckD

esig

nP

atte

rnA

naly

sis

Patt

ern

Con

stru

ctio

nT

rian

gles

Pict

ure

Rec

ogn

itio

nO

bjec

tA

ssem

bly

Cop

yin

gG

esta

ltC

losu

reSp

atia

lR

elat

ion

s(G

fa )M

azes

Pape

rFo

ld.

(Gq

a )Sp

atia

lM

emPi

ctur

eC

omp

(Gca )

Mat

.A

nal

og.

(Gfa )

Pict

ure

Arr

ang

Phot

oSe

ries

(Gfa )

(Gca )

156

Com

preh

ensi

on-

Pic

ture

Voc

abul

ary

Info

rmat

ion

Voc

abul

ary

Wor

dD

efini

tion

sFa

ces

&P

lace

sK

now

ledg

e(G

c)O

ral

Voc

abul

ary

Sim

ilari

ties

Ver

bal

Rel

atio

nsSi

mila

riti

esR

iddl

esL

iste

nin

gC

omp.

Voc

abul

ary

Com

preh

ensi

onE

xpre

ssiv

eV

oc.

(Gsm

a )C

ompr

ehen

sion

Abs

urdi

ties

Flui

dR

easo

nin

g(G

f)A

naly

sis-

Synt

hesi

s—

Mat

rice

sM

atri

ces

—C

once

ptFo

rmat

ion

Seq-

Qua

nt

Rea

son

Ver

bal

An

alog

ies

(Gq

a )(G

ca )

Qua

nti

tati

veA

bilit

y(G

q)C

alcu

lati

onA

rith

met

icQ

uant

itat

ive

—A

rith

met

icA

pplie

dP

robl

ems

Num

ber

Seri

esE

quat

ion

Bui

ldin

g

Not

e.W

J-R5

Woo

dcoc

k-Jo

hn

son

-Rev

ised

;SB

:IV

5St

anfo

rdB

inet

Inte

llige

nce

Scal

e—Fo

urth

Edi

tion

;D

AS

5D

iffe

ren

tial

Abi

lity

Scal

es;

K-A

BC

5K

aufm

anA

sses

smen

tB

atte

ryfo

rC

hild

ren

.Str

ong

mea

sure

sof

Gf-G

cfa

ctor

sar

ere

port

edin

bold

-face

dty

pe;m

easu

res

not

inbo

ldty

pear

em

oder

ate

orm

ixed

indi

cato

rsof

Gf-G

cab

iliti

es.P

rim

ary

mea

sure

sof

Gf-G

cfa

ctor

sfo

rth

eW

J-R,

Wec

hsl

er’s

,SB

:IV

,an

dK

-AB

Car

eba

sed

onth

eem

piri

cal

anal

yses

ofW

oodc

ock

(199

0);

Cla

ssifi

cati

ons

for

the

DA

Sar

eba

sed

onth

eem

piri

cal

anal

ysis

ofM

cGh

ee(1

993)

.Fo

rad

diti

onal

info

rmat

ion

onG

f-Gc

fact

orcl

assi

fica

tion

sof

maj

orin

telli

gen

cete

stba

tter

ies

see

McG

rew

(199

4,19

97).

Info

rmat

ion

for

this

tabl

ew

asad

apte

dfr

om‘‘

AC

ross

-Bat

tery

App

roac

hto

Ass

essi

ng

and

Inte

rpre

tin

gC

ogn

itiv

eA

bilit

ies:

Nar

row

ing

the

Gap

Bet

wee

nPr

acti

cean

dC

ogn

itiv

eSc

ien

ce,’’

byD

.P.F

lan

agan

&K

.S.M

cGre

w(1

997)

,in

D.P

.Fla

nag

an,J

.L.G

ensh

aft,

&P.

L.H

arri

son

(Eds

.),C

onte

mpo

rary

Inte

llect

ualA

sses

smen

t:T

heor

ies,

Tes

tsan

dIs

sues

.C

opyr

igh

t19

97by

Gui

lfor

dPu

blis

hin

gC

ompa

ny.

aSe

con

dary

fact

orlo

adin

g.bO

nly

asu

bset

ofD

AS

and

K-A

BC

subt

ests

wer

ejo

intf

acto

ran

alyz

edby

McG

hee

(199

3)an

dW

oodc

ock

(199

0),r

espe

ctiv

ely.

Th

eref

ore,

only

the

subt

ests

that

wer

ein

clud

edin

thes

ean

alys

esar

ere

port

edin

this

tabl

e.T

he

Gf-G

cfa

ctor

clas

sifi

cati

ons

ofal

lDA

San

dK

-AB

Csu

btes

tsfo

llow

ing

alo

gica

ltas

kan

alys

isan

dex

pert

con

sen

sus

are

repo

rted

inM

cGre

w(1

997)

and

McG

rew

and

Flan

agan

(199

8).

157

158 Journal of School Psychology

Fluid Intelligence (Gf ). The omission of these tests on cognitive batteriesis an important one since many of these abilities have been shown to predictsignificantly reading, mathematics, and writing achievement across the life-span (e.g., McGrew, 1993; McGrew, Flanagan, Keith, & Vanderwood, 1997;McGrew & Hessler, 1995; McGrew & Knopik, 1993). Depending on theinstrument used, most current intelligence tests appear to measure mainlyVisual Processing (Gv), Crystallized Intelligence (Gc), and Short-TermMemory (Gsm). The results and implications of these and other joint factoranalyses have been summarized by McGrew (1994, 1997) and McGrew andFlanagan (1998).

THE THEORETICAL FOUNDATION OF THE KAIT

The KAIT is a new intelligence test battery that was developed from Hornand Cattell’s (1966, 1967) original dichotomous Gf-Gc conception of cogni-tive functioning (i.e., a two-factor, fluid-crystallized model) (Kaufman &Kaufman, 1993). The KAIT appears to operationalize an older theoreticalconception of intelligence and, therefore, it is difficult to interpret froma contemporary Gf-Gc framework. Studies reported in the KAIT manualprovide support for the Gf-Gc dichotomous structure underlying this in-strument (Kaufman & Kaufman, 1993), which the authors interpreted asFluid (Gf ) and Crystallized (Gc) ability. However, this validity evidencemust be viewed cautiously, since the KAIT was not correlated or factor ana-lyzed with validated measures of fluid intelligence (Flanagan, 1995; Flana-gan, Alfonso, & Flanagan, 1994; Keith, 1995). Rather, the KAIT joint factoranalysis studies only included additional tests from the Wechsler batteries(Kaufman & Kaufman, 1993) that do not contain strong measures offluid intelligence (Carroll, 1993a; McGrew & Flanagan, 1996; Woodcock,1990). Factor analyses that include a full range of cognitive abilities be-lieved to be necessary to perform the various KAIT tasks, such as strongmeasures of Gf, are needed to test hypotheses regarding the Gf-Gc modelunderlying this instrument. Through this type of analysis it will be possibleto develop a more complete understanding of the factorial structure of theKAIT (Flanagan, 1995; Flanagan et al., 1994; Keith, 1995).

Purpose

The purpose of the present study is to factor analyze the correlations oftest scores from the WJ-R and KAIT in order to test the replicability of thecontemporary Horn-Cattell Gf-Gc model in a non-White sample and togain a more complete understanding of the Gf-Gc factorial structure of theKAIT. The present factor analyses will include a broad range of cognitiveabilities, including two strong measures of Gf. Therefore, this study is thefirst to examine directly the construct validity of the KAIT’s Fluid scale in

Flanagan and McGrew 159

particular. Given the increasing interest and scientific value in interpretingintelligence tests from broad conceptual and theoretical frameworks thatare grounded in contemporary theory and research (e.g., Carroll, 1993a,1993b, 1997; Elliott, 1990; Flanagan & McGrew, 1995a, 1995b; Keith &Witta, 1997; McGrew, 1994, 1997; McGrew & Flanagan, 1998; Woodcock,1990, in press), results of this analysis may provide preliminary support forthe Gf-Gc structure of intelligence in a non-white sample and the interpre-tation of the KAIT from contemporary Gf-Gc theory.

METHOD

Participants

Participants were 114 students who attended the 6th, 7th, and 8th gradesat a predominantly lower-class parochial school (total enrollment approxi-mately 600 students) in an inner-city community located in the Northeast.These students were selected randomly, from the available pool of returnedparent permission forms (i.e., 132/188 or 70% of returned forms), to par-ticipate in the study. An attempt was made to equalize gender as well asthe number of students from each grade. The final sample consisted of114 students, 56 males and 58 females. An approximately equal numberof these students were from the 6th (n 5 40) and 7th (n 5 43) gradeswhereas students from the 8th grade were slightly underrepresented (n 531). Eighty-five students were African American and 29 students wereHispanic. The sample ranged in age from 10 years, 11 months to 15 years,11 months (M 5 149 months; SD 5 15 months). An examination of themeans and standard deviations presented in Table 3 demonstrates that thepresent sample performed within the average range of ability on 26 ofthe 27 cognitive tests administered.

Procedure

The following tests were administered to all participants in counterbal-anced order to control for systematic sources of bias (e.g., response set):Tests 1 to 14 of the WJ-R COG (Extended Scale), tests 22 (Letter-WordIdentification) and 32 (Reading Vocabulary) of the WJ-R Tests of Achieve-ment (WJ-R ACH; Woodcock & Johnson, 1989, 1990), tests 1 to 10 of theKAIT (Extended Battery), and the Object Assembly (OA) subtest of theWechsler Intelligence Scale for Children-Third Edition (Wechsler, 1991).Previous studies have demonstrated that the tests that comprise the VisualProcessing (Gv) Cluster of the WJ-R COG are generally moderate measuresof this construct (e.g., Woodcock, 1990). These same studies demonstratedthat OA is generally a stronger measure of Gv than either of the two Gvmarkers included in the WJ-R COG battery. Therefore, OA was adminis-

160 Journal of School Psychology

Table 3Descriptive Statistics for WJ-R, KAIT, and WISC-III Tests (N 5 114)

Test M SD

WJ-RMemory for Names 99.7 13.3Memory for Sentences 98.4 13.1Visual Matching 97.1 15.8Incomplete Words 90.1 15.5Visual Closure 96.2 17.7Picture Vocabulary 95.6 13.5Analysis-Synthesis 97.4 14.6Visual-Auditory Learning 100.7 14.6Memory for Words 97.4 13.9Cross Out 96.3 15.2Sound Blending 88.0 15.8Picture Recognition 100.0 19.0Oral Vocabulary 98.0 13.4Concept Formation 98.1 14.3Letter-Word Identification 102.2 14.8Reading Vocabulary 99.2 12.2

KAITDefinitions 9.3 2.7Rebus Learning 9.9 2.9Logical Steps 9.4 2.7Auditory Comprehension 9.2 2.4Mystery Codes 9.7 2.7Double Meanings 9.1 2.6Rebus Delayed Recall 9.8 2.6Auditory Delayed Recall 9.3 2.3Memory for Block Designs 9.9 3.3Famous Faces 9.8 2.8

WISC-IIIObject Assembly 8.9 3.2

Note. WJ-R 5 Woodcock-Johnson-Revised; KAIT 5 Kaufman Adolescent and Adult In-telligence Test; WISC-III 5 Wechsler Intelligence Scale for Children-Third Edition.KAIT and WISC-III test scores are on a scale with a mean of 10 and standard deviationof 3. The WJ-R test scores are on a scale with a mean of 100 and a standard deviationof 15.

tered to ensure adequate representation of the Gv factor in the presentstudy. All tests were administered by upper level school and clinical psychol-ogy graduate students who were trained in the administration and scor-ing procedures of these instruments. Test scaled scores were obtainedusing the norms of the respective test batteries: WJ-R COG tests (M 5 100,SD 5 15), KAIT subtests (M 5 10, SD 5 3), WISC-III subtests (M 5 10,SD 5 3).

Instruments

The WJ-R COG is appropriate for use with individuals aged 24 monthsthrough 951 years. It contains 21 tests of cognitive ability that are divided

Flanagan and McGrew 161

into standard and supplemental batteries. The standard battery containsseven tests, one measure for each of the seven Gf-Gc factors (i.e., Gf, Gc,Gv, Ga, Gs, Gsm, Glr) assessed by the WJ-R COG. The supplemental batterycontains 14 tests. Of these, the first seven (i.e., tests 8–14) provide comple-mentary measures of the above cognitive factors and constitute the addi-tional tests that are necessary to calculate the seven Gf-Gc Cognitive AbilityClusters. The remaining seven tests on the WJ-R COG supplemental battery(i.e., tests 15–21) provide mixed measures of Gf-Gc abilities and may beadministered to derive additional information about an individual’s cogni-tive strengths and weaknesses. Tests 1 to 14 were used in the present study.The WJ-R COG has strong psychometric properties including standardiza-tion, reliability, and validity (see WJ-R Technical Manual ; McGrew et al.,1991) as well as good support for the use of its tests as markers for eightGf-Gc factors (McGhee, 1993; McGhee & Lieberman, 1993; McGhee, Buck-halt, & Phillips, 1994; McGrew et al., 1991; Woodcock, 1990, in press; Yssel-dyke, 1990).

The KAIT is appropriate for use with individuals aged 11 years through851 years. It is comprised of three intelligence scales called Crystallized,Fluid, and Composite Intelligence that have a mean of 100 and a standarddeviation of 15. The KAIT’s Core Battery contains three Fluid subtests (Log-ical Steps, Mystery Codes, Rebus Learning) and three Crystallized subtests(Definitions, Double Meanings, Auditory Comprehension) that have amean of 10 and standard deviation of 3. The KAIT also has an ExpandedBattery that includes the six subtests from the Core Battery plus four othersubtests, one crystallized subtest (Famous Faces), one fluid subtest (Mem-ory for Block Designs), and two memory subtests (Rebus Delayed Recalland Auditory Delayed Recall). For detailed descriptions of the KAIT sub-tests and their underlying cognitive abilities, the reader is referred to Flana-gan, Genshaft, and Boyce (in press), Kaufman and Horn (1996), Kaufmanand Kaufman (1993, 1997), Keith (1997), and McGrew (1997). The psycho-metric properties of the KAIT regarding standardization and reliability aregood (Flanagan, 1995; Kaufman & Kaufman, 1993; Keith, 1995). In addi-tion, the validity evidence reported in the KAIT manual (Kaufman & Kauf-man, 1993) and by independent researchers (Keith, 1997) provide initialsupport for the theoretical model underlying this instrument. The 10 sub-tests that comprise the KAIT Expanded Battery were used in the presentstudy.

Data Analysis

The latent variable analytic method of confirmatory factor analysis (LISREL;Joreskog & Sorbom, 1993) was used to evaluate and compare eight a priorifactor models and one post-hoc factor model.1 Confirmatory factor meth-

1 The correlation matrix can be obtained by contacting either of the authors.

162 Journal of School Psychology

ods, which are a subset of structural equation modeling (SEM) procedures,were used as they are particularly well suited to the formulation and evalua-tion of theoretical models (e.g., Gf-Gc theory), make important contribu-tions to theory development and construct validation, and permit the studyof the empirical characteristics of measurement instruments (e.g., WJ-R,KAIT) (Keith, 1988; Raykov & Widaman, 1995). The application of thismethodology has increased rapidly in the behavioral, social and educa-tional sciences in recent years. For example, these methods have been usedfrequently to address important theoretical questions about the structureof and relations between constructs in behavioral domains (e.g., Bensen &Bandalos, 1992; Byrne, 1989), and have proven to be very useful for evaluat-ing the constructs that underlie intelligence tests (e.g., Keith, 1988, 1997).

Model Specification. Model specification was based on a review of priorresearch and theory, with the resulting models representing different varia-tions of the nine-factor model presented in Figure 1. As per standard pathanalytic form, the circles in Figure 1 represent the latent factors, the rectan-gles the measures or manifest variables (i.e., the tests), the arrows from thecircles to the rectangles the factor loadings, and the single-headed arrowson the rectangles the residuals (combination of error and unique variance)for the tests. The double-headed arrows between the ovals that representthe latent factor correlations have been omitted for readability purposes.

A priori Expanded Contemporary Gf-Gc (WJ-R) Nine-Factor Model.Four models were specified that were based on contemporary Gf-Gc theory,the theory that served as the foundation for the WJ-R cognitive battery.The Expanded Contemporary Gf-Gc (WJ-R) Nine-Factor Model (Figure 1) is de-scribed in detail since all other models can be understood by makingchanges to this model. The term expanded indicates that this model includesa reading factor and separate short-term auditory and visual memory fac-tors, factors not frequently included in most prior WJ-R Gf-Gc research.

The specification of the Crystallized Intelligence (Gc), Fluid IntelligenceGf ), Memory Span (MS-Gsm), Associative Memory (MA-Glr), PhoneticCoding (PC-Ga), and Perceptual Speed (PS-Gs) factors is consistent withthe extant WJ-R/Gf-Gc research literature. The factor labels used in Figure1 (and throughout the results and discussion sections of this manuscript)are consistent with the stratum I narrow factor definitions provided by Car-roll (1993a). These factor labels differ from the Gf-Gc factor labels com-monly used to describe the WJ-R factors (i.e., Gf, Gc, Glr, Gsm, etc.) as theinterpretation of prior WJ-R confirmatory studies tended to confoundCarroll’s stratum II broad Gf-Gc factors with stratum I narrow factors (Mc-Grew, 1997).

The difference between the order of a factor and the stratum to which it isassigned is an important and often overlooked distinction in factor analytic

Flanagan and McGrew 163

Figure 1. A priori Expanded Contemporary Gf-Gc (WJ-R) Nine-Factor Model.

studies. According to Carroll (1993a), ‘‘The order of a factor refers to thepurely operational level of analysis at which it is found. The stratum of afactor would refer to an absolute measure of its degree of generality overthe domain of cognitive abilities’’ (p. 577). Therefore, the order at whicha factor has been operationally isolated in a given factor analysis is indepen-

164 Journal of School Psychology

dent of the stratum to which it is assigned. In most factor-analytic studieshowever, the order of a factor and the stratum to which it is assigned arethe same (Carroll, 1993a). Given that an understanding of the degree ofgenerality that is represented by a battery of cognitive tests is critical tothe test interpretation process, hereinafter factors will be discussed andinterpreted according to strata rather than order. (For a complete list ofGf-Gc stratum I and II ability definitions see Carroll, 1993a.)

Using the work of McGrew (1997) as a guide, we evaluated the breadthof each factor specified in the current study against the definitions providedby Carroll, and used either stratum II broad Gf-Gc factor notations (e.g.,Gv) or the stratum I factor labels followed by their respective stratum IIlabels (e.g., Closure Speed or CS-Gv), depending on the diversity of testsrepresented by each factor. For example, two factors in Figure 1 (i.e., Gcand Gf ) retain the broad Gf-Gc factor labels because task analysis of diver-sity of indicators defining these two factors (McGrew, 1997) shows thatthese are broader factors more consistent with Carroll’s stratum II Gf-Gcfactors. Some factors that have been labeled as stratum II Gf-Gc factors inprevious studies (e.g., the WJ-R Visual Matching and Cross Out based factorlabeled of Gs in Woodcock, 1990) have been given a stratum I label in thisstudy because the indicators that define these factors are measures of onlyone stratum I or narrow factor (e.g., Perceptual Speed). In order for afactor to retain a stratum II broad Gf-Gc factor label, it must be broad; thatis, comprised of two or more different stratum I narrow abilities. Table 4contains the names and definitions of the stratum I factors that were identi-fied in the present study. (For a complete list of stratum I ability definitions,see Carroll, 1993a and McGrew, 1997.)

A consensus-based classification of the KAIT tests (McGrew, 1997), aswell as Kaufman and Kaufman’s (1993) recommended interpretation, sug-gested that the KAIT Logical Steps and Mystery Codes tests should join theWJ-R Analysis-Synthesis and Concept Formation tests as indicators of abroad Gf factor, and, the KAIT Double Meanings, Definitions, FamousFaces, and Auditory Comprehension tests would be clear indicators of Gc.The KAIT Rebus Learning and Delayed Recall-Rebus Learning Tests areclose analogues of the WJ-R Visual-Auditory Learning test (Kaufman &Kaufman, 1993), and thus were included under the Associative Memory(MA-Glr) factor. The KAIT does not contain auditory processing orspeeded tests that would be indicators of the Phonetic Coding (PC-Ga) orPerceptual Speed (PS-Gs) factors included in the model represented inFigure 1.

The specification of the Visual Memory (MV-Gsm) and Closure Speed(CS-Gv) factors is at variance from most of the prior WJ-R research wherethe Visual Closure and Picture Recognition tests typically formed a weakGv factor. Task analysis of the WJ-R Picture Recognition test (McGrew,1997) and the research of McGhee and colleagues (e.g., McGhee & Lieber-

Flanagan and McGrew 165

Table 4Definitions of the Gf-Gc Stratum I Abilities Identified in the Present Study

Fluid Intelligence/Reasoning (Gf )General Sequential Reasoning (RG): Ability to start with stated rules, premises, orconditions, and to engage in one or more steps to reach a solution to a problemInduction (I): Ability to discover the underlying characteristics (e.g., rule, concept,process, trend, class membership) that governs a problem or a set of materials

Crystallized Intelligence/Knowledge (Gc)Language Development (LD): General development, or the understanding of words,sentences, and paragraphs (not requiring reading), in spoken native language skillsLexical Knowledge (VL): Extent of vocabulary that can be understood in terms of correctword meaningsListening Ability (LS): Ability to listen and comprehend oral communicationsGeneral (verbal) Information (K0): Range of general knowledgeInformation about Culture (K2): Range of cultural knowledge (e.g., music, art)

Short-Term Memory (Gsm)Memory Span (MS): Ability to attend to and immediately recall temporally orderedelements in the correct order after a single presentationVisual Memory (MV): Ability to form and store a mental representation or image of avisual stimulus and then recognize or recall it later

Visual Intelligence/Processing (Gv)Closure Speed (CS): Ability to quickly combine disconnected, vague, or partially obscuredvisual stimuli or patterns into a meaningful whole, without knowing in advance what thepattern is

Auditory Intelligence/Processing (Ga)Phonetic Coding (PC): Ability to process speech sounds, as in identifying, isolating, andblending sounds; phonological awareness

Long-Term Retrieval (Glr)Associative Memory (MA): Ability to recall one part of a previously learned but unrelatedpair of items when the other part is presented (i.e., paired-associative learning)

Processing Speed (Gs)Perceptual Speed (P): Ability to rapidly search for and compare visual symbols presentedside-by-side or separated in a visual field

Note. The definitions in this table were provided by Carroll (1993a) and adapted by McGrew (1997). SeeCarroll (1993a) for an extensive list of the 69 stratum I abilities he has identified and defined in his research.

man, 1993) suggested that when combined with the KAIT Memory forBlock Designs test these two tests might form a Visual Memory (MV-Gsm)factor. Task analysis of the WJ-R Visual Closure test and the well-docu-mented visual-spatial nature of the Wechsler Object Assembly test, resultedin the specification of the Closure Speed (CS-Gv) factor. The combinationof the WJ-R Letter-Word Identification (RD-Reading Decoding) and Read-ing Vocabulary (V-Verbal Language Comprehension) was deemed to forma Reading factor (which is a partial measure of Woodcock’s broad Grw fac-tor). As can be seen in Figure 1, all initial a priori models did not allow forany tests to be indicators of more than one factor. Secondary factor loadingsfor tests were added later through post-hoc model readjustment proce-dures.

Finally, the use of common test stimuli (with the original administrationof the tests) in the two KAIT delayed recall tests resulted in the specification

166 Journal of School Psychology

of correlated error terms between the original and delayed versions of thesetwo pairs of tests (represented by the arrows running to and from the rect-angles for each respective pair of tests). The model presented in Figure 1(as well as all other models) were specified with oblique (correlated) fac-tors, a specification consistent with our current understanding of the inter-correlations between cognitive abilities (Carroll, 1993a).

Other a priori expanded contemporary or contemporary Gf-Gc (WJ-R)models. To test Carroll’s (1993a) inclusion of reading factors under thebroad Gc factor, the indicators of the Reading factor (RD/V-Grw) werecombined with the Gc tests to define a broader Gc factor in the ExpandedContemporary Gf-Gc (WJ-R) Eight-Factor model. The Contemporary Gf-Gc (WJ-R) Seven-Factor model differed from the model summarized in Figure 1 bythe combining of the Gc and Reading (RD/V-Grw) factors into a singlebroad Gc factor, and the combining of the Visual Memory (MV-Gsm) andClosure Speed (CS-Gv) factors into a single broad Gv factor. A second varia-tion of this model (Contemporary Gf-Gc (WJ-R) Eight-Factor) was similar tothe seven-factor version with the exception of the separate Gc and Reading(RD/V-Grw) factors being maintained.

The four other a priori models reflected older conceptualizations of thestructure of intelligence. The Gf-Gc (KAIT) Five-Factor model was specified toreflect Kaufman and Kaufman’s (1993) Gf-Gc dichotomous KAIT structure.The Gc factor was a combination of the Gc and Reading (RD/V-Grw) fac-tors in Figure 1. The Gf factor included all the tests of the Gf, Associa-tive Memory (MA-Glr), Visual Memory (MV-Gv), and Closure Speed (CS-Gv) factors in the model in Figure 1. Since the KAIT does not includemeasures of short-term auditory memory (Memory Span), auditory pro-cessing (Phonetic Coding), or speed of processing speed (PerceptualSpeed), the WJ-R tests of these factors were left as the lone indicators ofthese three factors (it was felt that this was a fairer treatment of the Kauf-man and Kaufman KAIT model, in contrast to trying to force all of theWJ-R tests of these abilities into two broad Gf-Gc factors). A second versionof this model (Gf-Gc (KAIT) Six-Factor) only differed in the specification ofseparate Gc and Reading (RD/V-Grw) factors.

To investigate a simple Gf-Gc dichotomous model, a two-factor model(Original Gf-Gc) was specified that included a Gc factor (the combination ofthe Gc and Reading factors from the model in Figure 1) and a very broad Gffactor (all other KAIT and WJ-R tests). Finally, a one-factor model consis-tent with Spearman’s single g factor model (Spearman g) included all testsas indicators of a single general intelligence factor.

Post-Hoc/Readjusted Models. Inspection of the LISREL modification in-dexes (MI) and expected parameter change values (EPCV) from the fourbest fitting a priori models suggested ways to improve the fit of these models.

Flanagan and McGrew 167

The high MI values identified model parameters that, if added to themodel, would improve the overall model fit by reducing the size of the chi-square value. The EPCV estimates predicted the size and sign of parametersthat might be added to the models. Parameters were considered for addi-tion to models only if their respective MI values were among the highestMI values reported, and if the EPCV estimates were positive. Four addi-tional KAIT indicator-factor parameters were added to the four best fittingmodels, namely the two Contemporary Gf-Gc (WJ-R) and two Expanded Con-temporary Gf-Gc (WJ-R) models (such additions would not have appreciablyimproved the fit of the four poorest fitting models to make such changesworthwhile, see Results section).

These additions were the most consistent ‘‘suggestions’’ made by theLISREL program across model results, and all made substantive sense. Therequirement to listen and retain the stories presented in the KAIT AuditoryComprehension test, a process that requires immediate auditory apprehen-sion and memory, was consistent with the nature of the Memory Span (MS-Gsm) factor, and thus, this test was specified to be an indicator of this factor.The ability to draw upon previously acquired knowledge to help answerthe KAIT Auditory Comprehension-Delayed Recall test items was also con-sistent with the specification of this test as an indicator of Gc. Finally, thespecification of the KAIT Definitions and Double Meanings tests as indica-tors of the Reading (RD/V-Grw) factor is consistent with expert-based taskanalysis (McGrew, 1997) and the test author’s own interpretative materialfor these two tests (Kaufman & Kaufman, 1993).

The presence of a number of relatively large MI values for the TD ortheta delta matrix (a finding that can be interpreted as reflecting the pres-ence of correlated error between two variables or the presence of an un-specified factor) (see Rubio & Gillespie, 1995) suggested that a 10-factormodel should be considered. This model (Post-Hoc Expanded ContemporaryGf-Gc [WJ-R]) was identical to the model in Figure 1 (including the fourKAIT indicator-factor parameter additions previously described), withthe exception of the decomposition of the Gf factor into separate Induc-tion (I-Gf ) and General Sequential Reasoning (RG-Gf ) factors, the twokey first-order factors that define a broad stratum II Gf ability (Carroll,1993a). An additional parameter that was not included in the other models,was a correlated error term between KAIT Definitions and WJ-R Letter-Word Identification tests.

Model Estimation and Evaluation. Model specification was followed bythe estimation of model parameters and fit statistics with the iterative maxi-mum likelihood fitting function in the LISREL computer program (PC/DOS-based Version 8.02; Joreskog & Sorbom, 1993). Multiple fit statisticswere used to evaluate the models (Loehlin, 1987; Tanaka, 1993).

The chi-square statistic for different models can be used to compare differ-

168 Journal of School Psychology

ent models, with lower chi-square values suggesting a relatively ‘‘better’’fit to the data. However, the chi-square statistic is known to be unduly sensi-tive to sample size, and thus, other fit statistics are needed to comparecompeting models. Standardized root mean square residual (rmr) valuesbelow .10 are considered to reflect a good fit (Cole, 1987). Conversely, theGoodness-of-fit index (GFI) and Adjusted Goodness-of-fit index (AGFI), whichare analogous to the multiple and adjusted multiple correlation in regres-sion analyses (Tanaka, 1993), provide normed values between 0 and 1, with1.0 being a perfect fit. The Parsimonious Goodness-of-fit index (PGFI) is froma family of parsimony fit indices that penalize models that have a largenumber of parameters in favor of simpler models (Tanaka, 1993). Giventhat the current study was comparing alternative models, comparisons be-tween the size of the respective model fit statistics, and not the absolutevalue of the fit statistics, was of most importance (Tanaka, 1993). Becausepost hoc readjustment procedures can capitalize on chance associations insample data, the fit statistics from the post-hoc models must be viewed withcaution until the models are cross-validated in an independent sample.

In all but the simplest Spearman g and Original Gf-Gc models, the testloadings on factors defined by only two tests were very high (.90s) and oftenshowed ‘‘Heywood cases’’ (one of the test factor loadings exceeding a valueof 1.0). Such Heywood cases are common with samples less than 100 (thecurrent sample was 114) and when only two indicators are used per factor(Loehlin, 1987; Long, 1983). To deal with this problem, a researcher caneither constrain or fix the 1.01 factor loading parameters to 1.0, or, as wasthe choice in the current study, the factor loadings for the two-indicatorfactors can be constrained to be equal.

RESULTS

The goodness-of-fit statistics in Table 5 reveal a clear difference betweenthe fit of the older (i.e., Spearman g, Original Gf-Gc, Gf-Gc [KAIT] Five-Factor, Gf-Gc [KAIT] Six-Factor) and contemporary (the four contemporaryand expanded contemporary Gf-Gc [WJ-R]) a priori models. These resultssuggest that older models of intelligence that posit a single general intelli-gence factor or that maintain a focus on the Gf-Gc dichotomy (even whenadditional Gsm, Ga, and Gs factors are included), are relatively poor repre-sentations of the multidimensionality of the structure of the combinedKAIT and WJ-R tests in this study. Therefore, these four models wheredropped from additional consideration. From these data, it is clear thatsome variation of the more complex contemporary or expanded contempo-rary Gf-Gc models is necessary to fully account for the complexity of humanintelligence in the current sample.

Although there are some differences between the fit statistics for the apriori contemporary Gf-Gc (WJ-R) and two expanded contemporary Gf-Gc

Flanagan and McGrew 169

Table 5Goodness-of-Fit Statistics for WJ-R/KAIT Joint Confirmatory Factor Analyses

Model χ2 df rmr GFI AGFI PGFI

Initial a priori modelsSpearman g 2419.1 322 .16 .42 .32 .36Original Gf-Gc 2110.6 321 .19 .46 .37 .39Gf-Gc (KAIT) Five-Factor 1609.2 311 .21 .51 .41 .42Gf-Gc (KAIT) Six-Factor 1553.0 307 .21 .53 .42 .43Contemporary Gf-Gc (WJ-R) Seven-Factor 897.0 301 .11 .69 .62 .55Contemporary Gf-Gc (WJ-R) Eight-Factor 833.2 294 .10 .71 .63 .55Expanded Contemporary Gf-Gc (WJ-R)

Eight-Factor 665.7 294 .09 .72 .65 .56Expanded Contemporary Gf-Gc (WJ-R)

Nine-Factor 608.4 286 .09 .75 .67 .57Post-Hoc/readjusted a priori models

Contemporary Gf-Gc (WJ-R) Seven-Factor 965.5 302 .08 .68 .60 .55Contemporary Gf-Gc (WJ-R) Eight-Factor 868.5 294 .08 .71 .63 .55Expanded Contemporary Gf-Gc (WJ-R)

Eight-Factor 734.8 296 .05 .71 .64 .56Expanded Contemporary Gf-Gc (WJ-R)

Nine-Factor 636.1 286 .05 .74 .66 .57Post-Hoc Expanded Contemporary Gf-Gc

(WJ-R) 436.1 279 .04 .80 .73 .59

Note. GF1 5 Goodness-of-Fit Index; AGFI 5 Adjusted Goodness-of-Fit Index; PGFI 5 Parsimonious Good-ness-of-fit Index; KAIT 5 Kaufman Adolescent and Adult Intelligence Test; WJ-R 5 Woodcock-Johnson-Revised.

(WJ-R) models, the differences are typically to the second decimal, andare not of major significance. This suggests that on empirical grounds, eachof the four models are equally plausible. The fit statistics for the five post-hoc models were also not dramatically different. However, hierarchical chi-square (p , .05) tests between each respectively more complex model (e.g.,Post-Hoc Expanded Contemporary Gf-Gc and Expanded ContemporaryNine-Factor) found that each more complex model resulted in a significantimprovement in model fit. When integrated with substantive considerationsbased on prior research and theory, we believe that arguments can be madein favor of the two post-hoc Expanded Contemporary Gf-Gc (WJ-R) models.Also, the confirmatory factor studies of McGhee and colleagues (e.g.,McGhee & Leiberman, 1993), and, more importantly, Carroll’s (1993a)review of the extant factor analytic research, provide support for separateMemory Span (MS-Gsm) and Visual Memory (MV-Gsm) factors. The rela-tively low (r 5 .20) correlation between the Memory Span (MS-Gsm) andVisual Memory (MV-Gsm) factors in both Expanded Contemporary Gf-Gc(WJ-R) models (see Table 6) also support this distinction.

The remaining issue is whether a model should maintain a distinctionbetween the Gc and Reading factors. The Gc and Reading (RD/V-Grw)latent factor correlation ranged from .83 to .85 for the two Expanded Con-

Tab

le6

Lat

ent

Fact

orIn

terc

orre

lati

ons

for

Exp

ande

dC

onte

mpo

rary

Gf-G

c(W

J-R

)E

ight

-Fac

tor

(Fig

ure

1)an

dP

ost-H

ocE

xpan

ded

Con

tem

pora

ryG

f-Gc

Mod

els

(Fig

ure

3)

Exp

ande

dC

onte

mpo

rary

Gf-G

c(W

J-R)

Eig

ht-F

acto

rM

odel

Gf

Gc

MS

MV

CS

PCM

AR

D/

VPS

Flui

dIn

t.(G

f)—

Cry

st.I

nt.

(Gc)

.45

—M

em.

Span

(MS-

Gsm

).3

4.4

0—

Vis

.M

em.

(MV

-Gsm

).4

0.3

2.2

0—

Clo

s.Sp

eed

(CS-

Gv)

.38

.44

.18*

.59

—Ph

on.

Cod

.(P

C-G

a).1

8*.2

9.1

8*.3

3.3

3—

As.

Mem

.(M

A-G

lr)

.35

.62

.44

.32

.36

.32

—R

eadi

ng

(RD

/V

-Grw

).4

6.8

5.4

5.3

3.4

2.3

7.6

2—

Per.

Spee

d(P

S-G

s).4

8.2

0.0

8*.2

5.3

0.2

7.1

4*.2

5—

Post

-Hoc

Exp

ande

dC

onte

mpo

rary

Gf-G

c(W

J-R)

Mod

el

IR

GG

cM

SM

VC

SPC

MA

RD

/V

PS

Indu

ctio

n(I

-Gf)

—Se

q.R

ea.

(RG

-Gf)

.89

—C

ryst

.In

t.(G

c).4

3.4

4—

Mem

.Sp

an(M

S-G

sm)

.41

.28

.40

—V

is.

Mem

.(M

V-G

sm)

.41

.38

.32

.20

—C

los.

Spee

d(C

S-G

v).3

6.3

6.4

4.1

8*.5

9—

Phon

.C

od.

(PC

-Ga)

.16*

.18*

.29

.18*

.33

.33

Ass

c.M

em.

(MA

-Glr

).3

6.3

3.6

2.4

4.3

2.3

6.3

2—

Rea

din

g(R

D/

V-G

rw)

.45

.45

.83

.48

.34

.44

.36

.61

—Pe

r.Sp

eed

(PS-

Gs)

.48

.46

.20

.08*

.25

.30

.27

.14*

.25

Not

e.M

S5

Mem

ory

Span

;MV

5V

isua

lMem

ory;

CS

5C

losu

reSp

eed;

PC5

Phon

etic

Cod

ing;

MA

5A

ssoc

iati

veM

emor

y;R

D/

V5

Rea

din

gD

ecod

ing/

Ver

bal

(pri

nte

d)L

angu

age

Com

preh

ensi

on;

PS5

Perc

eptu

alSp

eed;

I5

Indu

ctiv

e;R

G5

Gen

eral

Sequ

enti

alR

easo

nin

g;W

J-R5

Woo

dcoc

k-Jo

hn

son

-Rev

ised

.*

Indi

cate

sla

ten

tfa

ctor

corr

elat

ion

sth

atw

ere

not

sign

ifica

ntl

ydi

ffer

ent

from

zero

(p,

.05)

.

170

Flanagan and McGrew 171

Figure 2. Expanded Contemporary Gf-Gc (WJ-R) Nine-Factor Model with Post Hoc KAITIndicator-Factor Parameters.

temporary Gf-Gc (WJ-R) models, a high value that might suggest the needfor a single factor. However, one would need to add approximately fivestandard error of the estimates (.03) to the Gc and Reading (RD/V-Grw)latent factor correlation before it would include the value of 1.0, a valueindicating a single construct. Also, nonfactor analytic evidence in the formof different developmental growth curves for Gc and reading factors sug-gests that these are two different constructs (McGrew et al., 1991). Thus,we believe that the Expanded Contemporary Gf-Gc (WJ-R) Nine-Factor model,with the added post-hoc KAIT indicator-factor parameters, is one of the

172 Journal of School Psychology

Figure 3. Expanded Contemporary Gf-Gc (WJ-R) Ten-Factor Model Derived from MaximumPost Hoc Modeling Readjustment.

most plausible a priori model fit of the WJ-R/KAIT data in this sample. Theresults for this model are presented in Figure 2. The latent factor correla-tions for this model are presented in Table 6.

The 10-factor model that was derived from the maximum amount ofpost-hoc modeling readjustment is presented in Figure 3. As would be ex-pected given the derivation of this model, the fit statistics for the model(see Table 5) and chi-square comparison tests (p , .05) indicate that thismodel fit better than all other models. However, these fit statistics cannotaccurately be compared to the a priori models without cross-validation ina new sample. Two important hypotheses are suggested by the 10-factor

Flanagan and McGrew 173

model in Figure 3. First, the KAIT Definitions test may have additionalvariance associated with reading skills (viz., word recognition) as evidencedby the significant .13 correlated error term between this test and theWJ-R Letter-Word Identification test. Second, two first-order narrow Gffactors representing inductive (Induction-I) and deductive (General Se-quential Reasoning-RG) reasoning might explain the relations between thefour Fluid Intelligence (Gf ) tests used in this study. The WJ-R Analysis-Synthesis and KAIT Logical Steps tests, and the WJ-R Concept Formationand KAIT Mystery Codes tests, may be indicators of deductive and inductivereasoning abilities, respectively. The .89 latent factor correlation betweenthese two factors (see Table 5) reflects highly related factors. However,approximately five standard error of the estimates (i.e., .02) are neededbefore the latent factor correlation range includes a value of 1.0, the valueindicating that the factors are identical. Carroll’s (1993a) review of the Gffactor research, which identifies two such separate abilities, would supportthe Induction (I) and General Sequential Reasoning (RG) distinction rep-resented in Figure 3.

Finally, a review of the latent factor correlations for the models pre-sented in Figures 2 and 3 (see Table 5) provide additional support for thecontemporary Gf-Gc theory over less factorially complex models. With theexception of the two high correlations previously discussed, all of the latentfactor correlations are generally low to moderate in magnitude. The size ofthese correlations indicate that the factors represented in the final two mod-els (Figures 1 and 2) represent distinctly different human ability constructs.

DISCUSSION

Results of the confirmatory factor analyses of the 27 tests from the WJ-R,KAIT, and WISC-III demonstrated that a nine-factor model was the mostplausible explanation of the data when considered within the context ofcontemporary Gf-Gc theory and research. The results of this investigationhave important implications for understanding the structure of cognitiveabilities in non-White populations and for test interpretation.

Given the increasing multicultural nature of the population, the findingthat the best-fitting Gf-Gc models in the current non-White sample aresimilar to the Gf-Gc models found in predominantly White samples in theextant factor analytic research suggests that contemporary Gf-Gc theory maybe useful for assessing and understanding cognitive functioning across dif-ferent racial and ethnic groups. This conclusion is generally consistent withthe available evidence presented by Carroll (1993a) that the Gf-Gc structureof intelligence is invariant across ethnic and culturally diverse groups.Given that this literature base is relatively limited (Carroll, 1993a), the cur-rent study offers additional support regarding this important theoreticalissue.

174 Journal of School Psychology

In considering the present results within the context of test interpreta-tion, several important implications are apparent. First, the KAIT DoubleMeanings and Definitions subtests are mixed measures of Gc and Readingability (RD/V-Grw); the KAIT Auditory Comprehension subtests is a mixedmeasure of Gc and Memory Span (MS-Gsm); and the KAIT Auditory Com-prehension Delayed Recall subtest is a mixed measure of Gc and AssociativeMemory (MA-Glr). Therefore, caution must be exercised when interpret-ing an individual’s performance on these subtests. When an individual dis-plays a strength or weakness on one or more of the KAIT’s mixed factorsubtests, it is recommended that practitioners administer purer measuresof the Gf-Gc abilities that are assessed by these subtests before making inter-pretations regarding test performance. (See McGrew and Flanagan, 1998,for a listing of relatively pure measures of Gf-Gc abilities that may be usedfor this purpose). By contrast, none of the 14 WJ-R tests had mixed (orsecondary) factor loadings in the current study, which is largely consistentwith the extant WJ-R research. Therefore, the WJ-R tests can be consideredrelatively pure measures or indicators of their respective Gf-Gc factors.

A second implication relates to the generalizability of narrow stratum Iindicators to broad stratum II Gf-Gc abilities. The present findings revealedthat of the nine factors that provide the best explanation of the WJ-R andKAIT tests, only two can be considered broad stratum II factors (i.e., Gfand Gc). The Gf and Gc factors in the present study include at least twoqualitatively different stratum I abilities. For example, two inductive reason-ing tests (WJ-R Concept Formation and KAIT Mystery Codes) and two gen-eral sequential reasoning or deductive reasoning tests (WJ-R Analysis-Synthesis and KAIT Logical Steps) are the narrow ability indicators of Gfin the present study. Thus, whether considered together or separately, theWJ-R and KAIT batteries have two qualitatively different narrow ability indi-cators of Gf.

Similarly, several tests of lexical knowledge (WJ-R Picture Vocabulary andOral Vocabulary, KAIT Double Meanings and Definitions), a test of generalinformation and information about culture (KAIT Famous Faces), and atest of language development and listening ability (KAIT Auditory Compre-hension) comprise the stratum I indicators of Gc in the present study.However, although the combination of tests represented in this study showevidence of a broad stratum II Gc factor, when the WJ-R (Cognitive Battery)and KAIT composite scores are considered individually, there is no supportfor a broad Gc factor on either battery. That is, the two Gc stratum I indica-tors on the WJ-R Comprehension-Knowledge Cluster (i.e., Picture Vocabu-lary and Oral Vocabulary) are primarily measures of lexical knowledge.Therefore, interpretations that are made regarding an individual’s per-formance on the WJ-R Comprehension-Knowledge Cluster should bemade at the narrow ability level (e.g., performance reflects primarily lexical

Flanagan and McGrew 175

knowledge ability), since there is not enough diversity among these stratumI indicators of Gc to warrant a broad or stratum II Gc interpretation. Whenusing the WJ-R, in order to generalize about an individual’s broad Gc abil-ity, practitioners will need to supplement their assessment with at least oneor two additional stratum I indicators of Gc (see McGrew & Flanagan, 1998).

Assessment of Gc using the KAIT battery presents a different issue. Thatis, rather than including only one stratum I indicator of Gc, the KAIT in-cludes three (lexical knowledge, information about culture, and listeningability); but, as stated above, with the exception of Famous Faces, theKAIT’s subtests that measure these stratum I abilities (i.e., Double Mean-ings, Definitions and Auditory Comprehension) are factorially complex.Thus, in order to obtain a clearer understanding of an individual’s perfor-mance in the area of crystallized intelligence, practitioner’s should supple-ment the KAIT’s intended Gc subtests with purer stratum I indicators ofGc and ensure that two or more of these narrow abilities are assessed.

A third implication that may be drawn from the present findings is thatmany of the Gf-Gc factors that have been operationally defined in previousjoint factor analyses of cognitive test batteries (e.g., McGhee, 1993; Wood-cock, 1990) have confounded stratum I and II factors. Although the presentresults largely support the extant WJ-R research on the structure of thiscognitive battery, it is important to realize that the WJ-R’s Gs, Ga, Glr, Gsmand Gc factors are stratum I factors. That is, each of these factors are com-prised of two tests that measure the same stratum I ability—PerceptualSpeed (PS), Phonetic Coding (PC), Associative Memory (MA), MemorySpan (MS), and Lexical Knowledge (VL), respectively. With the exceptionof the WJ-R’s Gf factor, none of the Gf-Gc factors include a sufficientbreadth of narrow stratum I abilities to allow the practitioner to generalizeabout an individual’s functioning in the broad stratum II Gf-Gc cognitivedomains. However, practitioner’s need only supplement their measure-ment of each of these Gf-Gc abilities with one qualitatively different andrelatively pure stratum I indicator in order to gain a clearer understand-ing of broad functioning in the Gf-Gc domains that are measured by theWJ-R. For example, the present findings show that the KAIT Famous Facessubtests would provide an excellent stratum I indicator of Gc that is qualita-tively different from those included on the WJ-R.

Another contribution of these findings is that they help to clarify thenature of the two WJ-R Gv tests (i.e., Picture Recognition and Visual Clo-sure). In previous studies of the factorial composition of tests 1 to 14 ofthe WJ-R Cognitive Battery, the Gv factor was not as robust as all otherGf-Gc factors (McGrew et al., 1991; Woodcock, 1990). The present resultsindicate that the WJ-R Gv factor may be weak because Picture Recognitionmay measure primarily Visual Memory (MV; a stratum I indicator of Gsm)and Visual Closure may measure primarily Closure Speed (CS; a stratum I

176 Journal of School Psychology

indicator of Gv). Therefore, the WJ-R Gv cluster should be interpretedcautiously until further research is available to substantiate the underlyingconstructs of the cognitive tests that comprise this factor.

The present findings do not support the organizational structure of theKAIT. The originally proposed fluid-crystallized, two-factor model underly-ing the KAIT is too simple; it does not explain the array of Gf-Gc abilitiesthat are necessary to solve the KAIT tasks. These results show that the KAIThas pure stratum I indicators of four Gf-Gc factors (i.e., Gf, Gc, Glr, Gsm).Of these factors, Gf is the only factor that may be interpreted as a broadstratum II ability. Although the KAIT Gc factor has qualitatively differentstratum I indicators, a broad Gc interpretation should be made cautiouslysince, with the exception of Famous Faces, these indicators are mixed factorsubtests. The KAIT also has two strong measures of Associative Memory(Rebus Learning and Rebus Learning-Delayed Recall), which are narrowindicators of Glr, and one strong measure of Visual Memory (Memory forBlock Designs), which is a narrow indicator of Gsm .

Since this was the first study to factor analyze the KAIT subtests withstrong measures of Fluid Reasoning, these findings provide preliminarysupport for the construct validity of two of the four Fluid subtests on theKAIT battery (i.e., Logical Steps and Mystery Codes). The other purportedFluid subtests of the KAIT (i.e., Rebus Learning and Memory for BlockDesign) are strong indicators of Glr and Gsm, respectively. Therefore, theKAIT Fluid Scale IQ (which is comprised of Logical Steps, Mystery Codesand Rebus Learning), is a confounded (or mixed) measure of Fluid Rea-soning and Associative Memory. The extent to which the KAIT Fluid ScaleIQ is confounded by other Gf-Gc abilities will vary depending on whetheror not the KAIT’s alternate Fluid subtest (Memory for Block Designs) issubstituted for a Core Battery subtest (Flanagan et al., 1994). For example,if Memory for Block Designs is substituted for Rebus Learning, then theFluid Scale IQ will be a confounded measure of Fluid Reasoning and VisualMemory. However, if Memory for Block Designs is substituted for eitherLogical Steps or Mystery Codes, then the Fluid Scale IQ will be a con-founded measure of Fluid Reasoning, Associative Memory and Visual Mem-ory. In the latter situation, it appears that the KAIT Fluid Scale IQ wouldprovide a better estimate of an individual’s broad memory ability than his/her fluid reasoning ability.

The KAIT’s Crystallized Scale IQ is also confounded by other Gf-Gc abil-ities. For example, significant Grw loadings for the Definitions and DoubleMeanings subtests as well as the correlated error between Definitions andLetter-Word Identification in post-hoc analyses suggest that these tests in-volve reading ability. As a result, they may be problematic when assessingadolescents and adults who are suspected of having a reading disability.Although it was never the authors’ intention to construct pure Fluid andCrystallized scales (Kaufman & Kaufman, 1993), some researchers contend

Flanagan and McGrew 177

that it is more difficult to interpret mixed as opposed to pure measures ofability (e.g., Flanagan et al., 1994; Woodcock, 1990). Likewise, confoundedmeasures of ability may lead to difficulty in understanding individual cogni-tive strengths and weaknesses and in planning educational interventions.

A number of study limitations suggest fruitful avenues for future re-search. First, the ratio (4.2 :1) of subjects (N 5 114) to variables (N 5 27)for this study was just below the minimum (N 5 5) and recommended(N 5 10) subjects per variable guidelines commonly suggested. However,explicit subject-to-variable sample size guidelines have ‘‘always been influx, passed down from generations of factor analysts in an oral tradition’’(Floyd & Widaman, 1995, p. 289). The ratio and sample size in the currentstudy is close to Streiner’s (1994) 5:1 recommended minimum ratio, a ratiothat is adequate for sample sizes of 100 or more. More importantly, Guadag-noli and Velicer (1988) and Raykov and Widaman (1995) suggest that theissue is more complex than a fixed subject-to-variable ratio and that thereis no clear theoretical and/or empirical foundation for most standardsubject-to-variable rules-of-thumb. Their research suggests that variablesaturation with the factors (as indicated by the size of the factor loadings),together with sample size and the number of variables is more important.They report that when factor loadings are .80 or above, highly stable factorsolutions can be found in samples as small as 50, regardless of the numberof indicators. In the current study, the vast majority of primary factor load-ings for the tests were .90 or above. According to Guadagnoli’s and Velicer’s(1988) conclusions, the presence of such highly factor saturated variablesin the current sample of 114 would suggest that the current results wouldbe stable in additional samples.

Second, although we recognize that the restricted composition and sizeof our sample limits generalization to other populations, knowledge aboutthe factorial structure of Gf-Gc abilities in the current non-White sampleis a contribution to the relatively limited Gf-Gc multicultural literature. Be-cause of the demographically limited nature of the current sample, to-gether with the unresolved issue of what is an ideal sample size for a con-firmatory factor analysis study, we argue for caution in generalizing theresults from this non-White sample to other non-White samples, as well assimilar samples at different developmental levels, until appropriate cross-validation investigations are completed.

Third, the interpretability of future studies will be more meaningful ifat least three qualitatively different indicators for all stratum II factors areincluded. Finally, the current results only evaluated the WJ-R and KAITbatteries within the contemporary Gf-Gc theoretical framework. Alternativemodels based on other theoretical approaches to understanding intelli-gence (e.g., the PASS model; Das, Naglieri, & Kirby, 1994) need to be speci-fied and evaluated.

In conclusion, the present findings contribute to the theoretical litera-

178 Journal of School Psychology

ture by supporting the invariance of a contemporary Gf-Gc model in anon-White sample. In addition, this study showed that when a sufficientlydiverse array of cognitive tests are used, ‘‘older’’ simple (e.g., dichotomous)models are not supported; rather, a contemporary empirically derivedGf-Gc theoretical model of multiple cognitive abilities is supported.It was found that most of the Gf-Gc factors that are represented on theWJ-R and KAIT batteries warrant a narrow stratum I (and not a broad stra-tum II) interpretation. An examination of other major intelligence testsbatteries reveals that the same interpretive limitations are apparent (seeMcGrew, 1997; McGrew & Flanagan, 1998). This suggests that practitionersshould recognize the breadth and depth of coverage of cognitive abilitieson intelligence test batteries and ‘‘cross’’ batteries when necessary to en-sure that a sufficient range of stratum I indicators are assessed before gener-alizing about functioning in broad Gf-Gc domains (Flanagan & McGrew,1995a, 1995b, 1997; McGrew, 1997; McGrew & Flanagan, 1998; Woodcock,1990). It is important to understand what constructs are measured by cogni-tive ability tests such as the KAIT and WJ-R in all cultural groups given theexisting and emerging literature on the differential predictive relationshipsbetween broad and narrow cognitive Gf-Gc abilities and academic and occu-pational achievement (e.g., Friedman, 1995; Geary, 1993; Humphreys, Lub-inski, & Yao, 1993; McGrew, 1993; McGrew et al., 1997) (see also McGrew &Flanagan, 1998 for a comprehensive review). For example, there is a largerbody of evidence that supports a strong relationship between phonologicalprocessing (which is represented by the Phonetic Coding/Ga factor in thepresent study) and reading achievement (see Felton & Pepper, 1995;McBride-Chang, 1995; Wagner & Torgesen, 1987). Moreover, recent re-search with nationally representative samples across the school age yearshas demonstrated that some Gf-Gc abilities are not only important in under-standing and predicting reading achievement, but they also predict signifi-cantly beyond a general ability construct (see McGrew et al., 1997).

The present results were interpreted within the context of contemporaryGf-Gc theory and research (e.g., Horn, 1991; Horn & Noll, 1997) and Car-roll’s (1993a) theoretical framework was used to demonstrate the impor-tance of considering strata rather than order when interpreting factor ana-lytic results. It is recommended that researchers use Carroll’s (1993a)taxonomy and definitions of factors in future joint factor analytic studiesto aid in the interpretation of results. If future factor analytic studies ensurethat a broad range of human cognitive abilities are represented and areinterpreted from empirically supported theoretical frameworks (such as acontemporary Gf-Gc model), then the field of intellectual assessment willbegin to move toward a consistent terminology and theoretical model thatwill aid in understanding human cognitive functioning and that will helpto narrow the gap between cognitive assessment and cognitive science.

Flanagan and McGrew 179

REFERENCESBenson, J., & Bandalos, D. L. (1992). Second-order confirmatory factor analysis of

the Reactions to Tests scale with cross-validation. Multivariate Behavioral Research,27, 459–487.

Bickley, P. G., Keith, T. Z., & Wolfle, L. M. (1995). The three-stratum theory ofcognitive abilities: Test of the structure of intelligence across the life span. Intelli-gence, 20, 309–328.

Byrne, B. M. (1989). Multigroup comparisons and the assumption of equivalentconstruct validity across groups: Methodological and substantive issues. Multivar-iate Behavioral Research, 24, 503–523.

Carroll, J. B. (1983). Studying individual differences in cognitive abilities: Throughand beyond factor analysis. In R. F. Dillon & R. R. Schmeck (Eds.), Individualdifferences in cognition (Vol. 1, pp. 1–33). New York: Academic Press.

Carroll, J. B. (1989). Factor analysis since Spearman: Where do we stand? What dowe know? In R. Kanfer, P. L. Ackerman, & R. Cudeck (Eds.), Abilities, motivation,and methodology (pp. 43–67). Hillsdale, NJ: Lawrence Erlbaum.

Carroll, J. B. (1993a). Human cognitive abilities: A survey of factor-analytic studies . Cam-bridge, UK: Cambridge University Press.

Carroll, J. B. (1993b). What abilities are measured by the WISC-III? [WISC-IIIMonograph]. Journal of Psychoeducational Assessment, 134–143.

Carroll, J. B. (1997). The three-stratum theory of cognitive abilities. In D. P. Flana-gan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment:Theories, tests and issues (pp. 122–130). New York: Guilford Press.

Cattell, R. B. (1941). Some theoretical issues in adult intelligence testing. Psychologi-cal Bulletin, 38, 592.

Cole, D. (1987). Utility of confirmatory factor analysis in test validation research.Journal of Consulting and Clinical Psychology, 55, 584–594.

Das, J. P., Naglieri, J. A., & Kirby, J. R. (1994). Assessment of cognitive processes: ThePASS theory of intelligence . Boston: Allyn and Bacon.

Elliott, C. D. (1990). Differential Ability Scales: Introductory and technical handbook . SanAntonio: The Psychological Corporation.

Felton, R. H., & Pepper, P. P. (1995). Early identification and intervention of pho-nological deficits in kindergarten and early elementary children at risk for read-ing disability. School Psychology Review, 24, 405–414.

Flanagan, D. P. (1995). A review of the Kaufman Adolescent and Adult IntelligenceTest. In J. C. Conoley, & J. C. Impara (Eds.), The twelfth mental measurements year-book (pp. 527–530). Lincoln, NE: Buros Institute.

Flanagan, D. P., Alfonso, V. C., & Flanagan, R. (1994). A review of the KaufmanAdolescent and Adult Intelligence Test: An advancement in cognitive assess-ment? School Psychology Review, 23, 512–525.

Flanagan, D. P., Genshaft, J. L., & Boyce, D. (in press). Test review: Kaufman Adoles-cent and Adult Intelligence Test. Journal of Psychoeducational Assessment .

Flanagan, D. P., Genshaft, J. L., & Harrison, P. L. (Eds.). (1997). Contemporary intellec-tual assessment: Theories, tests and issues . New York: Guilford Press.

Flanagan, D. P., & McGrew, K. S. (1995a). The field of intellectual assessment: Acurrent perspective. The School Psychologist, 49, 7, 14.

Flanagan, D. P., & McGrew, K. S. (1995b). Will you evolve or become extinct? Interpretingintelligence tests from modern Gf-Gc theory . Paper presented at the 27th Annual Con-vention of the National Association of School Psychologists, Chicago, IL.

Flanagan, D. P., & McGrew, K. S. (1997). A cross-battery approach to assessing andinterpreting cognitive abilities: Narrowing the gap between practice and cogni-

180 Journal of School Psychology

tive science. In D. P., Flanagan, J. L., Genshaft, & P. L., Harrison (Eds.), Contempo-rary intellectual assessment: Theories, tests and issues (pp. 314–325). New York: Guil-ford Press.

Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and re-finement of clinical assessment instruments. Psychological Assessment, 7, 206–299.

Friedman, L. (1995). The space factor in mathematics: Gender differences. Reviewof Educational Research, 65, 22–50.

Geary, D. C. (1993). Mathematical disabilities: cognitive, neuropsychological, andgenetic components. Psychological Bulletin, 114, 345–362.

Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability ofcomponent patterns. Psychological Bulletin, 103, 265–275.

Gustafson, J.-E. (1984). A unifying model for the structure of intellectual abilities.Intelligence, 8, 179–203.

Gustafson, J.-E. (1988). Hierarchical models of individual differences in cognitiveabilities. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence(Vol. 4, pp. 35–71). Hillsdale, NJ: Lawrence Erlbaum.

Harrison, P. L., Kaufman, A. S., Hickman, J. A., & Kaufman, N.L. (1988). A survey oftests used for adult assessment. Journal of Psychoeducational Assessment, 6, 188–198.

Horn, J. L. (1985). Remodeling old models of intelligence. In B. B. Wolman (Ed.),Handbook of intelligence (pp. 267–300). New York: John Wiley.

Horn, J. L. (1988). Thinking about human abilities. In J. R. Nesselroade & R. B.Cattell (Eds.), Handbook of multivariate psychology (rev. ed., pp. 645–865). NewYork: Academic Press.

Horn, J. L. (1991). Measurement of intellectual capabilities: A review of theory. InK. S. McGrew, J. K. Werder, & R. W. Woodcock (Eds.), Woodcock-Johnson technicalmanual (pp. 197–232). Chicago: Riverside.

Horn, J. L. (1994). Theory of fluid and crystallized intelligence. In R. J. Sternberg(Ed.), Encyclopedia of human intelligence (pp. 443–451). New York: Macmillan.

Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid andcrystallized general intelligences. Journal of Educational Psychology, 57, 253–270.

Horn, J. L., & Cattell, R. B. (1967). Age differences in fluid and crystallized intelli-gence. Acta Psychologica, 2, 107–129.

Horn, J. L., & Noll, J. (1997). Human cognition capabilities: Gf-Gc theory. In D. P.Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assess-ment: Theories, tests, and issues (pp. 53–91). New York: Guilford.

Humphreys, L. G., Lubinski, D., & Yao, G. (1993). Utility of predicting group mem-bership and the role of spatial visualization in becoming an engineer, physicalscientist, or artist. Journal of Applied Psychology, 78, 250–261.

Joreskog, K., & Sorbom, D. (1993). LISREL 8 users reference guide . Chicago: ScientificSoftware International.

Kaufman, A. S., & Horn, J. L. (1996). Age changes on tests of fluid and crystallizedability for women and men on the Kaufman Adolescent and Adult IntelligenceTest (KAIT) at ages 17 to 94 years. Archives of Clinical Neuropsychology, 11, 97–121.

Kaufman, A. S., & Kaufman, N. L. (1983). K-ABC: Kaufman Assessment Battery forChildren . Circle Pines, MN: American Guidance Service.

Kaufman, A. S., & Kaufman, N. L. (1993). The Kaufman Adolescent and Adult Intelli-gence Test manual . Circle Pines, MN: American Guidance Service.

Kaufman, A. S., & Kaufman, N. L. (1997). The Kaufman Adolescent and AdultIntelligence Test. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.),Contemporary intellectual assessment: Theories, tests and issues (pp. 209–229). NewYork: Guilford Press.

Keith, T. Z. (1988). Research methods in school psychology: An overview. SchoolPsychology Review, 17, 502–520.

Flanagan and McGrew 181

Keith, T. Z. (1994). Intelligence is important, intelligence is complex. School Psychol-ogy Quarterly, 9, 209–221.

Keith, T. Z. (1995). A review of the Kaufman Adolescent and Adult IntelligenceTest. In J. C. Conoley & J. C. Impara (Eds.), The twelfth mental measurements year-book (pp. 530–532). Lincoln, NE: Buros Institute.

Keith, T. Z. (1997). Using confirmatory factor analysis to aid in understanding theconstructs measured by intelligence tests. In D. P. Flanagan, J. L. Genshaft, &P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests and issues(pp. 373–402). New York: Guilford Press.

Keith, T. Z., & Witta, E. L. (1997). Hierarchical and cross-age confirmatory factoranalysis of the WISC-III: What does it measure? School Psychology Quarterly, 12,89–107.

Loehlin, J. C. (1987). Latent variable models: An introduction to factor, path, and struc-tural analysis . Hillsdale, NJ: Lawrence Erlbaum.

Lohman, D. F. (1989). Human intelligence: An introduction to advances in theoryand research. Review of Educational Research, 59, 333–373.

Long, J. (1983). Confirmatory factor analysis: A preface to LISREL . Beverly Hills, CA:Sage.

Macmann, G. M., & Barnett, D. W. (1994). Structural analysis of correlated factors:Lessons from the verbal-performance dichotomy of the Wechsler scales. SchoolPsychology Quarterly, 9, 161–197.

McBride-Chang C. (1995). What is phonological awareness? Journal of EducationalPsychology, 87, 179–192.

McGhee, R. (1993). Fluid and Crystallized Intelligence: Confirmatory Factor Analy-sis of the Differential Abilities Scale, Detroit Tests of Learning Aptitude-3, andWoodcock-Johnson Psycho-Educational Battery-Revised. Journal of Psychoeduca-tional Assessment Monograph Series: WJ-R Monograph, 20–38.

McGhee, R., Buckhalt, J., & Phillips, C. (1994). An investigation of Gf-Gc theory in theolder adult population: Joint factor analyses of the WJ-R and DTLA-A. Paper presentedat the Human Cognitive Abilities Conference in Charlottesville, VA.

McGhee, R., & Lieberman, L. (1993). Gf-Gc theory of human cognition: Differentia-tion of short-term auditory and visual memory factors . Paper presented at the1993 Convention of the Southern Society for Philosophy and Psychology, NewOrleans, LA.

McGrew, K. S. (1993). The relationship between the Woodcock-Johnson Psych-Educational Battery-Revised Gf-Gc cognitive clusters and reading achievementacross the life-span [Monograph]. Journal of Psychoeducational Assessment, 39–53.

McGrew, K. S. (1994). Clinical interpretation of the Woodcock-Johnson Tests of CognitiveAbility-Revised . Boston: Allyn & Bacon.

McGrew, K. S. (1997). Analysis of the major intelligence batteries according to aproposed comprehensive Gf-Gc framework of human cognitive and knowledgeabilities. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporaryintellectual assessment: Theories, tests and issues (pp. 151–180). New York: GuilfordPress.

McGrew, K. S., & Flanagan, D. P. (1996). The Wechsler Performance Scale debate:Fluid intelligence (Gf ) or visual processing (Gv)? Communique, 24(6), 14, 16.

McGrew, K. S., & Flanagan, D. P. (1998). The intelligence test desk reference (ITDR): Gf-Gc cross-battery assessment . Boston: Allyn & Bacon.

McGrew, K. S., Flanagan, D. P., Keith, T. Z., & Vanderwood, M. (1997). Beyond g:The impact of Gf-Gc specific cognitive abilities research on the future use andinterpretation of intelligence test batteries in the schools. School Psychology Review,26, 189–210.

McGrew, K. S., & Hessler, G. L. (1995). The relationship between the WJ-R Gf-Gc

182 Journal of School Psychology

cognitive clusters and mathematics achievement across the life-span. Journal ofPsychoeducational Assessment, 13, 21–38.

McGrew, K. S., & Knopik, S. N. (1993). The relationship between the WJ-R Gf-Gccognitive clusters and writing achievement across the life-span. School PsychologyReview, 22, 687–695.

McGrew, K. S., Werder, J. K., & Woodcock, R. W. (1991). Woodcock-Johnson technicalmanual . Chicago: Riverside.

Messick, S. (1992). Multiple intelligences or multilevel intelligence? Selective em-phasis on distinctive properties of hierarchy: On Gardner’s Frames of Mind andSternberg’s Beyond IQ in the context of theory and research on the structureof human abilities. Psychological Inquiry, 3, 365–384.

Neisser, U., Boodoo, G., Bouchard, T. J., Boykin, A. W., Brody, N., Ceci, S. J., Halp-ern, D. F., Loehlin, J. C., Perloff, R., Sternberg, R. J., & Vrbina, S. (1996). Intelli-gence: Knowns and unknowns. American Psychologist, 51, 77–101.

Raykov, T., & Widaman, K. F. (1995). Issues in applied structural equation modelingresearch. Structural Equation Modeling, 2, 289–318.

Rubio, D. M., & Gillespie, D. F. (1995). Problems with error in structural equationmodels. Structural Equation Modeling, 2, 367–378.

Snow, R. E. (1986). Individual differences and the design of educational programs.American Psychologist, 41, 1029–1039.

Stinnett, T. A., Havey, J. M., & Oehler-Stinnett, J. (1994). Current test usage bypracticing school psychologists: A national survey. Journal of Psychoeducational As-sessment, 12, 331–350.

Streiner, D. L. (1994). Figuring out factors: The use and misuse of factor analysis.Canadian Journal of Psychiatry, 39, 135–140.

Tanaka, J. S. (1993). Multifaceted conceptions of fit in structural equation model-ing. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp.10–39). Newbury Park, CA: Sage.

Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). Stanford-Binet Intelligence Scale(4th ed.). Chicago: Riverside.

Wagner, R. K., & Torgesen, J. K. (1987). The nature of phonological processingand its causal role in the acquisition of reading skills. Psychological Bulletin, 101,192–212.

Wechsler, D. (1981). Wechsler Adult Intelligence Scale-Revised . San Antonio, TX: ThePsychological Corporation.

Wechsler, D. (1989). Wechsler Preschool and Primary Scale of Intelligence-Revised . SanAntonio, TX: The Psychological Corporation.

Wechsler, D. (1991). Wechsler Intelligence Scale for Children-Third Edition . San Antonio,TX: The Psychological Corporation.

Wilson, M. S., & Reschly, D. J. (1996). Assessment in school psychology trainingand practice. School Psychology Review, 25, 9–23.

Woodcock, R. W. (1990). Theoretical foundations of the WJ-R measures of cogni-tive ability. Journal of Psychoeducational Assessment, 8, 231–258.

Woodcock, R. W. (in press). Extending Gf-Gc theory into practice. In J. J. McArdle &R. W. Woodcock (Eds.), Human cognitive abilities in theory and practice . Chicago:Riverside.

Woodcock, R. W., & Johnson, M. B. (1989). Woodcock-Johnson PsychoeducationalBattery-Revised . Chicago: Riverside.

Ysseldyke, J. (1990). Goodness of fit of the Woodcock-Johnson Psycho-EducationalBattery-Revised to the Horn-Cattell Gf-Gc theory. Journal of Psychoeducational As-sessment, 8, 28–275.