Modelling transposition latencies: Constraints for theories of serial order memory
Journal of Memory and Language 51 (2004) 115–135
www.elsevier.com/locate/jml
Simon Farrell a,* and Stephan Lewandowsky b
a Department of Experimental Psychology, University of Bristol, 8 Woodland Road, Clifton, Bristol BS8 1TN, UK
b University of Western Australia, Australia
Received 12 December 2003; revision received 23 March 2004
Available online 20 April 2004
Abstract
Several competing theories of short-term memory can explain serial recall performance at a quantitative level.
However, most theories to date have not been applied to the accompanying pattern of response latencies, thus ignoring
a rich and highly diagnostic aspect of performance. This article explores and tests the error latency predictions of four
alternative mechanisms for the representation of serial order. Data from three experiments show that latency is a
negative function of transposition displacement, such that list items that are reported too soon (ahead of their correct
serial position) are recalled more slowly than items that are reported too late. We show by simulation that these data
rule out three of the four representational mechanisms. The data support the notion that serial order is represented by a
primacy gradient that is accompanied by suppression of recalled items.
© 2004 Elsevier Inc. All rights reserved.
Current theories of memory for serial order can ac-
count for recall performance in intricate detail, captur-
ing not only the basic shape of the serial position curve
but also the pattern of different types of errors (e.g.,
Brown, Neath, & Chater, 2004; Brown, Preece, &
Hulme, 2000; Burgess & Hitch, 1999; Farrell & Le-
wandowsky, 2002; Henson, 1998; Page & Norris, 1998).
This theoretical sophistication has been accompanied by
increasingly fine-grained empirical analysis. For exam-
ple, Surprenant, Kelley, Farley, and Neath (in press)
have shown that probabilities of different types of order
errors, when conditionalised on previous order errors in
a trial, place considerable constraints on models of serial
recall. Haberlandt, Thomas, Lawrence, and Krohn (in
press) made a similar point on the basis of the directionality
of individual order errors. Similarly, subtle
differences in the serial position curves obtained with
lists in which phonologically similar and dissimilar items
are intermixed have been taken to differentiate between
alternative theoretical accounts of similarity effects (e.g.,
Henson, Norris, Page, & Baddeley, 1996 vs. Farrell &
Lewandowsky, 2003).
Preparation of this paper was facilitated by a Large Grant
and a Discovery Grant from the Australian Research Council
to the second author. During writing of the article, the first
author was partly supported by NIMH Grant HD MH44640
and NIA Grant AG17083-01.
* Corresponding author.
E-mail address: [email protected] (S. Farrell).
0749-596X/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.jml.2004.03.007
Although such detailed analyses of response prob-
abilities are undoubtedly valuable, we argue here that
analyses of the associated response latencies, in par-
ticular the latencies of errors, can surpass probability
data in their diagnostic value. Specifically, we first
show by simulation that different representational
principles that are indistinguishable on the basis of the
pattern of errors alone make interestingly different
predictions about the latency of transposition errors.
We then report three experiments that test those pre-
dictions. The data consistently confirm the predictions
of models that combine a primacy gradient with re-
sponse suppression.
1 We do not consider the representational assumptions of
two models that have not been shown to be capable of
generating the basic transposition gradient; namely ACT-R
(Anderson & Matessa, 1997) and TODAM (Lewandowsky &
Murdock, 1989).
Transposition errors and response latencies in serial recall
Most studies of short-term memory involve serial
recall, which requires participants to report list items in
the order of presentation. Because the emphasis is on
memory for order, much interest has focused on trans-
position errors, which arise when a list item is reported
in an incorrect position, for example when the second
item is recalled first. Transposition errors can be divided
into anticipations, which refer to premature recall of an
item (i.e., ahead of its correct position, for example when
C is recalled first from the list A B C D), and post-
ponements, which refer to the delayed report of an item
(e.g., when C is recalled last from the list A B C D).
Another transposition measure is an item's displacement,
which we define as the numeric difference between
the item's output position and its list (input) position.
Under this metric, postponements have positive transposition
displacements whereas anticipations have negative
displacements. For example, a displacement of −3
would involve recalling an item 3 positions too soon
(e.g., anticipating the fifth item at the second output
position), whereas a +2 displacement would be post-
poning recall of an item by 2 positions (e.g., recalling the
second item at the fourth output position). Correct
responses have a displacement of zero.
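The displacement metric can be illustrated with a short Python sketch. This is not part of the original analyses; the function names `displacement` and `classify` are hypothetical, and positions are assumed to be 1-based.

```python
def displacement(input_position: int, output_position: int) -> int:
    """Output position minus list (input) position, both 1-based."""
    return output_position - input_position


def classify(d: int) -> str:
    """Label a response by the sign of its displacement."""
    if d < 0:
        return "anticipation"   # item reported too soon
    if d > 0:
        return "postponement"   # item reported too late
    return "correct"


# The two examples given in the text:
print(displacement(5, 2), classify(displacement(5, 2)))  # fifth item at output position 2
print(displacement(2, 4), classify(displacement(2, 4)))  # second item at output position 4
```

Under these assumptions the first call yields a displacement of -3 (an anticipation) and the second a displacement of +2 (a postponement).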
When the proportion of responses is plotted as a
function of displacement, the resulting transposition
gradient has several prominent and pervasive features.
First, the gradient peaks at displacement zero, as most
responses are correct. Second, as the absolute displace-
ment increases, the proportion of responses declines,
with most responses confined to small distances—this is
referred to as the “locality constraint” (Page & Norris,
1998). Finally, transposition gradients tend to be sym-
metric; that is, response probabilities are not affected by
the sign of the displacement (Healy, 1974; Henson, 1996;
though see Haberlandt et al., in press).
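As a rough illustration of how a transposition gradient is tabulated, the following Python sketch counts responses by displacement. The data format (each trial as a list giving the 1-based input position of the item reported at each output position) and the function name are assumptions made for this example, and the two trials are invented.

```python
from collections import Counter


def transposition_gradient(trials):
    """Proportion of responses at each displacement.

    Each trial lists, for successive output positions, the 1-based
    input position of the item that was reported (assumed format).
    """
    counts = Counter()
    total = 0
    for recall in trials:
        for out_pos, in_pos in enumerate(recall, start=1):
            counts[out_pos - in_pos] += 1
            total += 1
    return {d: counts[d] / total for d in sorted(counts)}


# Invented data: one perfect trial and one with an adjacent swap.
trials = [[1, 2, 3, 4], [1, 3, 2, 4]]
print(transposition_gradient(trials))
```

With these two invented trials, six of the eight responses fall at displacement zero and one each at -1 and +1, which reproduces in miniature the peak at zero and the symmetry described above.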
All current models of serial recall are capable of ac-
commodating at least these three features of transposi-
tion gradients (Brown et al., 2000, 2004; Burgess &
Hitch, 1999; Farrell & Lewandowsky, 2002; Henson,
1998; Lewandowsky, 1999; Page & Norris, 1998). It
follows that although transposition gradients provide a
valuable benchmark for models, they cannot differenti-
ate between competing models. However, it turns out
that further differentiation of models becomes possible
through consideration of the latencies of transpositions.
Despite their known empirical diagnosticity (see
Luce, 1986, for a review), serial recall times have been
generally neglected, primarily because models have not
historically made latency predictions. Nonetheless, a
number of studies have recently begun to measure the
time taken to perform ordered recall. Some have con-
sidered only total output time (e.g., Dosher & Ma, 1998;
Hulme, Newton, Cowan, Stuart, & Brown, 1999), which
limits their theoretical impact, whereas other studies
have provided more stringent constraints by reporting
inter-response times for individual serial positions.
Typically, those studies have found that recall of the first
item takes much longer than report of each subsequent
item (Anderson & Matessa, 1997; Anderson, Bothell,
Lebiere, & Matessa, 1998; Cowan, Wood, Wood, Keller,
Nugent, & Keller, 1998; Maybery, Parmentier, & Jones,
2002; Oberauer, 2003; Thomas, Milner, & Haberlandt,
2003), resulting in an approximately flat serial position
curve for all serial positions except the first. Accord-
ingly, the cumulative latency of responses across output
position forms an approximately linear function in most
cases (e.g., Dosher, 1999).
Notably, all research to date has restricted exami-
nation to correct responses only, or does not distinguish
between latency patterns for different types of errors
(Oberauer, 2003), leaving untouched the patterns of
latencies for recall errors, such as transpositions. This
omission is noteworthy because, as we show next by
simulation, different assumptions about the representa-
tion of serial order lead to very different expectations
concerning transposition latencies.
Representational principles in serial recall
The complexity of contemporary models of serial
recall renders their direct comparison difficult. More-
over, most existing models cannot be applied to latency
data without modification. Accordingly, in this article
we do not compare the full instantiations of models.
Instead, we contrast their predictions by comparing four
widely used representational principles within a single
dynamic architecture. Current models represent serial
order by (a) temporal or positional item marking,
(b) a primacy gradient of activation, (c) response
suppression, (d) output interference, or some
combination of those basic mechanisms (see footnote 1).
We outline these principles before introducing
the dynamic architecture for our simulations.
Item marking
Item marking refers to the association of items with
some independent representation of order, such as time,
temporal context, or ordinal list position. For example, in
the OSCAR model of Brown et al. (2000), each item is
associated with a timing signal that is provided by an
autonomous set of oscillators. Over time, the pattern of
activation of the oscillators changes, and thus each item is
associated with the unique temporal context current at the
time of its presentation. Recall is achieved by “rewinding”
the oscillators to their initial values and allowing them to
once again evolve over time, thus reproducing at recall the
temporal contexts that were present at study. Similar
ideas are implemented in the model of Burgess and Hitch
(1999) and temporal distinctiveness accounts of memory
(e.g., SIMPLE; Brown et al., 2004).
A more abstract variant of the same idea is embodied
in models such as SEM (Henson, 1998), in which items
are associated not to a temporal signal (though see
Henson & Burgess, 1997), but to “positional” markers
that identify each list position. A similar idea is incor-
porated in the order-sensitive variant of the feature
model (e.g., Neath, 1999).
The defining properties of item marking are that: (a)
the order of items is not represented by any property of
the items themselves, (b) items are not associated with
each other, and (c) order information is provided by
some independent structure external to the list.
Primacy gradient
Several models assume that the quality of the en-
coding of items decreases across list presentation (Brown
et al., 2000; Farrell & Lewandowsky, 2002; Henson et al.,
1996; Page & Norris, 1998; see also Grossberg, 1978).
This results in a primacy gradient, such that the first list
item is encoded most strongly, the second list item has
the next-strongest encoding, and so on until the last
presented item, which has the weakest encoding strength.
Some models, such as OSCAR (Brown et al., 2000),
incorporate a primacy gradient in combination with
positional marking. Other models rely entirely on the
primacy gradient to recall a list in order by continually
emitting the strongest item (e.g., Farrell & Lewandow-
sky, 2002; Page & Norris, 1998). If the strength of each
recalled item is attenuated by some mechanism (viz.,
response suppression, see below), this mechanism is
sufficient for simple forward recall.
The defining properties of a primacy gradient are: (a)
encoding strength decreases across serial position and
(b) unless it is assisted by an additional mechanism that
encodes order, a primacy gradient is necessarily ac-
companied by response suppression.
Response suppression
There is much evidence that recall of an item is fol-
lowed by its suppression, which renders it temporarily
unavailable for further report. For example, erroneous
repetitions of items during recall are relatively rare
(Henson, 1996, 1998; Vousden & Brown, 1998), and
people have difficulty reporting both occurrences of a
repeated item (Duncan & Lewandowsky, in press;
Henson, 1998). Recent evidence suggests that this re-
sponse suppression is static (that is, it does not wear off
over time), but that items can be released from response
suppression once an entire list has been recalled (Dun-
can & Lewandowsky, in press).
Accordingly, many models incorporate response
suppression (Brown et al., 2000; Burgess & Hitch, 1999;
Henson, 1998; Lewandowsky & Murdock, 1989; Nairne,
1990), and some use it in conjunction with a primacy
gradient to represent order among items without recourse
to any associative process or positional marking (e.g.,
Farrell & Lewandowsky, 2002; Page & Norris, 1998).
The defining features of response suppression are
that: (a) the representation of an item is attenuated or
eliminated following its recall and (b) suppression lowers
the probability of recalling the item again.
Output interference
The act of recalling an item undoubtedly interferes
with the accessibility of items yet to be recalled (see
Anderson & Neely, 1996, for a review). Accordingly,
when input order and output order are dissociated in
serial recall, output interference can be empirically
identified as a source of primacy effects (Cowan, Saults,
Elliott, & Moreno, 2002; Oberauer, 2003). For example,
Oberauer (2003) randomised the temporal input order,
temporal output order, and spatial order of items by
presenting items randomly (in space and time) in a
spatial array of boxes, and then randomly cueing for
ordered responses by successively probing for the letter
that had appeared in each box. Oberauer found that a
primacy effect only appeared when recall accuracy was
plotted by output position (i.e., it did not appear across
input position or spatial position), suggesting that out-
put interference plays a crucial role in the primacy effect.
Despite its apparent theoretical importance, only two
models of serial recall have explicitly incorporated out-
put interference: Brown et al. (2000) found output in-
terference to be necessary for a full account of list length
effects, and Lewandowsky and Murdock (1989) showed
that their model predicted more realistic serial position
curves when output interference was present (see their
Fig. 29).
The single defining feature of output interference is
that the recall of one item degrades the representation or
accessibility of all remaining list items. The deleterious
effect of recall occurs irrespective of whether or not a
recalled item is suppressed, and irrespective of how or-
der is represented.
Models and principles
Table 1 classifies current models of serial recall on the
basis of which of the preceding architectural principles
they embody. These four principles exhaustively
characterise all existing models that account for the
benchmark transposition gradients. We next show how
these principles, and by implication the models they
appear in, can be differentiated on the basis of their
predicted transposition latencies.
Table 1
Representational mechanisms of contemporary models of short-term serial order memory
Model                                       Item      Primacy   Response     Output
                                            marking   gradient  suppression  interference
Feature model (Nairne, 1990; Neath, 1999)   ◆         -         ◆            -
Primacy model (Page & Norris, 1998)         -         ◆         ◆            -
SEM (Henson, 1998)                          ◆         -         ◆            -
Burgess and Hitch (1999)                    ◆         -         ◆            -
OSCAR (Brown et al., 2000)                  ◆         ◆         ◆            ◆
SOB (Farrell & Lewandowsky, 2002)           -         ◆         ◆            -
SIMPLE (Brown et al., 2004)                 ◆         -         -            -
A diamond is shown if a model (rows) incorporates the corresponding representational principle (columns); a dash marks its absence.
Fig. 1. Generic lateral inhibition network used to implement
model principles. Nodes, representing items, are fully inter-
connected, with excitatory self-connections, and inhibitory
connections between units. Each node has an associated acti-
vation, with the activation changing in response to input from
outside the network, self-excitation, and weighted inhibitory
input from other nodes.
A common response selection architecture
We created a generic architecture that permitted
implementation of the four preceding principles while
also providing straightforward derivation of latency
predictions. The architecture was based on a lateral in-
hibition network, which has been used previously to
model response competition and choice behaviour (e.g.,
Grossberg, 1976; Usher & McClelland, 2001). Our
model is particularly close in form to Houghton's (1990)
competitive queuing model for speech production, in
which response units (representing phonemes) copy an
activation pattern into a lateral inhibition network,
which then performs a “winner-take-all” response
selection to select a single unit for output (see also Burgess
& Hitch, 1999). Houghton's model also incorporates
response suppression to avoid perseverative responding
with the first phoneme in a sequence.
Fig. 1 shows the structure of the network. Each list
item is represented by a node whose activation value can
be taken as the current “strength” of that item in
memory. Nodes are fully interconnected to each other
and also possess self-connections (i.e., connections
which feed back into the nodes from which they origi-
nate). The connections between nodes are inhibitory,
whereas self-connections are excitatory. Initial explora-
tion of the model suggested that stable performance
could be achieved when inhibitory and excitatory
connections were set to −0.1 and +1.1, respectively, and
those values were used throughout. The static nature of
the weights implies that no learning occurs in this net-
work; it simply forms a competitive filter (Houghton,
1990) for taking probabilistic information and giving an
unambiguous response, with an associated recall time.
Retrieval proceeds by first setting the activations to
starting values that are determined by the particular
representational principle being modelled (e.g., by cre-
ating a primacy gradient; see below), and then iteratively
passing activation back through the weights. Iterations
continue until the activation of the strongest node ex-
ceeds a response threshold (set to 0.8 throughout), with
the number of cycles required to determine the response
taken to be the model�s recall latency.The activation of a node aj;t at any time t is the
weighted sum of the inputs from all k nodes:
aj;t ¼Xk
i¼1
wijai;t�1: ð1Þ
Activations are updated in parallel at each time step.
Errors arise because of noise in the iterative updating
(cf. Usher & McClelland, 2001), modelled by slight
Gaussian perturbation ($\mu = 0$, $\sigma = .04$) of the activations
at each iteration.
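The response selection stage described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the reported simulations: activations are left unbounded, the threshold is checked after each update cycle, and the cap `max_iter` and the function names are hypothetical. The weights (+1.1 self-excitation, −0.1 lateral inhibition), the threshold (0.8) and the noise SD (.04) are the values given in the text.

```python
import random


def update(acts, w_self=1.1, w_inhib=-0.1, noise_sd=0.04, rng=random):
    """One parallel application of Eq. (1), plus Gaussian noise on each node."""
    total = sum(acts)
    return [
        w_self * a + w_inhib * (total - a) + rng.gauss(0.0, noise_sd)
        for a in acts
    ]


def select_response(start, threshold=0.8, max_iter=1000, **kwargs):
    """Iterate until the strongest node exceeds the threshold.

    Returns (index of winning node, number of cycles). The cycle count
    plays the role of recall latency. Returns (None, max_iter) if no
    node wins within max_iter cycles (an assumption of this sketch).
    """
    acts = list(start)
    for cycle in range(1, max_iter + 1):
        acts = update(acts, **kwargs)
        winner = max(range(len(acts)), key=acts.__getitem__)
        if acts[winner] > threshold:
            return winner, cycle
    return None, max_iter


# Usage: starting activations that decrease across six items.
start = [0.65 * 0.9 ** j for j in range(6)]
winner, latency = select_response(start, rng=random.Random(1))
print("item", winner + 1, "selected after", latency, "cycles")
```

Because differences between nodes are amplified on every cycle while the common activation level decays, an item that starts only slightly ahead of its competitors takes more cycles to reach threshold than one that starts far ahead, which is the property the latency predictions rely on.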
Implementation of four representational principles
The four representational principles were imple-
mented by different settings of the initial node activations
at each output position. The model therefore entails no
commitment to a specific encoding process because re-
sponse selection is agnostic with regard to the mecha-
nism(s) generating the initial activation values. This
componential approach to modelling follows several di-
rect precedents (e.g., Farrell, 2001; Lewandowsky, 1999)
and is compatible with most current practice: nearly all
models of serial recall distinguish the mechanism for or-
dering from a response selection stage (Brown et al., 2000;
Burgess & Hitch, 1999; Henson, 1998; Houghton, 1990;
Page & Norris, 1998). In those models, probabilistic or
incomplete information from the ordering mechanism is
used as input for the response selection stage, which is
then used to select a response from a number of com-
petitors. The response selection stage is also consonant
with the theoretical notion of redintegration, the process
by which long-term knowledge representations are used
to reconstruct degraded traces in short-term memory (cf.
Brown & Hulme, 1995; Lewandowsky, 1999; Lewan-
dowsky & Farrell, 2000; Schweickert, 1993).
Item marking
Following relevant precedents (e.g., Brown et al.,
2004; Henson, 1998), we chose activation values that
directly represented the confusability of item positions.
Accordingly, the node corresponding to the current
output position was maximally activated, with the activation
of neighbouring nodes gradually decreasing, thus
embodying the standard assumption that the proximity
of items maps into the similarity among their positional
markers (see Brown et al., 2000; Burgess & Hitch, 1999;
Henson, 1998). Specifically, the activation of node $j$ at
output position $p$ was given by:
$$a_j = \phi^{|j-p|} \tag{2}$$
where $\phi$ was set to 0.65 for the initial simulations. A
typical pattern of starting activations associated with
item marking is shown in Fig. 2A. This panel can be
compared to Fig. 10 of Brown et al. (2000), who plotted
such gradients of item activation for their oscillator-based
context signal (see Farrell, 2001, for examples
from other models).
Fig. 2. Example starting activation values for four principles of serial recall. Starting at the bottom and moving clockwise, these activations result from: (A) item marking (activations shown are for the third output position); (B) a primacy gradient (constant across output positions); (C) response suppression (showing the first two items suppressed in conjunction with a primacy gradient); and (D) output interference, showing the increase in noise in the networks across output positions.
Primacy gradient
As in other models, the primacy gradient was implemented
as a decrease in item activations across input
position (Fig. 2B). The same primacy gradient was established
at all output positions (before being modified
by suppression of previously emitted responses), and
was determined by:
$$a_j = a_1 \gamma^{j-1} \tag{3}$$
where $a_1$ was set to .65, and $\gamma$ was set to .9, for the initial
simulations. The exponential form of this primacy
gradient follows precedent (Brown et al., 2000;
Lewandowsky, 1999). (Below, we also explore another
instantiation of the gradient with no qualitative change
in results.)
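For concreteness, the starting activations defined by Eqs. (2) and (3) can be computed as in the following Python sketch. The function names are hypothetical, positions are 1-based, and the parameter defaults are the values reported for the initial simulations.

```python
def item_marking(n_items, p, phi=0.65):
    """Eq. (2): activation of node j at output position p is phi ** |j - p|."""
    return [phi ** abs(j - p) for j in range(1, n_items + 1)]


def primacy_gradient(n_items, a1=0.65, gamma=0.9):
    """Eq. (3): activation of node j is a1 * gamma ** (j - 1)."""
    return [a1 * gamma ** (j - 1) for j in range(1, n_items + 1)]


# Six-item list; item marking shown for the third output position.
print([round(a, 3) for a in item_marking(6, p=3)])
print([round(a, 3) for a in primacy_gradient(6)])
```

The item-marking pattern is symmetric around the cued position, whereas the primacy gradient is the same at every output position and always favours earlier list items.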
Response suppression
Response suppression was instantiated as the pro-
portional reduction of the activation of any node after
its recall. Hence, at a given output position, the initial
activation of all items was first determined from the
other principles in operation (e.g., a primacy gradient).
The activation of each item that had already been re-
called was then multiplied by a constant proportion
(0.05 in the initial simulations), to yield actual starting
activations. Fig. 2C shows the pattern of starting acti-
vations at the third output position that would result
from a primacy gradient after suppression of the first
two correctly recalled items.
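A sketch of this suppression step in Python follows. The function name and the representation of recalled items as a set of 0-based indices are assumptions of the example; the proportion 0.05 is the value used in the initial simulations.

```python
def suppress(activations, recalled, proportion=0.05):
    """Scale the activation of every already-recalled item by `proportion`."""
    return [
        a * proportion if j in recalled else a
        for j, a in enumerate(activations)
    ]


# Primacy gradient for four items, with the first two items already recalled.
base = [0.65, 0.585, 0.5265, 0.47385]
start = suppress(base, recalled={0, 1})
print(start)
```

After suppression of the first two items, the third item is the strongest remaining candidate, which is how a primacy gradient combined with suppression yields forward recall.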
Output interference
The deleterious effects of output interference were
modelled by assuming that recall of an item generally
made memory noisier. Accordingly, the starting activa-
tions were perturbed with (zero-centred Gaussian) noise
whose standard deviation increased with output position
p and was given by .04 × p for the initial simulations (see
Fig. 2D, which shows SD of noise rather than activation
of nodes as the other panels).
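The growth of noise with output position can be sketched as follows. The function names are hypothetical and the random generator is seeded only to make the illustration repeatable.

```python
import random


def interference_sd(output_position, base_sd=0.04):
    """SD of the zero-centred Gaussian noise at a 1-based output position."""
    return base_sd * output_position


def add_output_interference(activations, output_position, rng):
    """Perturb each starting activation with noise that scales with output position."""
    sd = interference_sd(output_position)
    return [a + rng.gauss(0.0, sd) for a in activations]


rng = random.Random(0)
start = [0.65 ** abs(j - 3) for j in range(1, 7)]  # item marking, third output position
print(add_output_interference(start, output_position=3, rng=rng))
```

Under this scheme the noise SD at the sixth output position is six times that at the first, so later responses start from a progressively less reliable activation pattern.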
Fig. 3. Predicted accuracy serial position curves (A), transposition gradients (B), latency serial position curves (C), and cumulative latency serial position curves (D), for four models of serial recall. IT: item marking only. IT+OI: item marking with output interference. IT+RS: item marking with response suppression. PR+RS: primacy gradient with response suppression.
Summary of model operation
At each output position the item nodes were activated
according to the principle(s) being modelled.
Activations were then allowed to iterate through the
weights until one item was selected as a response because
its activation exceeded a threshold. This item
was taken as the recalled item at that output position,
with the number of iterations providing the response
latency.
Because we were interested specifically in the latency
of order errors, omissions (i.e., “passes”) and
extra-list intrusions (recall of items not on the list)
were not allowed; these types of responses are relatively
rare in serial recall with closed experimental
vocabularies. Nevertheless, the network could easily be
extended to allow for such responses by, respectively,
incorporating a temporal deadline for omissions
(Farrell & Lewandowsky, 2002), and by allowing
non-list items to enter into response competition
(Brown et al., 2000).
Fig. 4. Predicted latency–displacement functions for four
models of serial recall.
2 Repetition responses (i.e., both occurrences of the erroneous
repetition of an item) are excluded from all LDF analyses
(both model and data) because, in two of the models, they are
expected to behave differently from the non-repeated postponements
or anticipations that were of interest here. Also, for these
qualitative predictions responses from the first output position
were excluded, as these recalls are associated with extremely
long latencies empirically and were excluded from experimental
analyses (see, e.g., Experiment 1 results).
Comparison of models
Fig. 3 shows the predictions of four models built
from these representational principles for the recall of 6-item
lists. The accuracy and latency serial position
curves are shown in Figs. 3A and C and the cumulative
latency serial position curves in Fig. 3D. Fig. 3B shows
the predicted transposition gradients. Three of the
models involved item marking, either on its own, in
conjunction with response suppression (analogous to the
model of Burgess & Hitch, 1999), or augmented by
output interference. The fourth model involved the
combination of a primacy gradient and response sup-
pression (analogous to the primacy model; Page &
Norris, 1998; and SOB; Farrell & Lewandowsky, 2002;
see also Grossberg, 1978).
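To make the fourth model concrete, the following Python sketch simulates recall of one list under a primacy gradient with response suppression. It is an illustrative reconstruction from the parameter values given in the text, not the simulation code behind Fig. 3: activations are left unbounded, the threshold is checked after every cycle, and all function names and the `max_iter` cap are assumptions.

```python
import random


def primacy_gradient(n_items, a1=0.65, gamma=0.9):
    """Starting activations that decrease across input position."""
    return [a1 * gamma ** j for j in range(n_items)]


def select(start, rng, noise_sd=0.04, threshold=0.8, max_iter=1000):
    """Lateral inhibition network: self-weight +1.1, inhibitory weight -0.1."""
    acts = list(start)
    winner = 0
    for cycle in range(1, max_iter + 1):
        total = sum(acts)
        acts = [
            1.1 * a - 0.1 * (total - a) + rng.gauss(0.0, noise_sd)
            for a in acts
        ]
        winner = max(range(len(acts)), key=acts.__getitem__)
        if acts[winner] > threshold:
            return winner, cycle
    return winner, max_iter  # fallback if no node reaches threshold


def recall_list(n_items=6, suppression=0.05, noise_sd=0.04, rng=None):
    """Return (responses, latencies); responses are 1-based input positions."""
    rng = rng or random.Random()
    base = primacy_gradient(n_items)
    recalled, responses, latencies = set(), [], []
    for _ in range(n_items):
        # Same gradient at every output position, with recalled items suppressed.
        start = [a * suppression if j in recalled else a
                 for j, a in enumerate(base)]
        winner, cycles = select(start, rng, noise_sd=noise_sd)
        responses.append(winner + 1)
        latencies.append(cycles)
        recalled.add(winner)
    return responses, latencies


responses, latencies = recall_list(rng=random.Random(2004))
print(responses, latencies)
```

With the noise switched off (`noise_sd=0.0`) the sketch recalls the list in perfect forward order, because the strongest unsuppressed item always wins; transpositions arise only through the noise added on each cycle.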
The predictions in Fig. 3 are readily summarised: (a)
All models produce a U-shaped accuracy serial posi-
tion curve, although the pure item-marking model ex-
hibits complete symmetry. Complete symmetry is
absent in the data, with the possible exception of re-
construction tasks (e.g., Nairne, 1992). The symmetry
for the item-marking model is unsurprising because it is
a natural consequence of the directionally naïve manner
in which starting activations were determined (see
Eq. (2)). (b) All models correctly predict a steeply
peaked transposition gradient that is symmetric (item
marking only) or nearly so (the remaining three mod-
els). This approximately symmetric gradient arises from
combinations with item marking because of the as-
sumed similarity between markers: items at adjacent
positions will tend to get activated together because of
similarity in their positional markers. In the primacy
gradient and response suppression model, by contrast,
anticipations and postponements arise from two dif-
ferent processes. Anticipations arise when a later list
item wins the competitive selection process. Because
activations decrease with list position, only the neigh-
bouring later items are likely to win the competition
with the target. In consequence, anticipations tend to
involve small absolute displacements. Postponements
tend to be localised because of “fill-in” (Henson et al.,
1996; Surprenant et al., in press): if an item has not
been anticipated, and if it is not recalled at the correct
position, then it is very likely to be recalled on the next
few occasions because its strength exceeds that of all
remaining competitors. (c) All models predict that la-
tency is inversely related to accuracy. Accordingly, the
latency serial position curves exhibit some primacy and
recency; however, the extent of bowing in the serial
position curves is insufficient to disrupt the (d) ap-
proximate linearity of the cumulative latency serial
position curves for all models. The linearity of cumu-
lative latency is consistent with the data (e.g., Dosher
& Ma, 1998).
The predictions in Fig. 3 thus confirm that the core
results in serial recall can be obtained by a variety of
very different representational mechanisms. However,
the mechanisms can be differentiated on the basis of
their predicted latency–displacement functions, which
are shown in Fig. 4. In a latency–displacement
function (LDF), mean recall latency of transpositions
is plotted as a function of transposition displacement.
The LDF maps directly onto the transposition gradi-
ent, but plots mean latency rather than proportions for
each response (see footnote 2). It is clear from the figure that the
models' predicted LDFs differ considerably from each
other.
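A latency–displacement function can be tabulated from trial-level data as in the Python sketch below. The data format and function name are assumptions of this illustration, and the two trials are invented. Following footnote 2, the sketch skips the first output position and any item that is reported more than once in a trial; correct responses (displacement zero) are retained.

```python
from collections import defaultdict


def latency_displacement_function(trials):
    """Mean latency at each displacement.

    Each trial is (responses, latencies): responses give the 1-based
    input position of the item reported at each output position.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for responses, latencies in trials:
        for out_pos, (item, rt) in enumerate(zip(responses, latencies), start=1):
            if out_pos == 1:
                continue  # first output position excluded
            if responses.count(item) > 1:
                continue  # both occurrences of a repetition excluded
            d = out_pos - item
            sums[d] += rt
            counts[d] += 1
    return {d: sums[d] / counts[d] for d in sorted(counts)}


# Invented latencies (in network cycles) for two trials with adjacent swaps.
trials = [
    ([1, 3, 2, 4], [30, 12, 8, 9]),
    ([1, 2, 4, 3], [28, 10, 14, 6]),
]
print(latency_displacement_function(trials))
```

In this invented example the anticipations (displacement -1) average 13 cycles, correct responses 9.5, and postponements (displacement +1) 7, the kind of monotonically decreasing pattern that separates the primacy gradient account from the item-marking accounts.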
In particular, for the item-marking model, the LDF
exhibits perfect symmetry, with the latencies of trans-
positions increasing with absolute displacement. This
mirrors the symmetry of the predicted transposition
gradients and the model therefore consistently maintains
an inverse relationship between the probability of a re-
sponse and its latency. When item marking is augmented
with output interference, the predicted LDF becomes
somewhat asymmetric, with postponements having
shorter latencies than anticipations at the same absolute
displacement. The same partial asymmetry arises when
item marking is augmented with response suppression.
An additional consequence of response suppression is
that it reduces the number of repetition errors, which in
the item-marking model are otherwise overly frequent
(and unrealistic when compared with empirical data;
Henson, 1998).
Finally, the model that combines a primacy gradient
with response suppression predicts a uniquely different
LDF that is monotonically negative and shows no ten-
dency to symmetry. Anticipations are slower than
postponements, and whereas the latency of postpone-
ments decreases with increasing displacement, the la-
tency of anticipations increases as they are displaced
further from the correct position. The reason for this
asymmetry is found in the source of anticipation and
postponement errors described above. If an item is an-
ticipated, it will have had to overcome a number of
stronger items from preceding input positions. These
stronger items will take longer to overcome in order for
an anticipated item to be output. Conversely, post-
ponements are relatively fast in this model because they
will involve a strong item being recalled amongst a
number of weak competitors towards the end of the list.
This strong item will not take long to recall because it
will easily overcome these weaker list items with lateral
inhibition. Finally, note that the model does not predict
postponements at large displacements, due to fill-in—the
longer an item is left unrecalled, the higher the condi-
tional probability that it will be recalled at the next
position.
In summary, Fig. 4 clearly demonstrates that the
competing representational principles make divergent
predictions that can be empirically tested. Whereas all
models produced similar patterns for the conventional
accuracy and latency measures, their predicted LDFs
ranged from symmetrically V-shaped to monotonically
decreasing.
3 Much research that has compared temporally grouped and
ungrouped lists has kept the sum of all inter-item intervals
constant between list types (e.g., Henson, 1996; Hitch, Burgess,
Towse, & Culpin, 1996). This approach was not followed here
because inter-item intervals were adjusted during pilot testing to
discourage grouping of ungrouped lists while ensuring that
grouped lists were readily perceived as grouped.
Experiment 1
We now present a series of experiments designed to
collect responses for an empirical examination of LDF’s.
All experiments in this article employed visual presentation
and keyboard recall.
Experiment 1 used temporally grouped and un-
grouped lists of 6 digits. The grouping manipulation was
introduced because it is associated with a particularly
diagnostic pattern of confusions: items that are trans-
posed between groups tend to maintain their relative
position within the group (Henson, 1999; Ryan, 1969).
These transpositions, known as ‘‘interpositions,’’ have
generally been taken as evidence for positional marking
(Henson, 1999). By implication, given that positional
models—as shown previously—predict an inverse map-
ping between response probabilities and latencies,
grouping should not only increase the frequency of in-
terpositions but should also reduce their response la-
tencies. Conversely, if interpositions were not
accelerated, and if the LDF’s for ungrouped and
grouped lists were to be similar, this would present a
challenge for positional models and would point to
other mechanisms or representations driving serial re-
call.
Method
Participants and apparatus
Nineteen undergraduate and postgraduate students
from the Department of Psychology at the University of
Western Australia participated voluntarily in exchange for
course credit or remuneration of A$5/h. All participants
received both temporally grouped and ungrouped lists.
Participants had not previously participated in any
serial recall experiments; this was intended to prevent
the spontaneous grouping of ungrouped lists that is
likely to result from experience with the task. As
grouping is known to affect patterns of transpositions
(e.g., Henson, 1999; Ryan, 1969) we sought to minimise
the possibility that people spontaneously grouped stim-
uli in the ungrouped lists.
The experiment was controlled by a PC that presented
all stimuli and collected and scored all responses. The
same apparatus was used in the remaining experiments.
Materials
Lists contained 6 digits that were randomly sampled
without replacement from the set 0 through 9. Ninety lists
of each type (i.e., grouped vs. ungrouped) were con-
structed subject to two constraints: following Henson
(1996), lists did not contain ascending or descending pairs
of integers (e.g., ‘‘3 4,’’ ‘‘7 6’’). Second, an item could not
appear in the same serial position on consecutive lists.
This constraint also applied to all remaining experiments.
The 90 ungrouped lists always preceded the 90
grouped lists. This order was chosen because subjective
grouping was expected to continue once people had been
presented with grouped lists (Henson, 1999).
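The list constraints described above can be sketched as a simple rejection sampler. This is a minimal illustration, not the authors' software: the function names, the seed handling, and the reading of ascending or descending pairs as adjacent digits that differ by exactly 1 are assumptions.

```python
import random


def has_adjacent_run(items):
    """True if any neighbouring pair ascends or descends by exactly 1 (e.g. 3 4 or 7 6)."""
    return any(abs(a - b) == 1 for a, b in zip(items, items[1:]))


def repeats_position(items, previous):
    """True if any item occupies the same serial position as on the previous list."""
    return previous is not None and any(a == b for a, b in zip(items, previous))


def make_lists(n_lists=90, list_length=6, pool=range(10), seed=None):
    """Build lists by rejection sampling (hypothetical helper, assumed procedure)."""
    rng = random.Random(seed)
    pool = list(pool)
    lists, previous = [], None
    while len(lists) < n_lists:
        candidate = rng.sample(pool, list_length)  # sampling without replacement
        if has_adjacent_run(candidate) or repeats_position(candidate, previous):
            continue  # reject the candidate and draw again
        lists.append(candidate)
        previous = candidate
    return lists


ungrouped_lists = make_lists(seed=1)
```

Rejection sampling is used here only because it keeps each constraint readable as a separate check; any procedure that satisfies the same constraints would serve.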
Procedure
Each trial commenced with the message ‘‘READY,’’
displayed for 1000ms in the centre of the screen. Fol-
lowing a 1500ms blank interval, digits were presented
singly in the central screen position for 200ms each. Lists
were presented at these fast rates to further discourage
spontaneous grouping, and to ensure a sufficient number
of transposition errors for the LDF analysis. Participants
were instructed to read lists silently.
For temporally grouped lists, items were separated
by 200ms, except for the third and fourth item, which
were separated by an additional 600ms, thus yielding
two temporal groups of three items each. For ungrouped
lists, items were separated by a uniform 100 ms.3
The last item was followed by the message ‘‘Recall
list now!’’, and participants then immediately recalled
the list by typing the digits, one by one, on the number
pad of the keyboard using the index finger of the dom-
inant hand. After recall of the last item, participants
were shown a message giving their total recall time, after
which any key press initiated the next trial. There was a
200ms pause before the next ‘‘READY’’ message.
Participants were instructed to recall the digits in
forward order with an emphasis on accuracy rather than
speed. Participants were instructed to report the first
digit that came to mind if they were uncertain. Omis-
sions were not allowed; participants were to provide a
digit at every output position (to maximise frequency of
order errors). Participants were given five practice trials
and a 30-s break was provided every 18 trials. Experi-
mental sessions lasted about 45min.
Results and discussion
For all analyses in this article, responses in the first
output position with a latency of less than 100 ms were
considered to be ‘‘type-aheads’’ and were omitted from
analyses. Any non-zero latency was acceptable in later
output positions.
Fig. 5. Results of Experiments 1 and 2: serial position curves for accuracy (A), transposition gradients (B), latency serial position curves (C), and cumulative latency serial position curves (D).
Accuracy
All accuracy analyses used strict positional scoring,
such that an item was counted correct only if recalled
in its correct position. Preliminary analysis
identified two participants whose performance was at
ceiling (overall accuracy .98). Given the present emphasis
on error latencies, these individuals were
therefore removed from consideration (inclusion of
these participants does not qualitatively alter the
conclusions).
The serial position curves for correct-in-position recall
are shown in Fig. 5A. It is clear from the figure that
the grouping manipulation was successful in that performance
on the grouped lists was better overall than
performance on the ungrouped lists. Additionally, the
grouping advantage was greater for later serial positions
than earlier ones, as expected from previous studies (e.g.,
Frankish, 1985).
The pattern was statistically confirmed by a
2 (Grouping) × 6 (Serial Position) ANOVA which
revealed significant main effects of grouping,
F(1, 16) = 26.30, MSE = .023, p < .0001, and serial position,
F(5, 80) = 35.37, MSE = .005, p < .0001, and the expected
interaction between both variables, F(5, 80) = 6.86,
MSE = .003, p < .0001.
4 The small absolute number of transpositions rendered it
difficult to fit the regression models to each list type separately.
In particular, for grouped lists, the average number of ±2
displacements per participant was less than 3, with even fewer
observations for greater displacements. We therefore report
only the overall analysis across list types.
Transpositions
The transposition gradients are shown in Fig. 5B.
The pattern of transpositions conforms to standard ex-
pectations, although the high accuracy translated into
few displacements beyond immediately adjacent posi-
tions. This in turn prevented the appearance of the usual
grouping effects on transposition probabilities (Henson,
1996, 1999).
Latency
The average latencies associated with correct re-
sponses at each serial position are shown in the bottom
half of Fig. 5, in two ways. Fig. 5C shows average la-
tencies at each serial position. It is clear from the panel
that the first item took considerably longer to recall than
all subsequent items. This is typical for forward serial
recall (e.g., Anderson & Matessa, 1997; Maybery et al.,
2002; Thomas et al., 2003), and is assumed to reflect the
operation of an initial preparatory stage that precedes
the first response (Farrell & Lewandowsky, 2002;
though see Anderson & Matessa, 1997). Because this
preparatory stage is of little theoretical interest, it is de-
emphasised in Fig. 5D, which shows cumulative mean
latencies.
Immediately evident in both panels is the effect of
grouping. Latencies for grouped lists were generally
faster, although the cumulative curves suggest that this
effect was primarily the result of faster responses for the
first item. In support, separate regression lines fitted to
the cumulative curves for grouped and ungrouped lists
differed considerably in intercept (502.5 and 788ms, re-
spectively) but exhibited fairly similar slopes (326 and
360ms/position, respectively). However, grouping also
introduced a marked discontinuity in recall, with longer
latencies for serial position 4, which represents recall of
the first item of the second group. This again mirrors
previous results (Anderson & Matessa, 1997; Maybery
et al., 2002; Oberauer, 2003).
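The intercepts and slopes quoted above come from straight lines fitted to the cumulative latency curves. A minimal sketch of such a fit follows, assuming ordinary least squares with serial position (1, 2, ...) as the predictor; the function names and the example latencies are hypothetical and are not data from the experiments.

```python
from itertools import accumulate


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cumulative_latency_fit(mean_latencies_ms):
    """Regress cumulative mean latency on serial position (assumed to start at 1)."""
    positions = list(range(1, len(mean_latencies_ms) + 1))
    cumulative = list(accumulate(mean_latencies_ms))
    return fit_line(positions, cumulative)


# Hypothetical mean latencies (ms) for serial positions 1 to 6.
example_latencies = [1100, 350, 340, 360, 330, 320]
intercept_ms, slope_ms_per_position = cumulative_latency_fit(example_latencies)
```

In the cumulative representation a long first-item latency shifts the whole line upward and so appears mainly in the intercept, while the slope reflects the average time per subsequent item.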
Statistical confirmation of these patterns was obtained
by a 2 (Grouping) × 6 (Serial Position) ANOVA,
which revealed a main effect of grouping,
F(1, 18) = 72.09, MSE = 4748, p < .0001, serial position,
F(5, 80) = 125.22, MSE = 21,503, p < .0001, and an interaction
between both variables, F(5, 80) = 28.35, MSE = 4033,
p < .0001.
Hierarchical regression analysis of transposition latencies
The recall latencies were subjected to a hierarchical
regression analysis assessing the relationship between
transposition displacement and transposition latency.
Responses in the first output position were excluded
from the LDF’s for regression analyses because of their
extremely long latencies (order errors at the first output
position can only be anticipations, and inclusion of response
times from the first output position would thus
artificially inflate response times for anticipations). In
consequence, the furthest anticipations for the LDF’s
for regression were −4, corresponding to the last item
being recalled in the second output position, whereas
postponements could span 5 positions (+5), corre-
sponding to the first list item being recalled last. In order
to compensate for an apparent non-linearity, latencies
collected in all experiments were logarithmically trans-
formed for all LDF analyses.
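The coding of responses for the LDF analysis can be illustrated with a short sketch. It assumes that displacement is output position minus input position (so that the last of six items recalled second gives −4 and the first item recalled last gives +5, as in the text) and that the logarithm is the natural logarithm; the base is not stated above, and the function names and example trial are hypothetical.

```python
import math


def transposition_displacement(input_position, output_position):
    """Negative = anticipation (too early), positive = postponement, 0 = correct."""
    return output_position - input_position


def ldf_observations(presented, recalled, latencies_ms):
    """Return (displacement, output_position, log_latency) rows for one trial.

    Responses at output position 1 are skipped, as are responses that were
    not on the list or that have a non-positive latency.
    """
    rows = []
    for out_pos, (item, latency) in enumerate(zip(recalled, latencies_ms), start=1):
        if out_pos == 1 or item not in presented or latency <= 0:
            continue
        in_pos = presented.index(item) + 1
        displacement = transposition_displacement(in_pos, out_pos)
        rows.append((displacement, out_pos, math.log(latency)))  # natural log assumed
    return rows


# Hypothetical trial: items 8 and 1 are exchanged, as are 9 and 2.
rows = ldf_observations(
    presented=[3, 8, 1, 5, 9, 2],
    recalled=[3, 1, 8, 5, 2, 9],
    latencies_ms=[900, 400, 350, 300, 320, 310],
)
```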
Log-LDF’s were estimated using a hierarchical regression
model (Busing, Meijer, & van der Leeden,
1994). Hierarchical regression permits an aggregate
analysis of data from all participants without con-
founding between- and within-participant variability:
regression coefficients are estimated for each participant
separately, but their statistical significance is assessed by
considering the overall pattern of parameters across in-
dividuals. This avoids several potential pitfalls in situa-
tions in which several individuals contribute multiple
observations each to a regression analysis (see Lorch &
Myers, 1990; for a discussion of those problems).
The regression model examined here included an intercept
term plus parameters for the transposition displacement
(in the range −4 to 5) and for output position
(ranging from 2 to 6). The latter variable was included in
the regression because transposition displacement is
correlated with output position: anticipations will tend
to occur at the start of recall, while postponements will
tend to occur at the end of recall. Of critical interest was
whether the slope of the function relating recall latencies
to transposition displacement was negative when the
effects of output position were accounted for. Regression
parameters were estimated on the basis of all available
responses (i.e., from grouped and ungrouped lists) si-
multaneously.4
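The logic of estimating coefficients per participant and testing them across participants can be sketched with a simplified two-stage procedure. This is a stand-in for illustration, not the maximum likelihood method of Busing et al. (1994) used in the article: each participant gets an ordinary least squares fit, and each coefficient is then tested against zero with a one-sample t statistic. All function names and the row format (displacement, output position, log latency) are assumptions.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_participant(rows):
    """Stage 1: OLS of log latency on displacement and output position.

    Returns [intercept, displacement slope, output-position slope].
    """
    design = [(1.0, d, p) for d, p, _ in rows]
    y = [row[2] for row in rows]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(3)]
    return solve(xtx, xty)


def one_sample_t(values):
    """t statistic for the mean of values against zero (needs non-zero variance)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(variance / n)


def group_level(rows_by_participant):
    """Stage 2: average each coefficient over participants and test it against zero."""
    coefs = [fit_participant(rows) for rows in rows_by_participant]
    names = ("intercept", "displacement", "output_position")
    return {
        name: {
            "mean": sum(c[i] for c in coefs) / len(coefs),
            "t": one_sample_t([c[i] for c in coefs]),
        }
        for i, name in enumerate(names)
    }
```

Because output position enters the same regression, the displacement slope is estimated with output position held constant, which is the comparison of interest here.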
The maximum likelihood estimates of those parameters
(see Busing et al., 1994, for computational details),
averaged across participants, were 5.75, −.042, and
−.013 for intercept, displacement, and output position,
respectively. The negative parameter for displacement
indicates that responses get faster with increasing (more
positive) transposition displacements, suggesting that
postponements are faster than anticipations.
The effect of displacement is shown graphically in
Fig. 6, which shows average latencies at each displacement
separately for ungrouped and grouped lists. To
facilitate graphical presentation, the effects of output
position have been subtracted in Fig. 6 (and Fig. 9 for
Experiment 3) by calculating the mean latency for each
output position for each participant, and then
subtracting that mean from each individual response
made at that output position (because this filtering removes
output position effects position by position, responses
for the first position were retained in the
figures). This filtering renders some latencies negative,
preventing the use of logarithmic ordinates. Fig. 6
therefore shows the effects of transposition displacement
on latency, with the effects of output position removed.
Fig. 6. Latency–displacement functions for Experiments 1 and 2.
The apparent negative relationship between latency
and transposition displacement was given qualified statistical
support by the t values accompanying the parameter
estimates, which were 87.41 (p < .0001), 1.86
(p ≈ .06), and −1.73 (p ≈ .08), for intercept, displacement,
and output position, respectively. This shows that
anticipations were slower than postponements, although
the effect failed to reach conventional levels of significance.
The lack of significance was at least partially due
to the presence of large individual differences. To give an
idea of variability in the LDF slopes, Fig. 7 shows the
observed individual slope estimates for all experiments;
those for Experiment 1 are in the top row. Although the
average slope of the individual LDF’s was negative,
several participants exhibited a positive slope.
Fig. 7. Estimated slopes from hierarchical regressions relating recall latency to transposition displacement. Each panel gives estimates for individual participants for a particular experimental condition. The top row gives slope estimates for Experiment 1, the middle panels those for Experiment 2, while the bottom panels correspond to Experiment 3. The full vertical line in each panel indicates a slope of 0.
A limitation of the transposition latency analysis was
that most participants were extremely accurate. Despite
exclusion of participants at the ceiling, the mean accuracy
in the remaining sample was .86. This ceiling on
accuracy also prevented grouping effects from emerging,
preventing a detailed examination of the latencies of
interpositions. The scarcity of transpositions was ad-
dressed in Experiment 2.
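The output-position filtering used for the latency–displacement figures in Experiment 1 (subtracting each participant's mean latency at an output position from every response made at that position) can be sketched as follows. The data layout, the function names, and the final averaging by displacement are assumptions made for illustration.

```python
from collections import defaultdict


def remove_output_position_effect(responses):
    """Subtract each participant's mean latency at an output position.

    responses: list of (participant, output_position, displacement, latency_ms).
    Returns a list of (participant, displacement, residual_latency_ms).
    """
    cells = defaultdict(lambda: [0.0, 0])  # (participant, position) -> [sum, count]
    for participant, out_pos, _, latency in responses:
        cell = cells[(participant, out_pos)]
        cell[0] += latency
        cell[1] += 1
    filtered = []
    for participant, out_pos, displacement, latency in responses:
        total, count = cells[(participant, out_pos)]
        filtered.append((participant, displacement, latency - total / count))
    return filtered


def mean_by_displacement(filtered):
    """Average the residual latencies at each displacement (pooled over participants)."""
    groups = defaultdict(list)
    for _, displacement, residual in filtered:
        groups[displacement].append(residual)
    return {d: sum(v) / len(v) for d, v in sorted(groups.items())}
```

Residuals centred in this way can be negative, which is why the filtered latencies cannot be plotted on a logarithmic axis.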
Experiment 2
In an attempt to lower the accuracy of recall per-
formance, and thereby increase the number of transpo-
sitions, the second experiment used a larger
experimental vocabulary (consonants rather than digits)
and included a post-presentation interference task on
some trials. As well as lowering recall, the interference
task was intended to determine the generality of the ef-
fects observed in Experiment 1; in particular, we were
interested in ascertaining whether negative LDF’s would
be observed at filled retention intervals longer than the
duration commonly cited for the presumed decay of
traces (i.e., around 2 s; see Brown & Hulme, 1995, for a
discussion).
Method
Participants and materials
Fifteen members of the campus community at the
University of Western Australia participated voluntarily
in exchange for course credit or remuneration of A$5/h.
Lists were drawn from a vocabulary of 16 consonants
(B, C, F, G, H, J, K, L, N, P, Q, R, S, V, X, Z). For each
participant, 140 lists were constructed by randomly
sampling six items without replacement from the vo-
cabulary subject to the constraint that no two adjacent
list items could be alphabetically consecutive.
Procedure
All participants completed two blocks of 70 contiguous
trials, one with and one without a post-list distractor
task, with the order of list blocks counterbalanced
across participants. Each
trial commenced with the word ‘‘READY’’ centrally
presented for 1000ms, followed by a 1000ms blank
pause. List items were then presented for 400ms, with a
100ms inter-item interval.
If a list was followed by distractors, a pause of 500ms
was inserted between the last list item and presentation
of the distractor(s). The 4 distractor digits, randomly
selected from the set 1 through 9, were presented in the
same manner as list items. Participants were instructed
to read list items and distractors aloud.
The recall phase differed slightly from the previous
study in that responses were entered using a 4 × 4 grid
on the main keyboard that maintained alphabetical order
among vocabulary items and did not conform to
der among vocabulary items and did not conform to
QWERTY lay-out. As before, participants were to type
the letters one by one using their dominant index finger,
and entered letters remained visible on the screen until
the sixth item had been entered. Omissions could be
recorded in this experiment via space bar, but
extra-vocabulary intrusions were prevented. The last
response was followed by feedback in the form of the
total retrieval time.
There were 30-s breaks after every 35 trials. Sessions
lasted under an hour.
Results and discussion
Serial position curves and transposition gradients
One participant’s overall accuracy (.24) was far below
the mean of the sample, and another participant appeared
not to engage with the task on many trials (all
responses consisted of space bar presses). Both partici-
pants were removed from further consideration.
The serial position curves for this experiment are
shown in Fig. 5A.
The obvious deleterious effect of the distractor task
on accuracy was confirmed by the corresponding main
effect in a 2 × 6 within-participants ANOVA with
F(1, 14) = 180.70, MSE = .021, p < .0001. The analysis
also showed an effect of serial position, F(5, 60) = 56.99,
MSE = .0081, p < .0001, and an interaction between the
two variables, F(5, 60) = 3.40, MSE = .0084, p < .01.
The underlying transposition gradients are shown in
Fig. 5B. They confirm the results of the serial position
analysis, with more anticipation and postponement errors
for the interference lists.
The latency analysis approximately paralleled the
accuracy results, with a strong effect of serial position,
F(5, 70) = 88.06, MSE = 79,508, p < .0001, an interaction
between both experimental variables,
F(5, 70) = 3.245, MSE = 53,281, p < .05, but no main
effect of distractor task, F(1, 12) = 3.13, p > .10. Regressions
using the cumulative latency serial position
curves gave intercepts for the quiet and distractor conditions
of 979 and 1385ms, with slopes 1025 and
1059ms/position, respectively (see Figs. 5C and D).
None of these effects were surprising; they merely con-
firm that the distractor manipulation had the expected
effect of reducing accuracy and simultaneously slowing
recall.
Latency–displacement functions
As for Experiment 1, parameters for the latency–
displacement functions were estimated using hierarchical
regression. The increased frequencies of errors in Ex-
periment 2 allowed three analyses to be conducted: one
for each distractor condition separately and one that
combined responses across all trials for each participant.
Table 2 shows regression statistics for the overall anal-
ysis and Fig. 7 the individual parameter estimates.
Replicating Experiment 1, there was a statistically
significant negative relationship between latency and
displacement in the overall analysis. These effects are
illustrated in Fig. 6, which shows the LDF’s for each
condition (subtracting the effect of output position as in
Experiment 1 to facilitate graphical presentation). In
confirmation of the regression, there was a clear trend
for latencies to decline with transposition displacement,
although that trend was greater for the lists without
interference task. Fig. 7 shows that the displacement
parameter estimates were remarkably stable across individuals
and within individuals across list types. Every
single participant gave rise to a negative displacement
parameter for both list types. This highlights the generality
of the LDF’s found in Experiment 1, not only by
showing their presence using a different experimental
vocabulary, but more importantly, by showing negative
LDF’s at slightly longer retention intervals.
Table 2
Estimated hierarchical regression parameters for Experiment 2
Parameter            Estimate    SE       t       p
Combined analysis (all trials)
  Intercept             6.80    0.06   105.05    .00
  Displacement         −0.04    0.01    −4.85    .00
  Output position      −0.02    0.01    −2.30    .00
Quiet lists
  Intercept             6.74    0.06   120.30    .00
  Displacement         −0.09    0.02    −4.89    .00
  Output position       0.00    0.01    −0.41   >.10
Distractor lists
  Intercept             6.88    0.09    74.14    .00
  Displacement         −0.02    0.01    −1.81    .07
  Output position      −0.04    0.02    −2.63    .01
Nonetheless, the size of the displacement effect was
not very large, which may have reflected the restricted
range of possible displacements: transposition distances
in the first two experiments ranged from a minimum of
−5 to a maximum of 5 (−4 to 5 in the hierarchical regression
analysis), and there were few transpositions at
large absolute displacements. It follows that a better
assessment of the latency–displacement function can be
obtained by increasing the range of possible displace-
ments and the number of observations at the extremes.
Experiment 3
Experiment 3 differed from the first two studies pri-
marily by using longer lists of 9 digits. The use of longer
lists was motivated by two goals; first, to lower accuracy
of recall and thus increase the frequency of transposition
errors and, second, to replicate the observed negative
LDF’s beyond sub-span list lengths.
Experiment 3 again employed a grouping manipula-
tion—trials in the first half of the experiment involved
ungrouped presentation and the second half involved
grouped lists with pauses after the third and sixth item
(i.e., 3–3–3). It has been suggested that grouping is more
likely in supra-span lists (Henson, 1996), which implies
that the effects of grouping on the LDF’s may be more
apparent with longer lists.
Method
Participants
A new sample of 26 undergraduate and postgraduate
students from the Department of Psychology at the
University of Western Australia participated voluntarily
in exchange for course credit or remuneration of A$5/h.
Materials and procedure
Experiment 3 used the same pool of stimuli and the
same criteria for list construction as Experiment 1, ex-
cept that list length was now 9 items. The remaining
procedural details followed those of Experiment 1, with
three exceptions: first, grouped lists were characterised
by two pauses, inserted after the third and the sixth item,
with all other temporal parameters for both list types
remaining unchanged. Second, there were only 4 prac-
tice trials and, third, each block of 18 trials was followed
by a 30 s break.
Results and discussion
Serial position curves and transposition gradients
Fig. 8 shows the serial position curves for both list
types, with accuracy shown in Fig. 8A and latency
(correct responses only) in Figs. 8C and D. The scal-
loped serial position curve for grouped presentation is
typical of experiments involving temporal grouping and
is particularly pronounced for latency. The intercept
estimates for the cumulative latency serial position
curves were 639 and 652ms, respectively, for the un-
grouped and grouped conditions, with slopes 528 and
502ms/position.
The transposition gradients, shown in Fig. 8B, con-
firm that people treated the grouped and ungrouped lists
differently. In particular, the scalloped form of the
transposition gradient for grouped lists, viz. the flattening
of the curve at transposition displacement ±3 and
the local peak at transposition displacement +6, indi-
cates that when items were exchanged between groups,
they maintained their within-group position (e.g., Ryan,
1969).
Latency–displacement functions
The hierarchical regression models again included an
intercept term plus parameters for the transposition
distance (in the range −7 to 8) and for output position
(ranging from 2 to 9). As for Experiment 2, there were
enough observations at far displacements to fit separate
models to each list type. Table 3 summarises the pa-
rameter estimates (averaged across participants) for each
list type and for an overall analysis that ignored the
grouping manipulation. Irrespective of list type, there
was a strong and statistically significant negative relationship
between latency and displacement, which is
graphically illustrated in Fig. 9. Although Fig. 9 shows
some unique deviations (for example, around transposition
displacement +5), the overall trend is a negative
one; the further a response was erroneously anticipated,
the slower it was, and the further a response was erroneously
postponed, the faster it was made. To illustrate
the consistency of this effect across participants, Fig. 7
shows the distribution of individual estimates for the
displacement parameter. The figure shows that those
estimates were negative for every participant for
grouped lists, and for all but three participants in the
ungrouped condition.
Fig. 8. Results of Experiment 3: serial position curves for accuracy (A), transposition gradients (B), latency serial position curves (C), and cumulative latency serial position curves (D).
Table 3
Estimated hierarchical regression parameters for Experiment 3
Parameter            Estimate    SE       t       p
Grouped
  Intercept             5.92    0.06   105.36    .00
  Displacement         −0.03    0.01    −3.78    .00
  Output position       0.00    0.01     0.47   >.10
Ungrouped
  Intercept             5.93    0.07    84.72    .00
  Displacement         −0.02    0.01    −3.33    .00
  Output position       0.02    0.01     1.46   >.10
A final point worth noting about the LDF’s for Experiment 3
is that the observed effect of grouping in the
transposition gradients is not apparent in the corresponding
latencies in the LDF’s. The interpositions (i.e.,
at transposition displacements −6, −3, +3, and +6) were
not accompanied by shorter recall times at the same
displacement distances in the LDF’s; the only effect of
grouping appeared to be a flattening of the LDF. This
confirms that probability and latency patterns in serial
recall can be dissociated, and it challenges positional
models, which can handle the pattern of interpositions
but not their independence from the associated latencies.
Fig. 9. Latency–displacement functions obtained in Experiment 3.
Summary of experiments
The three experiments consistently showed that la-
tency was a monotonic negative function of transposi-
tion displacement. The generality of this finding is
underscored by the fact that it was observed for different
list lengths (6-item lists in Experiments 1 and 2 and
9-item lists in Experiment 3), stimulus materials (digits
in Experiments 1 and 3 and letters in Experiment 2),
temporal arrangements of lists (grouped vs. ungrouped
in Experiments 1 and 3), and distractor conditions
(Experiment 2). In particular, negative trends were witnessed
in the LDF’s over time scales beyond the presumed
duration of decaying traces (e.g., Brown &
Hulme, 1995). Moreover, the negative relationship be-
tween latency and displacement persisted despite the fact
that some of those manipulations caused qualitative
changes in other aspects of recall; for example, grouping
clearly altered the serial position curves and transposi-
tion gradients in Experiment 3 but had no systematic
effects on the associated LDF’s.
The generality of the negative latency–displacement
relationship is further underscored by its presence in
other experiments not reported here. For example, the
phonological similarity experiments reported by Farrell
and Lewandowsky (2003), a further unpublished experiment
in the authors’ laboratory, and a study by
Duncan (1996), are all characterised by negative LDF’s.
5 This scaling of the initial activations determines the
amount of information they provide with respect to the noise
resident in the network. The weighting of initial activations will
also determine the distance of each item’s activation from the
output threshold.
Quantitative modelling of observed LDFs
To confirm the impact of the data on the represen-
tational principles examined at the outset, we now re-
port quantitative fits of the models to the data from
Experiment 3. Although the models predicted similar
serial position curves and transposition gradients
(Fig. 3), the predicted LDF’s differed widely (Fig. 4). In
particular, while any involvement of positional marking
resulted in a V-shaped function relating latency to dis-
placement, the combination of a primacy gradient and
response suppression predicted a monotonic decrease in
recall times with increasing (i.e., more positive) displacement
of transpositions. Examination of the obtained
LDF’s in Figs. 6 and 9 suggests that the data
mirror the pattern predicted by the primacy-gradient
model, although there is some suggestion of a kink in
these curves for correct responses (i.e., displacement 0).
One restriction of the simulations presented at the
outset was that they yielded qualitative predictions
based on parameter values selected to give realistic serial
position curves and transposition gradients. This does
not preclude the possibility that some of the representational
principles might predict more realistic LDF’s
under different parameter values. To examine this possibility,
each of the earlier models was fit to the accuracy
serial position curve and transposition gradient for the
ungrouped condition of Experiment 3 (these fits did not
consider the latency data). The best-fitting parameter
values were then used to obtain latency predictions (serial
position curves and LDF’s) for comparison with
those observed in Experiment 3.
Fitting details
Models were fit to the data for individual subjects,
and the model predictions for individual subjects were
then averaged to give overall predictions. The mod-
elling procedure was identical to that used to generate
the predictions at the outset, except that parameter
values were adjusted (using the simplex algorithm of
Nelder & Mead, 1965) to minimise the RMSD be-
tween the data (summed across accuracy serial posi-
tion curve and transposition gradient) and each
model.
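The fitting criterion described above can be sketched as an objective function. This is an illustration under stated assumptions: the RMSD is computed separately for the accuracy serial position curve and the transposition gradient and the two values are added (one reading of "summed across"), `toy_model` is a hypothetical placeholder rather than any of the four models, and a coarse grid search stands in for the Nelder and Mead (1965) simplex search used in the article.

```python
import math


def rmsd(observed, predicted):
    """Root mean squared deviation between two equally long sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def fit_objective(params, model, observed_spc, observed_gradient):
    """RMSD summed over the serial position curve and the transposition gradient."""
    predicted_spc, predicted_gradient = model(params)
    return rmsd(observed_spc, predicted_spc) + rmsd(observed_gradient, predicted_gradient)


def toy_model(params):
    """Hypothetical placeholder model, NOT one of the models in the article."""
    start, decay = params
    spc = [start * decay ** i for i in range(9)]        # accuracy, positions 1 to 9
    gradient = [decay ** abs(d) for d in range(-2, 3)]  # displacements -2 to +2
    return spc, gradient


def grid_search(model, observed_spc, observed_gradient, grid):
    """Stand-in optimiser: return the grid point with the smallest objective."""
    return min(grid, key=lambda p: fit_objective(p, model, observed_spc, observed_gradient))
```

The same objective could be passed to a simplex routine instead, for example `scipy.optimize.minimize` with `method="Nelder-Mead"`, with the latency data left out of the criterion as in the text.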
The parameters minimised in the fitting routine
differed between the models. For the pure item-mark-
ing model (IT from here on), the two parameters were
the distinctiveness of the positional markers (φ), and
the weighting of the initial activation of this information.5
When item marking was augmented by output
interference (IT+OI), three parameters were minimised:
the distinctiveness of the markers (φ), the
weighting of initial activations, and the weighting of
output interference across output positions. The item-
marking plus response suppression model (IT+RS)
likewise had three free parameters, the first two as for
the pure item-marker model, and a third parameter
that governed the extent of response suppression. Fi-
nally, the primacy-gradient model with response sup-
pression (PR+RS) took as its three free parameters the
steepness of the primacy gradient (c), the starting point
of the gradient (a1; effectively the weighting of this
information as for the other models), and the extent of
response suppression.
Fitting results and discussion
Table 4
Minimised RMSD for individual participants’ data
Participant  IT    IT+OI  IT+RS  PR+RS
1            0.22  0.11   0.16   0.11
2            0.22  0.09   0.14   0.07
3            0.22  0.13   0.17   0.11
4            0.35  0.18   0.26   0.11
5            0.40  0.17   0.29   0.13
6            0.14  0.14   0.10   0.14
7            0.08  0.08   0.08   0.17
8            0.32  0.09   0.22   0.09
9            0.36  0.19   0.29   0.21
10           0.45  0.23   0.34   0.19
11           0.40  0.24   0.30   0.19
12           0.38  0.17   0.29   0.09
13           0.32  0.14   0.23   0.08
14           0.30  0.14   0.21   0.09
15           0.22  0.14   0.14   0.12
16           0.41  0.18   0.28   0.12
17           0.30  0.11   0.22   0.11
18           0.24  0.12   0.17   0.10
19           0.38  0.21   0.24   0.11
20           0.41  0.18   0.29   0.09
21           0.34  0.17   0.27   0.09
22           0.20  0.14   0.15   0.12
23           0.42  0.21   0.33   0.14
24           0.31  0.18   0.22   0.13
25           0.37  0.21   0.25   0.11
26           0.33  0.17   0.24   0.09
Mean         0.31  0.16   0.23   0.12
Each column gives the RMSDs for a particular model. Bold
numbers show the lowest RMSD for each row, indicating the
best-fitting model for that participant.
Table 4 presents the minimised RMSD for each
model and each participant, with the value of the
best-fitting model (smallest RMSD) bold-faced for each
participant. Although emphasis here was not on good-
ness-of-fit to the accuracy data, a brief examination of
differences in RMSD between models is in order. Given
that it incorporated one fewer parameter than the other
models, it is perhaps unsurprising that the pure item-
marking model (IT) gave the poorest fit. This poor fit
can be mostly attributed to the symmetry of the pre-
dicted serial position curves (see Fig. 10 for an example).
The other three models fared better to varying extents.
In particular, the PR+RS model gave somewhat better
fits on average than the IT+OI model, and both these
models in turn performed notably better than the
IT+RS model (note that these models all incorporate
three parameters).
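The row-wise comparison in Table 4 can be sketched as follows. The helper `best_models` is hypothetical; the two rows are copied from Table 4. Because the tabled values are rounded to two decimals, the sketch reports ties rather than choosing a single winner, which is an assumption about how ties should be handled.

```python
MODELS = ("IT", "IT+OI", "IT+RS", "PR+RS")

def best_models(rmsds, tol=1e-9):
    """Return the names of all models that share the smallest RMSD in a row."""
    lowest = min(rmsds)
    return [name for name, value in zip(MODELS, rmsds) if abs(value - lowest) <= tol]

# Rows copied from Table 4 (participants 4 and 1).
row_p4 = (0.35, 0.18, 0.26, 0.11)
row_p1 = (0.22, 0.11, 0.16, 0.11)

print(best_models(row_p4))  # ['PR+RS']
print(best_models(row_p1))  # ['IT+OI', 'PR+RS'], a tie at two decimal places
```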
Fig. 10 shows the averaged fit of the models to the
aggregate data. The IT+OI and PR+RS models pre-
dicted more realistic serial position curves (compared
with Fig. 8) than the IT and IT+RS models, and both
gave closer accounts of the observed transposition gra-
dient (Fig. 8B).
Figs. 10C and D show the predicted latency serial
position curves (standard on the left and cumulative on
the right). In order to match the appearance of predic-
tions (Fig. 10) and data (Fig. 8), the predictions for the
first output position included time for presumed prepa-
ratory processes—this was set to a constant 40 iterations
for all models. It is clear that the predictions of all
models differ somewhat from the data (e.g., the un-
grouped condition in Fig. 8C), but in interestingly dif-
ferent ways. Unlike the other models, the PR+RS
model predicts a monotonic latency serial position
curve. This handles the observed slowing down of recall
over the majority of serial positions, but it does not
capture the acceleration for the last two serial positions.
The IT, IT+OI, and IT+RS models, by contrast, pre-
dict this saddle point, but they expect it to be much
earlier in the list than is observed in the data. Despite
these differences, all models capture the approximately
linear pattern of the cumulative latency serial position
curve (Figs. 8D and 10D).
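The relation between the standard and cumulative latency curves can be sketched as below. The constant of 40 iterations for the first output position follows the text; the function name `latency_curves` and the per-position values are hypothetical.

```python
from itertools import accumulate

PREPARATION_ITERATIONS = 40  # constant added to the first output position, as in the text

def latency_curves(iterations_per_position):
    """Return (standard, cumulative) latency serial position curves in iterations."""
    standard = list(iterations_per_position)
    if not standard:
        raise ValueError("at least one output position is required")
    standard[0] += PREPARATION_ITERATIONS
    cumulative = list(accumulate(standard))
    return standard, cumulative

# Hypothetical per-position retrieval times in model iterations.
standard, cumulative = latency_curves([20, 25, 30, 35])
print(standard)    # [60, 25, 30, 35]
print(cumulative)  # [60, 85, 115, 150]
```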
Turning to latency–displacement functions, Fig. 11
shows the predictions obtained with the same parameter
settings that underlie Fig. 10. Unlike the qualitative
predictions presented at the outset (Fig. 4), the effect of
output position in these predicted LDFs was removed
in exactly the same manner as in the empirical LDFs to
provide comparability with the data. In addition, to
further enhance graphical comparability, the predicted
LDFs were converted from model iterations to milliseconds
using two scaling parameters for each model: an
intercept (in milliseconds), and an iteration-to-millisec-
ond slope, obtained by entering the latency serial posi-
tion curve for the ungrouped condition from
Experiment 3 and its predicted counterpart into a re-
gression as dependent and independent variable, re-
spectively.
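The scaling step amounts to an ordinary least-squares regression of observed latencies (in ms) on predicted latencies (in iterations). A minimal sketch follows; the function names `fit_scaling` and `to_ms` and all numbers are hypothetical, chosen so that the fitted line is exact.

```python
def fit_scaling(predicted_iterations, observed_ms):
    """Least-squares intercept (ms) and slope (ms per iteration)."""
    n = len(predicted_iterations)
    if n != len(observed_ms) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(predicted_iterations) / n
    mean_y = sum(observed_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted_iterations)
    if sxx == 0:
        raise ValueError("predicted latencies must vary")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predicted_iterations, observed_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def to_ms(iterations, intercept, slope):
    """Convert model iterations to milliseconds with the fitted scaling."""
    return [intercept + slope * value for value in iterations]

# Hypothetical latency serial position curves (model iterations, observed ms).
predicted = [60, 25, 30, 35]
observed = [900.0, 550.0, 600.0, 650.0]
intercept, slope = fit_scaling(predicted, observed)
```

The same intercept and slope would then be applied to the predicted LDF of that model, for example `to_ms(ldf_iterations, intercept, slope)`.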
The predicted LDFs shown in Fig. 11 do not
qualitatively differ from those shown at the outset
(Fig. 4), suggesting that the predictions that guided our
research represented core properties of the models and
were not tied to particular parameter values. It is also
immediately apparent from Fig. 11 that the IT and
IT+RS models predict a V-shaped LDF, and the
PR+RS model predicts a monotonic negative function
that is flatter for postponements (to the right of
transposition displacement 0) than for anticipations
(left of 0). The IT+OI model also predicts an asym-
metric V-shaped function, though the variability in
latency is much smaller than for the other models (the
underlying V-shaped predictions of this model were
confirmed by estimates for the slope of the LDF
of −3.88 and 0.46 for points left and right of 0,
respectively).
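Separate slopes for the two arms of an LDF can be estimated as in the following sketch. Including displacement 0 in both arms is an assumption, as are the function names and the V-shaped example values, which are not taken from any of the fitted models.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def ldf_slopes(ldf):
    """ldf maps transposition displacement to mean latency.

    Returns (anticipation_slope, postponement_slope); displacement 0 is
    included in both arms (an assumption made for this sketch).
    """
    left = sorted((d, t) for d, t in ldf.items() if d <= 0)
    right = sorted((d, t) for d, t in ldf.items() if d >= 0)
    if len(left) < 2 or len(right) < 2:
        raise ValueError("each arm needs two or more points")
    left_slope = ols_slope([d for d, _ in left], [t for _, t in left])
    right_slope = ols_slope([d for d, _ in right], [t for _, t in right])
    return left_slope, right_slope

# Hypothetical asymmetric V-shaped LDF (latency in ms).
example_ldf = {-2: 1200.0, -1: 1100.0, 0: 1000.0, 1: 1050.0, 2: 1100.0}
print(ldf_slopes(example_ldf))  # (-100.0, 50.0)
```

A negative left slope paired with a positive right slope marks a V-shaped function, whereas two negative slopes mark a monotonic negative one.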
Comparison of these predictions to the corresponding
data in Fig. 9 provides support for the PR+RS
model: the LDF for the ungrouped condition in
Experiment 3 (the probability data from that condition
were used to estimate parameters) shows an overall
monotonically negative trend, with the function being
flatter for postponements than for anticipations.6 The IT
and IT+RS models, by contrast, predict an excessive
extent of non-monotonicity and the IT+OI model
predicts too shallow a function. Note that although the
PR+RS model deviates somewhat from the empirical
LDF, it was the only one of the four models investigated
that qualitatively captured the results. Considering that
the models were not fitted to the LDFs directly, the
correspondence between the PR+RS predictions and
the data is noteworthy.
Fig. 10. Fits of models to data from the ungrouped condition in Experiment 3. Panels give serial position curves for accuracy (A),
transposition gradients (B), latency serial position curves (C), and cumulative latency serial position curves (D).
Fig. 11. Predicted latency–displacement functions after fitting
models to data from the ungrouped condition of Experiment 3.
The LDFs have been converted to ms using scaling parameters
calculated from the latency serial position curves (Fig. 8).
6 Mike Page pointed out that the primacy-gradient model we
implemented does not exactly match the primacy model of Page
and Norris (1998). To ensure the specific form of the primacy
gradient did not unduly affect the results, we fit (to the data from a
subset of participants) another version of the PR+RS model in
which the primacy gradient was linear and decayed across output
positions; this more closely corresponds to the assumptions of
the Page and Norris model. We found this form of the PR+RS
model fit the probability data less well than the model we
employed, but gave qualitatively similar predictions for the
latency serial position curves and LDFs.
Robustness of predictions
Although the fitting exercise ruled out the possibility
that models other than the PR+RS model might give
respectable accounts of the LDFs when parameters are
estimated from the data, the preceding predictions are
still based on single (best-fitting) sets of parameter
values. Hence, the preceding simulations do not
comprehensively show that the LDFs follow from
principles of the models, and indeed, the models might
predict qualitatively different results for different
parameter values.
To confirm the consistency of the predicted LDF
patterns for each model (cf. Li, Lewandowsky, &
DeBrunner, 1996), we ran a further set of simulations
in which the predictions of the models were examined
for a range of parameter combinations. Parameters in
each model were varied independently from 0.05 to
0.95 in steps of 0.1 and crossed factorially, leading to a
set of points (each a parameter vector) on a grid
covering a large portion of the parameter space of each
model. Thus, for the IT+OI, IT+RS, and PR+RS
models, the 10 values for each of the three parameters
were factorially combined to give 1000 (10^3) parameter
combinations to be examined (the IT model, having
only two free parameters, was run on 100 parameter
combinations). For each grid point, 1000 replications
of performance on 9-item lists were generated for each
model. The dependent variable of interest was the
slope of the LDF for postponements; since the slope of
anticipations is negative for all models, only post-
ponement latencies serve to discriminate between the
models.7
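The factorial grid described above can be generated as in this sketch. The range (0.05 to 0.95 in steps of 0.1) follows the text; the function name `parameter_grid` is hypothetical, and the model simulations that would be run at each grid point are omitted.

```python
from itertools import product

def parameter_grid(n_parameters, start=0.05, step=0.1, n_values=10):
    """Factorially cross n_values equally spaced settings for each parameter."""
    # Rounding guards against floating-point drift such as 0.9500000000000001.
    values = [round(start + i * step, 2) for i in range(n_values)]
    return list(product(values, repeat=n_parameters))

three_param_grid = parameter_grid(3)  # IT+OI, IT+RS, and PR+RS
two_param_grid = parameter_grid(2)    # IT
print(len(three_param_grid), len(two_param_grid))  # 1000 100
```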
Fig. 12 shows the distributions of LDF slopes for
postponements for each model. It is clear that the IT
model (Fig. 12A), the IT+OI model (B) and the IT+RS
model (C) all predict a majority of steep positive slopes
for postponements; the percentage of simulations that
returned negative slopes was 1.2, 1.9, and 2.6% for the
respective models. In contrast, the PR+RS model pre-
dicts a majority of negative slopes (much of the distri-
bution lies to the left of 0); the percentage of simulations
returning negative slopes was 75.8%. Moreover, when
the PR+RS model does predict positive slopes, they are
generally quite shallow. Overall, these simulations
clearly confirm that the LDF patterns predicted from
the models follow from the key principles under dis-
cussion, and are not specific to particular parameter
values.
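The summary statistic reported for each model can be sketched as follows. Grid points at which no postponement slope could be calculated are excluded from the denominator, as footnote 7 describes; representing such cases as `None` is an assumption, and the slopes listed are hypothetical.

```python
def percent_negative(slopes):
    """Percentage of defined slopes that are negative; None marks an undefined slope."""
    defined = [s for s in slopes if s is not None]
    if not defined:
        raise ValueError("no defined slopes")
    negative = sum(1 for s in defined if s < 0)
    return 100.0 * negative / len(defined)

# Hypothetical postponement slopes from six grid points; two were undefined.
slopes = [-12.0, 3.5, None, -0.4, 8.1, None]
print(percent_negative(slopes))  # 50.0
```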
7 The predicted LDFs had the effects of output position
removed in the same manner as the LDFs generated from
best-fitting parameter values. In some cases LDF slopes could not be
calculated due to perfect or near-perfect performance (which
eliminates postponements). The reported percentages of slope
values were determined only from simulations that returned
LDFs from which a slope for postponements could be
calculated.
General discussion
The results of the experiments and of the quantitative
modelling provide consistent support for the notion that
serial recall is driven by a primacy gradient of item
strengths that is coupled with suppression of items once
they have been recalled. No other model consistently
predicts a monotonic negative latency–displacement
function, even when parameters are free to vary to ac-
commodate specific results, or are varied arbitrarily
across a wide range. The data consistently show that the
relationship between latency and displacement is indeed
negative, even across manipulations of variables such as
list length, grouping, and type of material.
Implications for primacy-gradient models
Our theoretical analysis lends support to primacy-
gradient models such as the primacy model (Page &
Norris, 1998) and SOB (Farrell & Lewandowsky, 2002),
complementing other independent sources of evidence
for these models (e.g., Duncan & Lewandowsky, in
press; Farrell & Lewandowsky, 2003). Simulations
conducted previously (Farrell, 2001) complement the
present modelling by showing that the predicted LDF of
SOB is very similar to that predicted by the PR+RS
model in the lateral inhibition framework—given this
similarity, SOB simulations are not reported here (see
Farrell, 2001). Unlike the models under discussion, SOB
naturally accounts for the dynamics of recall by imple-
menting these assumptions in a distributed, recurrent
connectionist network (cf. Anderson, Silverstein, Ritz, &
Jones, 1977).
Although primacy-gradient models have been subject
to recent criticism (e.g., Haberlandt et al., in press;
Henson, 1999; Surprenant et al., in press), those cri-
tiques have been limited to showing that a primacy
gradient alone is insufficient to account for all aspects of
serial recall and needs to be supplemented by some other
ordering mechanism (e.g., item marking; see Page &
Norris, 1998). We do not take issue with those conclu-
sions, as pure primacy-gradient models are indeed in-
sufficient to account for effects such as grouping (e.g.,
Farrell, 2001; Henson, 1996; Page & Norris, 1998). In-
stead, we believe that results such as those presented
here identify a necessary role for the primacy gradient
and response suppression in forward serial recall.
Latencies in models of serial recall
We noted at the outset that one major restriction of
most current models is their inability to make latency
predictions. Our results highlight this restriction by
showing the utility of latency data in differentiating be-
tween models. We next examine ways in which current
models might be extended into the latency domain.
Fig. 12. Distributions of slopes of LDFs for postponements only, across the parameter space of models. See text for details of
simulations. Panels show distributions for IT model (A); IT+OI model (B); IT+RS model (C); and PR+RS model (D). The heavy
vertical line in each panel represents a slope of 0.
Most current models of serial recall distinguish be-
tween the mechanism for representation of order and a
separate mechanism for response selection and output.
We classify models on the basis of their approach to
response selection.
On the one hand, there are models that select a re-
sponse by some form of a ‘‘winner-take-all’’ process. For
example, the primacy model simply chooses the most
active localist node (Page & Norris, 1998), which is
similar to the architecture used here and the model of
Burgess and Hitch (1999). Alternatively, some models
use a matching process in which the output of the or-
dering mechanism is compared to a pool of recallable
items, the most similar of which is selected for output
(Brown et al., 2000; see also Lewandowsky & Murdock,
1989; Neath, 1999). In those models, the process of se-
lecting a specific item for output could be modelled by a
lateral inhibition network as presented here. All that is
required is that the ordering mechanism in a model
provides information that can be converted into a
starting pattern of activations across the units. This
mapping could be trivially achieved in the primacy
model (Page & Norris, 1998; see also Grossberg, 1978),
where order is already represented in the activations of
localist units. This stage could also be implemented in
distributed models such as OSCAR (Brown et al., 2000)
or TODAM (Lewandowsky & Murdock, 1989) by
instituting some mapping (via weights) from distributed
representations in the memory models to localist repre-
sentations in the lateral inhibition output stage. Al-
though distributed attractor models have been shown to
provide successful accounts of response selection in se-
rial recall (e.g., Lewandowsky, 1999; Lewandowsky &
Farrell, 2000), the lateral inhibition network is an easily
implemented alternative, constituting a simple scheme in
which current connectionist models may be given a dy-
namic aspect.
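To make the proposed output stage concrete, the sketch below passes a starting pattern of activations through a simple lateral inhibition network and returns the winning unit together with the number of iterations taken, which serves as the latency. The update rule, the parameter values, and the function name `lateral_inhibition` are illustrative assumptions and are not the equations used in the simulations reported in this article.

```python
def lateral_inhibition(start, rate=0.1, excite=1.0, inhibit=0.5,
                       threshold=0.9, max_iterations=10_000):
    """Run a winner-take-all competition; return (winner_index, iterations)."""
    acts = list(start)
    for iteration in range(1, max_iterations + 1):
        total = sum(acts)
        # Synchronous update: self-excitation minus inhibition from all other
        # units, with activations clipped to the range [0, 1].
        acts = [min(1.0, max(0.0, a + rate * (excite * a - inhibit * (total - a))))
                for a in acts]
        best = max(range(len(acts)), key=acts.__getitem__)
        if acts[best] >= threshold:
            return best, iteration
    raise RuntimeError("no unit reached threshold")

# Hypothetical starting activations supplied by some ordering mechanism.
winner_close, steps_close = lateral_inhibition([0.6, 0.5, 0.4])
winner_far, steps_far = lateral_inhibition([0.6, 0.2, 0.1])
```

In this sketch the unit with the highest starting activation wins in both cases, but the competition takes longer when the competitors start close to the winner, which is the property that lets such a stage turn activation differences into latency differences.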
On the other hand, models such as the feature model
(Nairne, 1990) and SIMPLE (Brown et al., 2004) predict
response probabilities based on the Luce–Shepard
choice rule (Luce, 1963; Shepard, 1957; see Nairne,
2002, for use of such matching rules in memory theory).
These models might also be adapted to generate latency
predictions following precedents in the categorisation
literature. Nosofsky and Palmeri (1997) have shown that
an exemplar-based categorisation model that uses the
Luce–Shepard choice rule to provide response proba-
bilities (Nosofsky, 1986) can be extended into the tem-
poral domain by using the output of the model as input
for a random walk process. Nosofsky and Palmeri
(1997) assumed that individual exemplars race (cf. Lo-
gan, 1988) to become evidence entering into a random
walk process. The random walk continues until the ev-
idence for a response passes a criterion, at which time
that response is made, and the duration of the random
walk is taken as the decision time. Similar adjustments
might be made to SIMPLE and the feature model,
taking the distance between each item and the item to be
recalled (in SIMPLE) or the number of matches between
a short-term trace and traces in a long-term store (in the
feature model) as probabilistic evidence entering into a
random walk process. Indeed, given that Usher and
McClelland (2001) argue that their lateral inhibition
model approximates continuous random walk models
(e.g., Ratcliff, 1978), the architecture presented here
might serve as a response selection tool even in models
that rely on the Luce–Shepard choice rule to determine
response probabilities.
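One way such an extension might look is sketched below: response probabilities from a ratio choice rule drive a step-by-step accumulation of evidence, and the number of steps to criterion is read out as decision time. This is a simplified accumulator variant chosen for brevity, not the exemplar-based random walk of Nosofsky and Palmeri (1997); the function names, the strengths, and the criterion are all hypothetical.

```python
import random

def luce_choice(strengths):
    """Ratio choice rule: each strength divided by the summed strength."""
    total = sum(strengths)
    if total <= 0:
        raise ValueError("strengths must sum to a positive value")
    return [s / total for s in strengths]

def accumulate_to_criterion(probabilities, criterion, rng):
    """Sample one unit of evidence per step until a response reaches criterion.

    Returns (winner_index, steps); steps stands in for decision time.
    """
    counts = [0] * len(probabilities)
    steps = 0
    while True:
        steps += 1
        sampled = rng.choices(range(len(probabilities)), weights=probabilities)[0]
        counts[sampled] += 1
        if counts[sampled] >= criterion:
            return sampled, steps

# Hypothetical match strengths for two candidate items.
probabilities = luce_choice([3.0, 1.0])
winner, steps = accumulate_to_criterion(probabilities, criterion=5,
                                        rng=random.Random(0))
```

Responses with a higher choice probability tend to reach criterion in fewer steps, so the same quantities that determine accuracy also yield a latency prediction.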
In conclusion, we suggest that any contemporary
model of serial recall has the potential for extension to
latency phenomena. We argue that these extensions are
crucial because we have shown that latency data can
readily constrain and differentiate models of serial order
memory.
Acknowledgments
We gratefully acknowledge assistance from Leo
Roberts at all stages of manuscript preparation and
during data collection for Experiments 2 and 3. We also
thank Mike Page and two anonymous reviewers for
their comments.
References
Anderson, J. A., Silverstein, J. W., Ritz, S. A., & Jones, R. S.
(1977). Distinctive features, categorical perception, and
probability learning: Some applications of a neural model.
Psychological Review, 84, 413–451.
Anderson, J. R., & Matessa, M. (1997). A production system
theory of serial memory. Psychological Review, 104, 728–
748.
Anderson, J. R., Bothell, D., Lebiere, C., & Matessa, M. (1998).
An integrated theory of list memory. Journal of Memory and
Language, 38, 341–380.
Anderson, M. C., & Neely, J. H. (1996). Interference and
inhibition in memory retrieval. In E. L. Bjork & R. A. Bjork
(Eds.), Handbook of perception and memory, Vol. 10:
Memory (pp. 237–313). San Diego: Academic Press.
Brown, G. D. A., & Hulme, C. (1995). Modeling item length
effects in memory span: No rehearsal needed? Journal of
Memory and Language, 34, 594–621.
Brown, G. D. A., Neath, I., & Chater, N. (2004). A local
distinctiveness model of scale-invariant memory and per-
ceptual identification. Manuscript submitted for publica-
tion.
Brown, G. D. A., Preece, T., & Hulme, C. (2000). Oscillator-
based memory for serial order. Psychological Review, 107,
127–181.
Burgess, N., & Hitch, G. J. (1999). Memory for serial order: A
network model of the phonological loop and its timing.
Psychological Review, 106, 551–581.
Busing, F. M. T. A., Meijer, E., & van der Leeden, R. (1994).
MLA software for multilevel analysis of data with two
levels. User's guide for version 1.0b [software manual].
Leiden, The Netherlands: Leiden University. Available:
http://www.fsw.leidenuniv.nl/www/w3_ment/medewerkers/
busing/MLA.HTM.
Cowan, N., Saults, J. S., Elliott, E. M., & Moreno, M. V.
(2002). Deconfounding serial recall. Journal of Memory and
Language, 46, 153–177.
Cowan, N., Wood, N. L., Wood, P. K., Keller, T., Nugent, L.
D., & Keller, C. V. (1998). Two separate verbal processing
rates contributing to short-term memory span. Journal of
Experimental Psychology: General, 127, 141–160.
Dosher, B. A. (1999). Item interference and time delays in
working memory: Immediate serial recall. International
Journal of Psychology, 34, 276–284.
Dosher, B. A., & Ma, J.-J. (1998). Output loss or rehearsal
loop? Output-time versus pronunciation-time limits in
immediate recall for forgetting-matched materials. Journal
of Experimental Psychology: Learning, Memory, and Cog-
nition, 24, 316–335.
Duncan, M. (1996). Recognition and serial reconstruction with
pre-cuing and post-cuing. Unpublished manuscript, Univer-
sity of Toronto.
Duncan, M., & Lewandowsky, S. (in press). The time course of
response suppression: No evidence for a gradual release
from inhibition. Memory.
Farrell, S. (2001). Similarity-sensitive encoding, redintegration,
and response suppression in serial recall. Unpublished
Doctoral thesis, University of Western Australia, Perth.
Farrell, S., & Lewandowsky, S. (2002). An endogenous
distributed model of ordering in serial recall. Psychonomic
Bulletin & Review, 9, 59–79.
Farrell, S., & Lewandowsky, S. (2003). Dissimilar items benefit
from phonological similarity in serial recall. Journal of
Experimental Psychology: Learning, Memory, and Cogni-
tion, 29, 838–849.
Frankish, C. (1985). Modality-specific grouping effects in short-
term memory. Journal of Memory and Language, 24, 200–
209.
Grossberg, S. (1976). Adaptive pattern classification and
universal recoding. Biological Cybernetics, 23, 121–
134.
Grossberg, S. (1978). A theory of human memory: Self-
organization and performance of sensory-motor codes,
maps, and plans. In R. Rosen & F. Snell (Eds.), Progress
in theoretical biology (Vol. 5, pp. 233–374). New York:
Academic Press.
Haberlandt, K., Thomas, J. G., Lawrence, H., & Krohn, T. (in
press). Transposition asymmetry in immediate serial recall.
Memory.
Healy, A. F. (1974). Separating item from order information in
short-term memory. Journal of Verbal Learning and Verbal
Behavior, 13, 644–655.
Henson, R. N. A. (1996). Short-term memory for serial order.
Unpublished doctoral dissertation, Cambridge University,
Cambridge, UK.
Henson, R. N. A. (1998). Short-term memory for serial order:
The start-end model. Cognitive Psychology, 36, 73–
137.
Henson, R. N. A. (1999). Positional information in short-term
memory: Relative or absolute? Memory & Cognition, 27,
915–927.
Henson, R. N. A., & Burgess, N. (1997). Representations of
serial order. In J. A. Bullinaria, D. W. Glasspool, & G.
Houghton (Eds.), Proceedings of the fourth neural compu-
tation and psychology workshop: Connectionist representa-
tions (pp. 283–300). London: Springer.
Henson, R. N. A., Norris, D. G., Page, M. P. A., & Baddeley,
A. D. (1996). Unchained memory: Error patterns rule out
chaining models of immediate serial recall. Quarterly
Journal of Experimental Psychology A, 49, 80–115.
Hitch, G., Burgess, N., Towse, J. N., & Culpin, V. (1996).
Temporal grouping effects in immediate recall: A working
memory analysis. The Quarterly Journal of Experimental
Psychology A, 49, 116–139.
Houghton, G. (1990). The problem of serial order: A neural
network model of sequence learning and recall. In R. Dale,
C. Mellish, & M. Zock (Eds.), Current research in natural
language generation (pp. 287–319). London: Academic
Press.
Hulme, C., Newton, P., Cowan, N., Stuart, G., & Brown, G.
(1999). Think before you speak: Pauses, memory search,
and trace redintegration processes in verbal memory span.
Journal of Experimental Psychology: Learning, Memory, and
Cognition, 25, 447–463.
Lewandowsky, S. (1999). Redintegration and response sup-
pression in serial recall: A dynamic network model. Inter-
national Journal of Psychology, 34, 434–446.
Lewandowsky, S., & Farrell, S. (2000). A redintegration
account of the effects of speech rate, lexicality, and word
frequency in immediate serial recall. Psychological Research,
63, 163–173.
Lewandowsky, S., & Murdock, B. B. (1989). Memory for serial
order. Psychological Review, 96, 25–57.
Li, S.-C., Lewandowsky, S., & DeBrunner, V. E. (1996). Using
parameter sensitivity and interdependence to predict model
scope and falsifiability. Journal of Experimental Psychology:
General, 125, 360–369.
Logan, G. D. (1988). Toward an instance theory of automa-
tization. Psychological Review, 95, 492–527.
Lorch, R. F., & Myers, J. L. (1990). Regression analyses of
repeated measures data in cognitive research. Journal of
Experimental Psychology: Learning, Memory, and Cogni-
tion, 16, 149–157.
Luce, R. D. (1963). Detection and recognition. In R. D. Luce,
R. R. Bush, & E. Galanter (Eds.), Handbook of mathemat-
ical psychology (pp. 103–189). New York: Wiley.
Luce, R. D. (1986). Response times. New York: Oxford
University Press.
Maybery, M. T., Parmentier, F. B. R., & Jones, D. M. (2002).
Grouping of list items reflected in the timing of recall:
Implications for models of serial verbal memory. Journal of
Memory and Language, 47, 360–385.
Nairne, J. S. (1990). A feature model of immediate memory.
Memory & Cognition, 18, 251–269.
Nairne, J. S. (1992). The loss of positional certainty in long-
term memory. Psychological Science, 3, 199–202.
Nairne, J. S. (2002). The myth of the encoding-retrieval match.
Memory, 10, 389–395.
Neath, I. (1999). Modelling the disruptive effects of irrelevant
speech on order information. International Journal of
Psychology, 34, 410–418.
Nelder, J. A., & Mead, R. (1965). A simplex method for
function minimization. Computer Journal, 7, 308–313.
Nosofsky, R. M. (1986). Attention, similarity, and the identi-
fication–categorization relationship. Journal of Experimen-
tal Psychology: General, 115, 39–57.
Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based
random walk model of speeded classification. Psychological
Review, 104, 266–300.
Oberauer, K. (2003). Understanding serial position curves in
short-term recognition and recall. Journal of Memory and
Language, 49, 469–483.
Page, M. P. A., & Norris, D. (1998). The primacy model: A new
model of immediate serial recall. Psychological Review, 105,
761–781.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological
Review, 85, 59–108.
Ryan, J. (1969). Grouping and short-term memory: Different
means and patterns of grouping. Quarterly Journal of
Experimental Psychology, 21, 137–147.
Schweickert, R. (1993). A multinomial processing tree model
for degradation and redintegration in immediate recall.
Memory & Cognition, 21, 168–175.
Shepard, R. N. (1957). Stimulus and response generalization: A
stochastic model relating generalization to psychological
space. Psychometrika, 22, 325–345.
Surprenant, A. M., Kelley, M. R., Farley, L. A., & Neath, I. (in
press). Fill-in and infill errors in order memory. Memory.
Thomas, J. G., Milner, H. R., & Haberlandt, K. F. (2003).
Forward and backward recall: Different response time
patterns, same retrieval order. Psychological Science, 14,
169–174.
Usher, M., & McClelland, J. L. (2001). On the time course of
perceptual choice: The leaky competing accumulator model.
Psychological Review, 108, 550–592.
Vousden, J. I., & Brown, G. D. A. (1998). To repeat or not to
repeat: The time course of response suppression in sequen-
tial behaviour. In J. A. Bullinaria, D. W. Glasspool, & G.
Houghton (Eds.), Proceedings of the fourth neural compu-
tation and psychology workshop: Connectionist representa-
tions (pp. 301–315). London: Springer-Verlag.