Extensions and applications of
stochastic accumulator models in
attention and decision making
Samuel F. Feng
A Dissertation
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
by the Program in
Applied and Computational Mathematics
Adviser: Philip J. Holmes
November 2012
© Copyright by Samuel F. Feng, 2012.
All Rights Reserved
Abstract
The research presented in this thesis is a collection of applications and extensions of
stochastic accumulator models to various areas of decision making and attention in
neuroscience.
Ch. 1 introduces the major techniques and experimental results that guide us
throughout the rest of the thesis. In particular, we introduce and define the leaky,
competing accumulator, drift diffusion, and Ornstein-Uhlenbeck models.
In Ch. 2, we adopt an Ornstein-Uhlenbeck (OU) process to fit a generalized version
of the motion dots task in which monkeys are now faced with biased rewards. We
demonstrate that monkeys shift their behaviors in a systematic way, and that they
do so in a near optimal manner. We also fit the OU model to neural data and find
that the OU model behaves almost like a pure drift diffusion process. This gives further
evidence that the DDM is a good model for both the behavior and neural activity
related to perceptual choice.
In Ch. 3, we construct a multi-area model for a covert search task. We discover
some new trends in the data and systematically construct a model which explains
the key findings in the data. Our model proposes that the lateral intraparietal area
(LIP) plays an attentional role in this covert search task, and suggests that the two
monkeys used in this study adopted different strategies for performing the task.
In Ch. 4, we extend the model of noise in the popular drift diffusion model (DDM)
to a more general Lévy process. The jumps introduced into the noise increments
dramatically affect the reaction times predicted by the DDM, and they allow the
pure DDM to reproduce fast error trials given unbiased initial data, a feature which
other models can reproduce only with additional parameters. The model is fit to human
subject data and is shown to outperform the extended DDM in data containing fast
error reaction times.
In Ch. 5, we construct a model for studying capacity constraints on cognitive
control using the DDM as a generalized model for a task. After studying various
aspects of the constructed model, large scale simulations demonstrate that a severe
capacity constraint does indeed arise out of the need for optimizing overall rewards.
The thesis concludes with some summarizing remarks in Ch. 6.
Acknowledgements
First I would like to thank my thesis adviser Philip Holmes for his intellectual and
personal guidance over these past years. Your candor, integrity, and approach to life
have inspired me more than you realize. I am blessed to have had the opportunity to
be your student.
It has also been a pleasure to work with my collaborators Alan Rorie, William
Newsome, Sam Gershman, and Jonathan Cohen. I am particularly grateful to Alan
Rorie and William Newsome – when we collaborated several years ago I had no idea
how fortunate I was to work with you.
In a category by himself is Michael Schwemmer, both as a friend and a colleague.
Fitting that data was a pain! Your drive and work ethic are infectious, and I will
miss your jokes. I wish you blessings as you move on in your career.
I would also like to thank Carlos Brody and Eric Shea-Brown for taking time to
read this thesis.
I am most thankful for the various friends and family who have made Princeton
my new home. I thank my parents Pen and Janet Feng for their unending love and
support. I thank Philip Eckhoff, Adam Hincks, Arie Israel, Richard Jordan, Jun
Kitagawa, and Ross Willford for their friendship as roommates over the years. I
thank Westerly Road Church for their Christ-centered prayers and encouragement. I
thank my new wife Siyi for being at my side during the highs and lows of graduate
study, and for focusing me on the more important things of life. You remind me of
Christ’s sacrificial love more than anyone else I know.
And finally, all thanks be to God. You are good, your love endures forever.
Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
1 Introduction to modeling evidence accumulation in two alternative
forced choice (2AFC) tasks 1
1.1 Experimental background of 2AFC perceptual tasks . . . . . . . . . . 2
1.1.1 A crash course in visual processing . . . . . . . . . . . . . . . 3
1.1.2 LIP’s role in perception and attention . . . . . . . . . . . . . . 6
1.2 Building up: constructing drift diffusion (DD) processes via statistical
inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2.1 A simple 2AFC statistical inference task . . . . . . . . . . . . 12
1.2.2 The Neyman-Pearson lemma and the sequential probability ra-
tio test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.2.3 The continuum limit of the SPRT . . . . . . . . . . . . . . . . 20
1.2.4 Relevant experimental results concerning drift diffusion pro-
cesses and optimality . . . . . . . . . . . . . . . . . . . . . . . 22
1.3 Working down: Reducing a biologically plausible model to an OU process 24
1.3.1 From spiking neurons to the Leaky Competing Accumulator
model (LCA) . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.2 Reduction of LCA to an Ornstein-Uhlenbeck (OU) process . . 27
1.3.3 Working with the OU and DDM models . . . . . . . . . . . . 29
1.4 Thesis overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.5 Appendix: Two popular models for decision making . . . . . . . . . . 31
1.5.1 The Extended DDM of Ratcliff . . . . . . . . . . . . . . . . . 32
1.5.2 The Linear Ballistic Accumulator Model . . . . . . . . . . . . 33
2 Can monkeys choose optimally when faced with noisy stimuli and
unequal rewards? 35
2.1 Unequal rewards in the motion dots task . . . . . . . . . . . . . . . . 36
2.2 Predicting psychometric functions (PMFs) with an accumulator model 38
2.2.1 Two approaches for modeling biased rewards: shifted initial
conditions vs. persistent reward signals . . . . . . . . . . . . . 41
2.2.2 Fixed time interrogation reaction times cannot distinguish the
form for integrated drift and noise . . . . . . . . . . . . . . . . 43
2.2.3 Examples of psychometric functions . . . . . . . . . . . . . . . 45
2.3 Optimality analysis of a two parameter psychometric function . . . . 47
2.3.1 A motivating example . . . . . . . . . . . . . . . . . . . . . . 47
2.3.2 Blocks with mixed stimuli: a continuum of coherences . . . . . 49
2.3.3 Blocks with mixed stimuli: finite sets of coherences . . . . . . 51
2.4 Applying the model to experimental data . . . . . . . . . . . . . . . . 53
2.4.1 PMF fits to data averaged over multiple sessions . . . . . . . . 53
2.4.2 How close are the animals, on average, to optimal performance? 57
2.4.3 Variability of behaviors in individual sessions . . . . . . . . . . 61
2.5 Fitting the OU process to the LIP neural data . . . . . . . . . . . . . 63
2.6 Discussion of results and future directions . . . . . . . . . . . . . . . 67
3 Modeling a covert visual search task with a multi-area stochastic
model 71
3.1 Data analysis of the covert search task . . . . . . . . . . . . . . . . . 72
3.1.1 LIP encodes target location, limb preference, set-size effect, and
cue-hemifield congruence . . . . . . . . . . . . . . . . . . . . . 75
3.1.2 Accuracy vs reaction time . . . . . . . . . . . . . . . . . . . . 81
3.1.3 Searching for a target . . . . . . . . . . . . . . . . . . . . . . . 82
3.2 A multi-area model for the covert search task . . . . . . . . . . . . . 85
3.3 Fitting methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.4 Model fits and results . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.5 Discussion and concluding remarks . . . . . . . . . . . . . . . . . . . 100
3.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
3.6.1 Search analysis computations . . . . . . . . . . . . . . . . . . 102
3.6.2 Model equations . . . . . . . . . . . . . . . . . . . . . . . . . 103
4 Changing the noise in the DDM: jumpy noise can be good 109
4.1 Lévy processes as models for 2AFC reaction times . . . . . . . . . . . 111
4.2 Fitting the Simen et al. behavioral data . . . . . . . . . . . . . . . . 115
4.3 Closing discussion on the jump DDM . . . . . . . . . . . . . . . . . . 120
5 Multitasking vs. Multiplexing: Uncovering capacity constraints on
cognitive control 122
5.1 Stroop as a model for cognitive control . . . . . . . . . . . . . . . . . 124
5.2 Defining many parallel tasks with drift diffusion processes . . . . . . . 127
5.3 Multitasking model equations and key simplifications . . . . . . . . . 128
5.4 Control as reward maximization . . . . . . . . . . . . . . . . . . . . . 133
5.4.1 Free response: maximizing reward rate and optimizing thresholds 134
5.4.2 Interrogation: reward rates based solely on accuracy . . . . . . 135
5.4.3 Scaling the overall reward rates . . . . . . . . . . . . . . . . . 136
5.5 Methods for studying capacity constraints . . . . . . . . . . . . . . . 137
5.5.1 Input values . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.2 Incongruency . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.3 Network connectivity and fan out . . . . . . . . . . . . . . . . 139
5.5.4 Simulating capacity constraints on cognitive control . . . . . . 141
5.6 Model behavior in simple cases . . . . . . . . . . . . . . . . . . . . . 142
5.6.1 Two inputs, one output . . . . . . . . . . . . . . . . . . . . . 143
5.6.2 Two inputs, two outputs . . . . . . . . . . . . . . . . . . . . . 143
5.6.3 Maximizing overall drift rate . . . . . . . . . . . . . . . . . . . 147
5.7 Model simulation results . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.7.1 Full model simulation, 10 pathways . . . . . . . . . . . . . . . 152
5.7.2 Choosing values for peripheral parameters . . . . . . . . . . . 153
5.7.3 Capacity constraints for larger networks . . . . . . . . . . . . 157
5.8 Key conclusions and experimental remarks . . . . . . . . . . . . . . . 159
5.9 Appendix: implicit threshold optimization . . . . . . . . . . . . . . . 162
6 Closing Remarks 164
7 Bibliography 169
1 Introduction to modeling evidence
accumulation in two alternative forced
choice (2AFC) tasks
The common thread throughout this entire thesis is the use of a simple stochastic
model for decision making called the drift diffusion model (DDM). This model has
grown in popularity over the past several decades, and dozens of scientific papers
have been published using it to model and explain various phenomena in the world
of perceptual decision making. The contributions of this thesis are to generalize and
apply the DDM to new situations, and to show that using the DDM (and closely
related accumulator models) is still a valuable technique in understanding simple
decision processes. The DDM provides the right amount of complexity to reproduce
desired behavioral properties and illuminate neural mechanisms while at the same
time remaining tractable enough for mathematical analysis and efficient computation.
The overarching message of the research presented in chapters 2-5 is simple: models
based on the drift diffusion process are useful tools for computational neuroscientists
interested in modeling perceptual decisions, and their potential is still not fully realized.
Each chapter of this thesis carries the DDM (or a close relative) in a new direction by
either generalizing its dynamics or using it as a critical element of a modeling effort
to describe some neural or psychological phenomenon.
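To fix ideas, a single trial of the pure DDM can be simulated in a few lines. The sketch below is not code from this thesis; it uses an Euler-Maruyama discretization with illustrative parameter values (drift, noise strength, and symmetric decision thresholds), all of which are assumptions chosen for demonstration.

```python
import numpy as np

def simulate_ddm_trial(drift=0.2, sigma=1.0, threshold=1.0,
                       dt=1e-3, max_t=10.0, rng=None):
    """Simulate one pure drift diffusion trial via Euler-Maruyama.

    The accumulator x integrates a constant drift plus Gaussian noise
    until it crosses +threshold (choice +1) or -threshold (choice -1).
    Returns (choice, reaction_time); choice 0 means no decision by max_t.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, t, sqrt_dt = 0.0, 0.0, np.sqrt(dt)
    while t < max_t:
        x += drift * dt + sigma * sqrt_dt * rng.standard_normal()
        t += dt
        if x >= threshold:
            return +1, t
        if x <= -threshold:
            return -1, t
    return 0, t

# With positive drift, most trials should terminate at the upper threshold.
choices = [simulate_ddm_trial(rng=np.random.default_rng(s))[0]
           for s in range(200)]
accuracy = sum(c == +1 for c in choices) / len(choices)
```

For these parameter values, standard first-passage results for the DDM predict an accuracy of about 1/(1 + e^{-2az/σ²}) ≈ 0.60, so the empirical fraction of +1 choices should hover near that value.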
The main purpose of this chapter is to survey the mathematical and experimental
background which has contributed to the success of drift diffusion as a stochastic
model for perceptual decisions. We have been careful to cite articles and resources
whenever appropriate – in some sense, this introduction represents the type of docu-
ment that might aid a new student beginning research in modeling perceptual decision
making.
In §1.1 we recap the body of experimental work which leads to the experimental
data analyzed in chapters 2, 3, and 4. In §1.2 we present a systematic construction of
the drift diffusion model starting from a basic statistical inference task. This section
is written in a more pedagogical manner, although the reader is referred to outside
sources for proofs of the longer results. In §1.3 we give a second account of the DDM
by reviewing a series of computational results over the past decade which demonstrate
how biologically-based models of spiking neurons can be reduced to the DDM. Finally
in §1.4 we give a more detailed description of the rest of this thesis. In the appendix
after this chapter we review two other popular models for decision making.
1.1 Experimental background of 2AFC perceptual
tasks
In this section we present a basic overview of the structures of the mammalian brain
that are implicated in decisions, actions, and choice. The experimental background
and history presented here should communicate a feel for the major lines of thought
that comprise the current understanding of our visual system and how our brains make
simple perceptual decisions. Such an exposition is included not only for completeness,
but also in order to properly frame the modeling efforts which form the backbone
of this thesis. We have taken particular care in each chapter to connect our results
with experimental data, and all models have been methodically constructed with the
experimental literature in mind. Our hope is that this survey paints a broad picture
of this literature and the various experimental phenomena that motivate the rest of
this thesis.
We begin with a brief overview of the visual processing system in §1.1.1. The
contents of §1.1.1 can be found in any good neuroscience textbook (e.g., [KSJ00,
BCP06]) or in the expository sections of several papers, including [Sch01, MC01,
DD95]. A more detailed overview of the first several decades of results concerning
visual processing in monkeys can be found in [MN87]. §1.1.2 then recounts the more
specific history of the lateral intraparietal area (LIP) as an area involved in perceptual
decision making. We trace two main lines of experiments: LIP as an area which
correlates with planned eye movements and LIP as an area which carries attentional
signals.
1.1.1 A crash course in visual processing
Some of the most complex tasks performed by humans involve decisions of varying
timescales and complexity. Not surprisingly, neural correlates of deciding, choosing,
and acting occur in numerous areas of the mammalian brain. We focus mainly on
decisions made on the basis of visual evidence – we call these perceptual decisions.
Vision begins at the photoreceptor cells in the retina, and the majority of the con-
sequent signals travel along the optic nerve to the lateral geniculate nucleus (LGN)
in the thalamus. The optic nerve also carries signals from the retinal ganglion cells
to the suprachiasmatic nucleus and pretectal nucleus, areas which are involved with
sleep and reflexive eye movements, respectively. The LGN serves as a sensory relay
station by passing the visual stimulus to the primary visual cortex (V1), which is
located at the back of the brain in the occipital lobe1. The lateral geniculate nucleus
also serves as a processing station for input from several other sections of the cortex.
1Each hemisphere of the cerebral cortex is divided into four major lobes (frontal, parietal, temporal, occipital) by the folds and bumps which are characteristic of the surface of the brain. See Fig. 1.1.
[Figure 1.1: schematic side view of the brain with labeled areas V1/V2, V4, MT/MST, IT, PPC, PFC, and FEF.]
Figure 1.1: Figure reproduced and adapted from [Gra18]. The four major lobes of the brain are color coded, with the frontal lobe in blue, the parietal lobe in yellow, the temporal lobe in green, and the occipital lobe in pink. Signals from the retina travel, via the lateral geniculate nucleus, into area V1 of the visual cortex, and then travel along one of two streams. Along the ventral stream, signals travel from V1 to V2 and V4 into the inferior temporal cortex (IT). The dorsal stream carries signals from V1 to V2 into area MT and into the posterior parietal cortex (PPC). Both the ventral and dorsal streams contain projections to and from the prefrontal cortex (PFC). Also note the frontal eye field (FEF) which is a region of the anterior part of the PFC.
Area V1, within the visual cortex, is generally considered the first stage of visual
processing2, consisting of neurons with small receptive fields which form a precise
topographic covering of the entire field of view. The neurons in V1 may be tuned
to specific visual elements such as orientation, stereoscopic depth, and color. Out-
puts from the primary visual cortex project to secondary and tertiary areas which
themselves project to other visual areas in the parietal and temporal lobes.
2Actually the retina itself has been shown to perform a considerable amount of processing and adaptation to visual stimuli. For example, ambient light levels may vary several orders of magnitude, but the retinal ganglion cells have been shown to adapt to both image contrast (the range of light intensities) and spatial correlations even when mean intensity is fixed [SBW+97]. Retinal ganglion cells have also been shown to adapt to moving stimuli and anticipate trajectories of moving objects [BBJM99]. See [GSBH09, SSSB11] for some more recent results and modeling work on processing in retinal ganglion cells.
From here, visual processing is organized into two main streams, called the dorsal
stream and ventral stream. In the ventral stream, signals pass from V1 through visual
areas V2 and V4 into the inferior temporal (IT) cortex. Neurons in the posterior IT
are tuned for stimulus features such as color or shape, whereas anterior IT neurons
are tuned for complex features like faces. In the dorsal stream, signals pass from V1
through the middle temporal (MT) area into the posterior parietal cortex. Neurons in
area MT respond to stimuli moving in specific directions, and lesion studies support
the idea that the signals carried by MT are important in planning eye movements
in perceptual tasks ([BB05] and see §1.1.2 below). Neurons in posterior parietal
cortex modulate visual responses related to the orientation of stimuli, and play an
important role in producing planned movements. The posterior parietal cortex has
also been shown to contain neurons which correlate with saccadic and limb movements
[DHP12]. Much of the output from the posterior parietal cortex then projects to the
frontal motor cortex, which is indicative of its role in response planning.
The posterior parietal cortex will appear in several places in this thesis. The
experiments studied in chapters 2 and 3 examine the lateral intraparietal area (LIP),
which is located within the intraparietal sulcus within the posterior parietal cortex,
which itself lies along the dorsal stream [RGMN10, OSBG06]. LIP’s role in attention
and planned saccadic eye movements is an area of active research, as we will see in
§1.1.2 below.
The two streams at this point converge by passing to the prefrontal cortex (PFC),
which has been linked to complex cognitive elements such as personality and task
representations [MC01]. We will study these task representations in Ch. 5, and give
an account for how overlap between them may explain certain limitations on cognitive
control capacity. One particular structure to note within the PFC is the frontal eye
field (FEF), which lies at the top of the PFC. The FEF is intimately linked with the
location of salient stimuli. Indeed, studies have shown that electrical stimulation of
the FEF evokes saccades, and that signals in FEF correlate strongly with saccades
to visual targets and may reflect the outcome of an automatic visual selection pro-
cess [Sch04, Sch02]. Numerous studies have also linked FEF to covert attention and
visual salience3 [KMK97, TB05, MTT08], although there is some contrary evidence
which shows that certain FEF neurons do not respond when monkeys perform tasks
that require attention with no eye movement [GB81]. These are still areas of active
research.
This description of the visual system as two cortical pathways was first presented
by Mishkin et al [MUM83] and is almost certainly a gross oversimplification of the
actual neural architecture behind how mammals formulate decisions and produce
choices based on visual information. The illustration is, however, very useful in
demonstrating that the areas with which we are concerned are only components of
the entire apparatus at hand. Our focus primarily lies along the dorsal stream which
contains the regions specifically linked to the formation of perceptual decisions and
attention. Among these regions, we are particularly interested in area LIP.
1.1.2 LIP’s role in perception and attention
In this subsection we trace the lineage of results which led to the experiments analyzed
in Ch. 2 and Ch. 3 [RGMN10, OSBG06]. This section is entirely focused on 2AFC
tasks, most of which are also aimed towards uncovering LIP’s role in decision making
and attentional tasks. We will first discuss a well known random dots motion task,
and the idea that LIP is responsible for the planning of eye movements. Subsequently
we discuss a more recent series of experiments focused on LIP’s role in attention.
3Overt attention occurs when a sensory system, such as the eye, orients itself to a target (e.g. saccade to a visual target). Covert attention occurs when we mentally focus on one of several possible sensory stimuli.
LIP and the planning of eye movements in motion dots tasks
In this section, we hope to communicate a feel for why the random dots motion
stimulus is so useful for perceptual decision tasks, and how the study of intended
saccades (eye movements) has become a primary area of study in visual perception
and decision making. A good but older review of experiments implicating LIP as a
center for processing eye movements can be found in [ABM92]. It all begins with
the posterior parietal cortex, which contains LIP and has long been connected to the
processing of eye movements. In the early 1900s, [Bá09] showed that bilateral damage
to the posterior parietal cortex resulted in human subjects being unable to fixate their
eyes and unable to reach/grab objects. Later on in the 60s and 70s it was demon-
strated that electrical stimulation of the posterior parietal cortex produced saccades
[FC55, Ben64], and that lesions resulted in deficits in saccades [Sun79, LMTY77], al-
though the effects of lesions were not properly quantified until the 80s [LM89, KG88].
In the mid-1980s, Andersen et al. [AAC85] first identified and named area LIP after retrograde
tracers in the frontal eye field and dorsolateral prefrontal cortex revealed strong
connections predominantly within the lateral bank of the intraparietal cortex. This
finding built on key observations from Mountcastle and Lynch
[MLG+75, LMTY77], who first reported cells selective for saccades in the inferior
parietal lobule. Following this, several electrophysiological experiments showed that
most LIP cells were related to eye movements, with many responding before saccades
[BBF+91, ABB+90, GA88]. These experiments established LIP as a key region where
we might find signals regarding response preparation, intention, and attention. LIP’s
established position between the input sensory processing regions and the output mo-
tor execution (i.e. saccade) regions is a primary reason why it is so heavily studied
today.
The random dots motion task used in the experiment analyzed in Ch. 2 also takes
advantage of a series of results related to the visual processing of motion stimuli
along the dorsal stream. It has been established that area MT and the medial superior
temporal area (MST) contain the predominant signals relevant to the cortical
analysis of visual motion. These regions lie along the dorsal stream, and within the
dorsal stream there seems to be a more specialized pathway of areas that are used
to discriminate visual motion information. The key feature shared by areas along
this specialized pathway is a high proportion of neurons selective to motion direction,
which we will henceforth call directionally sensitive for brevity. This pathway begins
in a sub-region of V1 called layer 4B, where experiments in the 70s and 80s discovered
a higher percentage of directionally sensitive cells relative to other sub-regions of V1
[Dow74, BF84, LH84, Mic85]. This region projects to area MT where over 80% of the
neurons are directionally selective [DZ71, MVE83, Alb84]. Another region considered
to lie along the dorsal stream, area V3, has also been found to consist of about 40%
directionally selective neuron[FVE87], and that area MT also receives some indirect
input from V1 through V2 and V3 [Mv83]. From MT there are many neuronal pro-
jections to area MST, which also contains a high percentage of directionally sensitive
neurons [VEMB81, DU86, THS+86].
In [NP88], Newsome and Pare established the connection between a dynamic
random dot display and the relevant signals in MT which represent the dynamic
perception of motion. Over the next few years, [BSNM92, BNS+96, SN96, SMBN92,
CA99, CN95] established that MT and MST do indeed contain the relevant signals for
formulating a decision based on visual information collected from the random dots
motion stimulus used in experiments like that of Rorie et al [RGMN10] which we
study in Ch. 2. These neurons are tuned so that if one computes averaged firing rates
over many trials, this averaged firing rate smoothly varies according to the amount
of energy in a band of velocities. This enabled experimenters to control the level of
difficulty of the perceptual tasks, and established the neural correlate which represents
this change in the amount of perceived motion. Still, there is ongoing research aimed
towards uncovering the complex and intertwined interactions among V1, MT, and
MST, and how signals from these areas are combined to form inputs to LIP, FEF and
other downstream visual areas. In addition to anatomical and electrophysiological
studies carrying on the work cited above [BW10, PHP+11], some studies like [PB00]
use cleverly constructed visual illusions to probe relationships among these areas.
The 2AFC random dots motion task was first used in its current form in [SBNM96,
SN01], where Shadlen et al. recorded from LIP during a 2AFC motion dots discrimination
task. Here monkeys had to indicate the direction of motion of coherently
moving dots embedded within a field of randomly moving dots via a saccade to a
fixed target along the direction of motion. LIP neurons were first identified using
the memory saccade task [HW83], in which monkeys are first required to fixate on a
certain point for about 100 msec. Then a second stimulus flashes pseudo-randomly
for about 300 msec at one of several locations in the monkey’s peripheral vision, while
the monkey maintains fixation. After a short delay period (∼400–700 msec), the
monkey is required to saccade from the fixation point to the remembered location of the
flashed target, and the cell is identified with that particular receptive field if it
demonstrates persistent activity during the delay period. This procedure is commonly used
to identify certain cells to be studied in LIP.
Shadlen et al. identified cells using this memory saccade task, and oriented the
visual stimulus of their task so that one of the saccade targets was situated within the
recorded cell’s receptive field. They discovered that these LIP neurons exhibited firing
rates which predicted the saccadic eye movement that the monkey would make at the
end of the trial. We display averaged LIP recordings from Shadlen et al. [SN01] in
Fig. 1.2 as an illustration of the remarkable phenomenon. The experiment performed
by Rorie et al. which we have modeled and analyzed in Ch. 2 is an extension to the
experiment of Shadlen et al [SN01].
Figure 1.2: The average firing rates from 104 LIP neurons during the direction discrimination task of Shadlen et al. [SN01]. Solid and dashed curves are from trials in which the monkey judged direction toward and away from the receptive field, respectively. Only correct trials are displayed. The various colors represent different strengths of the random-dot motion, and we can see that the time course and magnitude of response are affected by stimulus strength. Figure taken from Fig. 8 of [SN01].
LIP and the control of attention
The results in the previous subsection have established that LIP plays a role in the
execution of saccades during perceptual decision tasks, and in particular the random
dots motion task. However, the exact nature of LIP’s contribution to the processing
of visual information is still not fully understood. In the late 1970s it was discovered
that parietal neurons can also exhibit activity during fixation when a visual stimulus
appeared in a neuron’s receptive field, which indicated that these neurons may have a
role beyond the planning of saccades [RGS78]. Furthermore, when the visual stimulus
was made behaviorally relevant, these visual neurons exhibited even further elevated
activity, suggesting an attentional role for parietal cortex [RGS78, GB81].
Since then, several studies have demonstrated neural correlates for attention in
LIP [CG99, CDG96, RBK95, GG99, GKG98, PG00, BG03, GBP+02, Mes99]. In
particular, Gottlieb et al demonstrated that LIP cells can exhibit little or no increased
response to stimuli in their receptive fields unless the stimuli are behaviorally relevant
[GKG98]. Gottlieb and Goldberg also discovered that in an anti-saccade task, where
monkeys were required to saccade away from a visual target, the majority of LIP
neurons encoded visual stimulus location instead of the target of the intended saccade
[GG99]. It has also been established that chemical inactivation of LIP produces
deficits in both saccade target selection and covert attention [WOD02, WOD04], and
that a similar chemical inactivation can also produce deficits in attention [BG09], but
without entirely compromising task performance.
These results implicating LIP in attention and visual salience led Oristaglio et al.
to construct a task which involved a covert search for a visual target and a nonsaccadic
motor response [OSBG06]. Here both covert attention and a motor response were required,
and furthermore the response depended upon successfully detecting one
of two possible targets in an array of distractors, while maintaining fixation. The
recorded signals from LIP demonstrate interesting and puzzling activity which has
not been fully understood. In Ch. 3 we attempt to construct a model which fits this
data (along with that from [BOSG08]) in order to determine if this relatively complex
experiment helps us better understand LIP’s role in attention and perception.
Much of the work presented earlier in this section suggests that parietal neurons
in area LIP encode saccade motor decisions [SN01, NP88, SBNM96]. There is also
evidence that LIP carries signals of attention, or perceptual selection, which are
independent of the metrics, modality and reward of the required response [GB10].
In fact, Gottlieb et al. suggest that the attentional responses seen in LIP represent
a distinct type of decision that assigns value to sources of information rather than
specific actions.
This thesis is not primarily concerned with stating the precise role of LIP as
either a center of attention or saccade motor decisions. Rather, we are interested
in how accumulator models may shed light on this discussion, as well as in other
areas of cognitive psychology and neuroscience. We demonstrate in several cases
that the computational and mathematical tools surrounding drift diffusion processes
(and closely related OU processes) can and should be used in understanding neural
activity and animal/human behaviors in these decision tasks. Next, we will begin
our presentation of these tools by deriving the drift diffusion process from a simple
statistical inference task.
1.2 Building up: constructing drift diffusion (DD)
processes via statistical inference
Probably the most common way to understand the diffusion equation (Eq. 1.19
below) is encountered in undergraduate physics and chemistry courses, where one
studies the molecular diffusion of a chemical compound or gas. There one imagines
particles in an animated and irregular state of motion, caused by frequent collisions
with other particles and/or with the surrounding medium, normally a fluid. Here
we consider a less physical basis, and instead derive the diffusion equation by first
trying to solve a very simple statistical inference task. Following this approach, we
build an alternative understanding of diffusion processes based upon the probabilistic
accumulation of evidence for two available choices. Much of the work presented in
this section and the next derives from the presentations in [BBM+06] and [GS07],
although a few of the mathematical details have been reworked.
1.2.1 A simple 2AFC statistical inference task
The most natural language with which to precisely describe decision making is that
of probability and statistics. This thesis only explicitly deals with decision processes
which solve two alternative forced choice tasks, or 2AFC for short[4]. These tasks
require a choice between two hypotheses h1 and h2, each of which represents a property
of the world that may be true or false (e.g. the array of moving dots mentioned in
§1.1.2 is moving left or right). 2AFC tasks are also considered forced in that the
subject cannot elect not to respond (this is typically enforced by penalizing the subject
more severely for a non-response than for an incorrect response,
e.g. inflicting a time out in addition to lack of reward). In performing these tasks, we
also consider two distinct paradigms for how the subject must collect evidence and
respond: the free response and interrogation paradigms (in this context we will at
times use the words “paradigm” and “protocol” interchangeably). In the interrogation
paradigm, subjects are required to respond at a fixed deadline. This implies that
subjects have a fixed time epoch during which they may accumulate evidence, and that
their decisions must be made based on whatever evidence is collected during this
fixed amount of time. In the free response paradigm, subjects are allowed to respond
whenever they please. This implies that subjects need to determine the proper speed-
accuracy trade-off, because faster response times will increase the potential rate of
rewards but also typically decrease accuracy.
Before any evidence accumulation for either h1 or h2 takes place, however, one
must first consider the prior probabilities P(h1),P(h2). These represent the probabil-
ities that either hypothesis is true before obtaining any evidence. For the sensory-
motor tasks described above (e.g. moving dots task), the prior typically represents
the predicted probability of seeing a particular stimulus on the upcoming trial (e.g.
probability that subject is presented with left-moving dots), which may be explicitly
communicated to the subject or inferred from the relative frequency of a particular
hypothesis on previous trials. In Ch. 2, we analyze a case where the amount of reward
for a particular hypothesis is modulated trial-by-trial; indeed we find that manipulating
rewards in this way affects monkey behavior in a manner similar to shifting
prior probabilities. Also, in Ch. 4 we fit some behavioral data collected by Simen et
al. [SCB+09] where these prior probabilities were manipulated and human subjects
were required to take this into account.

[4] Others often write NAFC for the N-alternative version.
The subject, given his priors, now collects evidence from a stimulus and combines
all of his information (priors, evidence, expected rewards) into a quantity
which we call the decision variable. Using this decision variable, the subject formulates
a decision rule which determines when and how to respond.
Throughout the past several decades, the most successful conceptual framework
with which one may study 2AFC tasks is signal detection theory [GS66]. In its
simplest form, the subject obtains one unit of noisy evidence e which is extracted
from an experimentally controlled stimulus. That is, the experimenter presents a
stimulus corresponding to either h1 or h2, and the subject obtains some noisy evidence
e by observing the experimental event. The subject now needs to determine which of
two conditional probability distributions P(e|h1) or P(e|h2) gave rise to the observed
evidence e.
In the case of two alternatives studied here, a simple choice for the decision variable
is the ratio between the two relevant likelihoods P(e|h1) and P(e|h2):

    l12(e) = P(e|h1) / P(e|h2) .                    (1.1)
The quantity l12 is often referred to as the likelihood ratio. The subject now bases his
decision on the likelihood ratio l12 by choosing some threshold β, and elects to choose
h1 if l12 ≥ β and h2 if l12 < β (if l12 = β, we say that the subject arbitrarily chooses
h1).
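As a concrete illustration (our own minimal sketch, with Gaussian evidence distributions of means ±1 and unit variance chosen purely for the example, not values from this thesis), the likelihood-ratio rule can be written as:

```python
import math

def likelihood_ratio(e, mu1=1.0, mu2=-1.0, sigma=1.0):
    """l12(e) = P(e|h1) / P(e|h2) for two Gaussian evidence distributions."""
    p1 = math.exp(-((e - mu1) ** 2) / (2 * sigma ** 2))
    p2 = math.exp(-((e - mu2) ** 2) / (2 * sigma ** 2))
    return p1 / p2  # the shared normalizing constants cancel in the ratio

def decide(e, beta=1.0):
    """Accept h1 if l12(e) >= beta, otherwise h2 (ties go to h1)."""
    return "h1" if likelihood_ratio(e) >= beta else "h2"
```

Evidence closer to the mean under h1 drives the ratio above β and yields h1; raising β makes the rule more conservative about accepting h1.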
The critical question is: how do we choose β? Suppose we wish to maximize the
overall accuracy and that P(h1) = P(h2). The overall accuracy is written as
P(H1|h1) + P(H2|h2) , (1.2)
where H1 (H2) is the event that the subject chooses hypothesis h1 (h2). Since subjects
are forced to respond, P(H1|h2)+P(H2|h2) = 1 (also P(H1|h1)+P(H2|h1) = 1), which
means that maximizing Eq. 1.2 is equivalent to maximizing

    P(H1|h1) − P(H1|h2) .                    (1.3)

If we designate a region Λ as all of the evidences (point events) which lead to the
acceptance of h1, then the probability that h1 is accepted when h1 is true can be
written as

    P(H1|h1) = Σ_{e∈Λ} P(e|h1) ,                    (1.4)

and the probability of an incorrect acceptance of hypothesis h1 is

    P(H1|h2) = Σ_{e∈Λ} P(e|h2) .                    (1.5)

The task of maximizing Eq. 1.2 is now the task of selecting a set of points Λ so that
we maximize

    Σ_{e∈Λ} [P(e|h1) − P(e|h2)] .                    (1.6)
Now consider some sample of evidence ē. The point event ē should be included in
Λ if and only if its inclusion contributes positively to this sum of Eq. 1.6, i.e.
P(ē|h1)− P(ē|h2) ≥ 0. (1.7)
This implies we want to include in Λ only the observations where this inequality holds,
which means that a point event ē should be included in Λ if and only if we have

    P(ē|h1) / P(ē|h2) ≥ 1 .                    (1.8)

This defines our acceptance region Λ for hypothesis h1 as all sample evidences e for
which the likelihood ratio l12(e) ≥ 1, which means β should equal 1. To summarize,
in order to maximize the overall accuracy Eq. 1.2 given a sample of noisy evidence
e, we should choose hypothesis h1 if and only if l12(e) ≥ β, where β = 1.
The above argument easily generalizes to the case where one wishes to maximize
the weighted accuracy

    P(H1|h1) + a P(H2|h2) .                    (1.9)

Again we can rewrite this as trying to maximize P(H1|h1) − a P(H1|h2) and observe
again that we want to select a region Λ so that Σ_{e∈Λ} [P(e|h1) − a P(e|h2)] is as large
as possible. The only point events which contribute positively to this sum are those
events ē for which we have

    P(ē|h1) − a P(ē|h2) ≥ 0,  or  P(ē|h1) / P(ē|h2) ≥ a .                    (1.10)

This means the proper acceptance region for the hypothesis h1 is all of the events
whose likelihood ratios equal or exceed a, which shows that in order to maximize the
weighted accuracy Eq. 1.9 we simply set β = a. Knowing this, we will now write the
weighted accuracy as

    P(H1|h1) + β P(H2|h2) ,                    (1.11)

since maximizing this form immediately tells us the value of β that should be set.
This result actually shows us how to choose β for many different useful cases,
most notably the case where one wishes to maximize expected value. Suppose we are
given costs and rewards for all four possible trial outcomes:

    r11 : amount of reward for correctly choosing h1,
    r21 : amount of penalty for incorrectly choosing h1,
    r22 : amount of reward for correctly choosing h2,
    r12 : amount of penalty for incorrectly choosing h2.

The expected reward is

    E(R) = r11 P(h1) P(H1|h1) + r22 P(h2) P(H2|h2)
           − r21 P(h2) P(H1|h2) − r12 P(h1) P(H2|h1) .                    (1.12)

Note that there are no assumptions on the priors P(h1), P(h2). We can rework Eq.
1.12 into the form of Eq. 1.9:

    E(R) = r11 P(h1) (1 − P(H2|h1)) + r22 P(h2) P(H2|h2)
           − r21 P(h2) (1 − P(H2|h2)) − r12 P(h1) P(H2|h1)
         ≃ (r22 + r21) P(h2) P(H2|h2) − (r11 + r12) P(h1) P(H2|h1)
         ≃ (r22 + r21) P(h2) P(H2|h2) + (r11 + r12) P(h1) P(H1|h1)
         ≃ [(r22 + r21) P(h2) / ((r11 + r12) P(h1))] P(H2|h2) + P(H1|h1) ,          (1.13)
where ≃ indicates equivalence up to a constant factor (i.e. equivalent maximization
problems). We see that in order to find the acceptance region for hypothesis h1
which maximizes the expected reward, we need to choose hypothesis h1 whenever the
likelihood ratio l12(e) exceeds

    β = (r22 + r21) P(h2) / ((r11 + r12) P(h1)) .                    (1.14)

In particular, when r22 + r21 = r11 + r12, this reduces to the case where we wish to
maximize overall accuracy, which means we set

    β = P(h2) / P(h1) ;                    (1.15)

so, if hypothesis h1 is more likely, it takes a smaller value of l12 to decide upon choosing
h1. Finally, in the case where both hypotheses are equally likely we recover β = 1 as
demonstrated above in Eq. 1.8.
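For a hedged numerical illustration, Eq. 1.14 can be evaluated directly; the reward and prior values below are arbitrary placeholders, not values from any experiment in this thesis:

```python
def optimal_beta(r11, r12, r21, r22, p1, p2):
    """Threshold on the likelihood ratio that maximizes expected reward (Eq. 1.14)."""
    return ((r22 + r21) * p2) / ((r11 + r12) * p1)

# Equal reward sums and equal priors recover beta = 1, as in Eq. 1.8:
beta_neutral = optimal_beta(r11=1, r12=1, r21=1, r22=1, p1=0.5, p2=0.5)

# If h1 is twice as likely a priori, the threshold for accepting h1 halves (Eq. 1.15):
beta_biased = optimal_beta(r11=1, r12=1, r21=1, r22=1, p1=2 / 3, p2=1 / 3)
```

This makes the qualitative point above concrete: biasing the prior toward h1 lowers the evidence needed to choose h1.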
The formulation presented in this section can be thought of as how a subject may
compute a decision in a 2AFC task given only one unit of evidence. However, more
realistic tasks usually demand that the subject incorporate multiple pieces of evidence
over some time epoch. How does one properly adjust the decision variable to
accommodate multiple pieces of evidence? This question is answered in the
next section by the Neyman-Pearson lemma and the sequential probability ratio test.
1.2.2 The Neyman-Pearson lemma and the sequential probability ratio test
We now study the situation in which a subject is successively presented with pieces
of evidence. First, we study the interrogation paradigm, in which subjects are given
a fixed amount of evidence {e1, . . . , eN} and are required to identify if these are
sampled from the distribution corresponding to hypothesis h1 or that corresponding
to hypothesis h2. The natural adjustment to our decision variable from above is to
consider the likelihood ratio of all of the data,

    log[LR12] = log[ P(e1, e2, ..., eN | h1) / P(e1, e2, ..., eN | h2) ]
              = Σ_{j=1}^{N} log[ P(ej|h1) / P(ej|h2) ] ,                    (1.16)
where we have taken a logarithm to let us work with sums instead of products.
We have also assumed that the data are independent, an assumption which is usually
enforced by the experimentalist. Eq. 1.16 will often be referred to as the log-likelihood
or log-likelihood ratio. The decision rule is then essentially equivalent to that from
above: if logLR12 ≥ Θ for some threshold Θ ∈ R, then accept hypothesis h1, and if
logLR12 < Θ, accept hypothesis h2.
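Under the same illustrative Gaussian assumptions as before (means ±1, unit variance; an arbitrary choice, not prescribed by the text), the interrogation rule of Eq. 1.16 amounts to summing per-sample log-likelihood ratios:

```python
def log_lr(e, mu1=1.0, mu2=-1.0, sigma=1.0):
    """Per-sample log-likelihood ratio log[P(e|h1)/P(e|h2)] for Gaussian evidence."""
    return ((e - mu2) ** 2 - (e - mu1) ** 2) / (2 * sigma ** 2)

def interrogate(samples, theta=0.0):
    """Eq. 1.16: sum the log-LRs of all samples; accept h1 iff the sum >= theta."""
    total = sum(log_lr(e) for e in samples)
    return ("h1" if total >= theta else "h2"), total

choice, total = interrogate([0.8, -0.2, 1.5])
# with means +/-1 and sigma = 1, each term reduces to 2*e, so total = 4.2 here
```

Note how the Gaussian case makes the log-LR linear in the evidence, which is what lets the accumulated sum behave like a random walk in the next section.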
This widely used likelihood-ratio test was proven to be optimal by Neyman and
Pearson in 1933 [NP33]. The Neyman-Pearson lemma states that the likelihood-ratio
test which accepts h1 if and only if logLR12 ≥ Θ is the most powerful statistical test
of its size. Size here refers to the false alarm probability or error rate: α = P(H2|h1),
where H2 refers to the event that hypothesis h2 is chosen according to this decision
rule. Several more modern proofs of the Neyman-Pearson lemma can be found in the
statistical literature [GS66, Gho70, Leh59].
The Neyman-Pearson lemma implies that if Θ = log[1] = 0, then the likelihood-
ratio test above maximizes accuracy by delivering the most likely hypothesis and
minimizing the total error probability. Thus, the likelihood-ratio test is optimal
for the interrogation paradigm.
The case of the free response paradigm is less straightforward. These 2AFC tasks
require the subject to consider two elements: the decision between the hypotheses h1
and h2, and the decision about whether to respond or to continue accumulating
evidence[5]. More precisely, after collecting the nth piece of evidence en, the subject
determines whether or not to immediately execute his decision based on e1, . . . , en,
or to wait and collect en+1. After collecting the nth piece of evidence, our decision
variable is essentially the same as before, but now the log-likelihood ratio needs to be
updated for each new piece of evidence so that after n observations,

    log[LR12(n)] = log[ P(e1, e2, ..., en | h1) / P(e1, e2, ..., en | h2) ]
                 = Σ_{j=1}^{n} log[ P(ej|h1) / P(ej|h2) ] .                    (1.17)
A sensible choice of decision rule is to update log LR12 after each new unit of evidence
is collected, and to ask whether or not log LR12 has crossed some positive or negative
number. Hypothesis h1 is accepted if log LR12 exceeds some positive threshold
Z0, and h2 is accepted if log LR12 drops below some negative threshold Z1. This
choice of decision rule applied to LR12 from Eq. 1.17 is what comprises the sequential
probability ratio test (SPRT).

[5] Such language makes us think of the trade-off between exploration and exploitation. In our case, the subject is faced with a similar explore vs. exploit decision at each moment of evidence accumulation. Indeed, the subject may choose to explore by waiting and collecting more evidence, or the subject may exploit the collected evidence by choosing either h1 or h2 and responding accordingly. In general the trade-off between exploitation and exploration is poorly understood: even when the objective functions are well specified there may still not be a known optimal policy for trading off between explore vs. exploit. Here we have a simple situation in which we may derive the solution for a nontrivial explore-exploit task.
Barnard [Bar46] and Wald [WW48, Wal04] independently showed that the SPRT
is optimal for the free response paradigm in the sense that, for a given level of
accuracy that must be attained, the SPRT requires the smallest number of samples
on average to settle on a decision. Other proofs of the optimality of the SPRT may
be found in [GS66, Gho70, Leh59].
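The SPRT itself is a short loop; the constant-evidence stream and threshold values in the check below are illustrative assumptions, chosen so the example is deterministic:

```python
def sprt(evidence_stream, log_lr, z0=3.0, z1=-3.0, max_steps=10_000):
    """Sequential probability ratio test: accumulate log-likelihood ratios
    (Eq. 1.17) until the running sum crosses z0 (accept h1) or z1 (accept h2)."""
    x = 0.0
    for n in range(1, max_steps + 1):
        x += log_lr(next(evidence_stream))
        if x >= z0:
            return "h1", n
        if x <= z1:
            return "h2", n
    return "undecided", max_steps

# Deterministic check: constant evidence e = 1 with log-LR 2e crosses z0 = 3 in two steps.
choice, n = sprt(iter([1.0] * 10), lambda e: 2.0 * e)
```

The number of samples n returned alongside the choice is exactly the quantity the Barnard-Wald optimality result minimizes on average.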
1.2.3 The continuum limit of the SPRT
When considering the SPRT, one often thinks of the decision variable as a discrete
random walk, where the log likelihood ratios of independent samples of evidence
produce independent increments at each time step. Writing log[LR12(n)] = xn for
simplicity, we have

    xn = xn−1 + log[ P(en|h1) / P(en|h2) ] .                    (1.18)
The SPRT is performed by choosing some initial value for x0 and incrementing ac-
cording to Eq. 1.18 until xn crosses some positive threshold Z0 or negative threshold
Z1. The three values x0, Z0, and Z1 encode the priors for the hypotheses h1 and
h2, as well as any reward biases that are present. Because the SPRT is unchanged
if we shift x0, Z0 and Z1 by a constant value, we may assume that the thresholds
are symmetric, that is Z0 = Z and Z1 = −Z for some number Z. The way in which
thresholds and initial conditions are to be chosen is carefully studied in [BBM+06]. In
the case of unbiased stimuli and equal rewards, the SPRT can be seen as a discrete
random walk with x0 = 0 that continues according to Eq. 1.18 until it crosses some
positive threshold +Z or negative threshold −Z.
As discrete samples are taken more and more rapidly, the discrete random walk
Eq. 1.18 approaches a continuous time variable X(t) after proper technical consider-
ations are made. The details of this limiting procedure are covered in appendix A of
[BBM+06], where it is shown that the continuum limit of SPRT then produces the
drift diffusion model (DDM)
dX = Adt+ σdW, X(0) = x0, (1.19)
where X is the state of accumulated evidence for a particular choice, Adt is the aver-
age increase in evidence per unit time (drift), and the diffusion term σdW represents
Gaussian distributed white noise with mean 0 and variance σ²dt. It is important
to note that A and σ depend upon specific properties (i.e. means and variances) of
the distributions from which the samples are drawn, and that in our form we have
implicitly assumed we are sampling from Gaussian distributions. When X(t) crosses
either of the thresholds ±Z, accumulation halts and a decision is made (accepting h1
if +Z is crossed and h2 if −Z is crossed). Often we use another parameter called the
non-decision time T0 to account for any sensory delays before any evidence accumu-
lation is made, and motor delays prior to the response. This is introduced to avoid
arbitrarily small reaction times which are impossible in real data because of the time
it takes for signals from the retina to travel to the relevant decision areas of the brain,
and for the execution of motor actions.
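A minimal Euler-Maruyama simulation of Eq. 1.19 under the free response paradigm might look as follows; all parameter values (drift, noise, threshold, non-decision time) are placeholders for illustration:

```python
import random

def simulate_ddm(A=0.2, sigma=1.0, z=1.0, x0=0.0, t0=0.3, dt=1e-3, rng=None):
    """Euler-Maruyama simulation of dX = A dt + sigma dW (Eq. 1.19) under the
    free response paradigm; returns (choice, reaction_time) with T0 = t0 added
    to account for sensory and motor delays."""
    rng = rng or random.Random()
    x, t, sqrt_dt = x0, 0.0, dt ** 0.5
    while -z < x < z:
        x += A * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
    return ("h1" if x >= z else "h2"), t + t0

rng = random.Random(1)
trials = [simulate_ddm(rng=rng) for _ in range(200)]
accuracy = sum(choice == "h1" for choice, _ in trials) / len(trials)
```

With positive drift A, "h1" plays the role of the correct response, so `accuracy` estimates 1 − ⟨ER⟩; every reaction time exceeds the non-decision time t0 by construction.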
The DDM was introduced into the psychological literature in [Rat78] as a proposed
theory for memory retrieval. We sometimes refer to the DDM as presented in this
section as the pure DDM to distinguish it from the extended DDM that we summarize
in §1.5.1. In Ch. 4 we construct a version of the DDM involving non-Gaussian noise,
and compare its data fitting power with that of the extended DDM.
The pure DDM can also be used to solve a continuous time version of the interro-
gation paradigm by supposing that at the interrogation time T , the subject responds
according to the sign of X(T ). If X(T ) > 0, then the subject accepts hypothesis h1,
and if X(T ) < 0, the subject accepts hypothesis h2.
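The interrogation version uses the same integration step and simply reads off the sign of X(T); the parameter values are again placeholders:

```python
import random

def ddm_interrogate(T, A=0.2, sigma=1.0, x0=0.0, dt=1e-3, rng=None):
    """Integrate dX = A dt + sigma dW up to the interrogation time T,
    then accept h1 if X(T) > 0 and h2 otherwise."""
    rng = rng or random.Random()
    x = x0
    for _ in range(int(round(T / dt))):
        x += A * dt + sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
    return "h1" if x > 0 else "h2"
```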
When working with the DDM, we can write down explicit formulae for the mean
decision times and error rates which can be used to describe overall model behavior.
These details are presented in §1.3.3 when we restate some relevant formulas for the
Ornstein-Uhlenbeck model, of which the DDM is a special case.
The optimality of the DDM in both the interrogation and free response paradigms
can be heuristically justified by recalling that it is the continuous time limit of either
the SPRT or likelihood ratio test from §1.2.2, both of which have been shown to
be optimal. Furthermore, [BBM+06] gives other direct arguments establishing the
optimality of the DDM, and also provides formulas for how to optimally set the
thresholds as well as how to modify the pure DDM presented here for biased priors.
Although these results are indeed very useful, we choose to not review them here.
The reader is encouraged to refer to appendix A of [BBM+06] for a clear exposition
of these results.
1.2.4 Relevant experimental results concerning drift diffusion processes and optimality
The various ratio tests and the DDM presented above provide us with a good set
of tools for describing and fitting data for various 2AFC tasks. One then wonders
if these statistical tests represent how animal and human brains actually behave.
Not surprisingly, animals do not always behave optimally. One famous example is
the matching law first noted by Herrnstein in 1961 [Her61, Her70] which is reviewed
in [Her97]. Herrnstein showed that pigeons would often choose a button which yielded
twice as much reward at only twice the frequency of the alternative, whereas an optimal
pigeon would choose the higher-rewarded button 100% of the time. There are also several
cases of humans performing suboptimally (e.g. [LS91, LT89]).
Yet in tasks where subjects perform suitably constrained 2AFC tasks with sufficient
training, the DDM seems to provide good fits to both monkey and human
behavioral data [BHHC10, BBM+06, SCB+09, RM07, GP08, RGMN10,
BSN+11, RHH+07, RCS03]. Even more impressive are the neurophysiological
connections that have linked the DDM process from Eq. 1.19 to neural activity. For
example, Hanes and Schall demonstrated that neurons in the frontal eye field exhibit
activity which resembles the drift rate of a diffusion process [HS96], and others have
contributed further evidence from in vivo recordings in monkeys that oculomotor
decision making in the brain mimics a DDM, with neural activity rising to a threshold
before movement initiation [Sch01, MRDS03, GS01, SR04]. In the case of the
random dots motion task, LIP recordings are believed to represent accumulating evi-
dence in neurons having receptive fields containing the response target corresponding
to a particular alternative, as shown in Fig. 1.2. Differences between the firing rates
corresponding to the two different directions of motion then appear to behave like
sample paths of the DDM [RS02]. It should be noted, however, that the DDM has
only been shown to fit the evidence accumulation period during these dots motion
tasks. For other phases of the trial where evidence is not being accumulated, the
DDM does not necessarily fit the neural data well. An example of this is in the neu-
ral data shown at the end of Ch. 2 – the LIP recordings during the reward period
before the accumulation phase do not look at all like DDM processes. Furthermore,
the neural data for a different task in Ch. 3 shows many features that a straightfor-
ward application of the DDM would not be able to fit. There, we construct a model
which incorporates integrate-to-threshold units with other types of units in order to
fit the neural data.
These various experimental results show that we can find signals in the brain which
represent the decision variables in a statistical test. But how does the brain assemble
these signals? How might neurons actually perform the computations required for
evidence accumulation and hypothesis testing? In the next section we summarize
a series of biophysically-motivated modeling results which begin to systematically
connect single cell spiking models to the DDM.
1.3 Working down: Reducing a biologically plausible model to an OU process
Throughout the animal kingdom, brains can contain up to billions of neurons (with
humans having ≈ 10^11 neurons), each of which is nontrivial to model. Perhaps the
most detailed dynamical models we have of neurons are those inspired by Hodgkin
and Huxley, where one may construct large systems of ODEs and PDEs which approximate
the specific ionic currents at specific locations along a neuron's axon, cell body,
or dendritic tree ([HH52b, HHK52, HH52a, HH52d, HH52e, HH52c, GC10, DA05]).
These models have been very successful at reproducing the key spatiotemporal prop-
erties of single cells, most important of which is that of an action potential. However,
such models are analytically intractable and relatively expensive to simulate, and
if one wishes to model networks of even hundreds of neurons, the systems quickly
become computationally infeasible.
A common simplification of these multiunit compartmental models is the integrate
and fire model. Here, one does not explicitly simulate the time course of an action
potential, but rather only models the subthreshold voltage potential up until some
threshold. When the threshold is reached, a delta function spike occurs and the
voltage is reset to its resting potential. Various flavors of these integrate and fire
neurons have been used to simulate models of tens of thousands of neurons [IE08,
Izh03], and one simulation by Izhikevich explicitly modeled 10^11 neurons and almost
10^15 synapses (although this simulation of 1 second of real time took 50 days on 27
3GHz processors).
The key to all of these approaches is that they aim to directly model neurons,
which are the basic building blocks of the brain. Any results about the overall network
behavior then need to emerge from the explicit modeling of single neurons and their
connections. We say that models constructed in this way are biophysically-based or
biophysically-motivated.
In this section we wish to present how the DDM can also be viewed as a reduction
from biophysically-based models for perceptual decision making. We begin in §1.3.1
with an overview of some biophysically-based models and illustrate their connection
with a popular model called the Leaky Competing Accumulator (LCA) model. Then
in §1.3.2 we show how the LCA can be reduced, given certain parameter ranges and
assumptions, to an OU model, of which the DDM is a specific case.
Throughout this development, the key point is to see the DDM in a different way,
this time as a simplified version of a biophysically motivated model. This gives us
the justification for considering the DDM as a plausible model for neural activity.
Furthermore, this reduction from realistic models also allows us to further justify the
link between the DDM and the related neural activity that was reviewed in §1.2.4.
1.3.1 From spiking neurons to the Leaky Competing Accumulator model (LCA)
Over the past decade, researchers have been able to demonstrate in computer sim-
ulation that biophysically realistic cortical networks of spiking neurons are able to
reproduce the behavioral dynamics of perceptual decisions. In 2002, Wang presented
a model of area LIP by simulating spiking neurons and drawing neural and synap-
tic properties of the model from anatomical and physiological observations [Wan02].
Subsequent studies have further demonstrated how this biophysically-based spiking
model reproduces experimentally observed behaviors in perceptual decision tasks.
These studies have been able to connect many of these observations with biological
properties of the spiking neurons, although rigorous reduction theorems are lacking,
except for local bifurcations [WW06, WHMNW07, EWLH11, RL08, GH83].
A looser connection between Wang's model and accumulator models was established
by Bogacz et al. in [BBM+06]. It is still not completely understood how one
may reduce a detailed neural network model to a noisy connectionist unit, but by
assuming such a reduction Bogacz et al. showed that an averaged version of Wang’s
model in [Wan02] may be viewed as a biologically realistic implementation of the
Leaky Competing Accumulator model of Usher and McClelland [UM01]. We also
note that another biologically inspired model of perceptual choice which has been
reduced to the LCA [BUZM07, BBM+06] is that of Mazurek et al. [MRDS03].
The LCA model is a two-dimensional system of stochastic differential equations
whose state variables (x1(t), x2(t)) describe the activities of two mutually-inhibiting
populations of neurons, each of which receives noisy sensory input [UM01, McC79]:

    dx1 = [−γ x1 − β f(x2) + I1(t)] dt + σ dW1 ,
    dx2 = [−γ x2 − β f(x1) + I2(t)] dt + σ dW2 ,                    (1.20)
where f(·) is a sigmoidal type activation function, γ and β, respectively, denote the
strengths of leak and inhibition, and σdWj are independent white noise (Wiener)
increments of r.m.s. strength σ. The inputs I1(t), I2(t) are generally time dependent
signals which vary over the course of a single decision process (i.e. trial).
Under the interrogation paradigm the choice is determined by the difference x(t) :=
x1(t) − x2(t): at interrogation time T, if x(T) ≥ 0 the hypothesis corresponding to
stimulus I1 is chosen, and vice versa for x(T) < 0. Similarly, for the free response
paradigm we set positive and negative thresholds for the decision variable x(t), and
when a threshold is crossed the corresponding decision is made.
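Eq. 1.20 can be integrated with a simple Euler-Maruyama scheme; the particular sigmoid f(·) and the parameter values below are illustrative choices, not values fitted anywhere in this thesis:

```python
import math
import random

def simulate_lca(I1, I2, gamma=0.2, beta=0.2, sigma=0.3, dt=1e-3, T=2.0, rng=None):
    """Euler-Maruyama integration of the two-unit LCA (Eq. 1.20);
    returns the activities (x1, x2) at time T."""
    rng = rng or random.Random()
    f = lambda v: 1.0 / (1.0 + math.exp(-4.0 * (v - 0.5)))  # illustrative sigmoid
    x1 = x2 = 0.0
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(T / dt))):
        dx1 = (-gamma * x1 - beta * f(x2) + I1) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        dx2 = (-gamma * x2 - beta * f(x1) + I2) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        x1, x2 = x1 + dx1, x2 + dx2
    return x1, x2

# Interrogation paradigm: report the sign of x1 - x2 at time T.
x1, x2 = simulate_lca(I1=0.6, I2=0.4, rng=random.Random(2))
choice = "h1" if x1 - x2 >= 0 else "h2"
```

Setting sigma = 0 removes the noise and exposes the deterministic competition: the unit with the larger input wins.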
1.3.2 Reduction of LCA to an Ornstein-Uhlenbeck (OU) process
It has been shown that, under suitable conditions, the LCA model from above
can be reduced to a one-dimensional Ornstein-Uhlenbeck model, which is a simple
generalization of the DDM [BBM+06, BGH+05, BH01, FHRN09]. Here we present
the reduction from [FHRN09], which demonstrates how the noiseless version of the
LCA reduces to an OU process in the presence of constant inputs. We present this
noise-free reduction for clarity, so that the technical tools involving stochastic center
manifolds [Box89, Box91, AJM+95] need not be developed here.

Figure 1.3: Illustration showing nullclines and fixed points for the LCA, Eq. 1.20. The two curved lines represent the nullclines, and their intersections are the three fixed points. The two outermost fixed points are stable, and the middle fixed point is a saddle. The two stable fixed points shown represent relatively high activity in one population versus the other, whereas the saddle point represents moderate activity in both populations. The dashed diagonal line represents the slow attracting manifold.

When the inputs I1, I2 are constant and σ = 0, equilibrium solutions of Eq. 1.20 lie at the intersections
of the nullclines given by γx1 = −βf(x2) + I1 and γx2 = −βf(x1) + I2. Depending
upon the precise form of f(·) and the parameter values I1, I2, β, γ, there may be
one, two, or three equilibrium points. Each equilibrium point corresponds to either
moderate activity in both populations or relatively high activity for one population
versus the other, as illustrated in Fig. 1.3. Also shown in Fig. 1.3, if these nullclines lie
sufficiently close to each other over the activity range that encompasses the equilibria,
it follows that a one-dimensional, attracting, slow manifold exists that contains both
stable and unstable equilibrium points, and the solutions that connect them [GH83,
BH01].
To demonstrate these ideas more clearly, we first linearize the sigmoidal activation
function in Eq. 1.20 at the central equilibrium point (x̄, x̄) in the case of equal inputs
I = I1 = I2, where x̄ = (1/γ)[−βf(x̄) + I]. After parameterizing the sigmoid f(·) so
that df/dx(x̄) = 1, the linearized system can be written as

    dx1 = [−γ x1 − β x2 + I1(t)] dt + σ dW1 ,
    dx2 = [−γ x2 − β x1 + I2(t)] dt + σ dW2 ,                    (1.21)

and subtracting these equations yields a single scalar SDE for the differenced activity
x(t) = x1(t) − x2(t):

    dx = [λx + A(t)] dt + σ dW ,                    (1.22)

where λ = β − γ, A(t) = I1(t) − I2(t), and dW = dW1 − dW2 is again a white
noise increment. This is an Ornstein-Uhlenbeck (OU) process.
Eq. 1.22 describes the decision variable x(t) for our OU model. In the same way
as before, in the interrogation paradigm the sign of x is used to determine the choice
of response at interrogation time T . In the free response paradigm, a choice is made
when x crosses either a positive or negative threshold.
The stimulus difference A(t) represents the sensory evidence for a particular
hypothesis. For example, if a stimulus corresponding to hypothesis 1 is displayed,
this means that A(t) = I1 − I2 > 0, and a correct response will occur if the top
boundary is hit before the bottom boundary. Errors will occur when noise pushes
x to the bottom (incorrect) boundary before the top (correct) is reached, and they
will be more likely when I1 and I2 are close, i.e. when the inputs for the alternatives
are hard to distinguish in the presence of noise.
A particularly important case is when the leak γ and inhibition β are perfectly
balanced, so that λ = β − γ = 0. In this case, Eq. 1.22 is equivalent to the drift
diffusion process of Eq. 1.19, and we recover the DDM from §1.2.3.
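The reduced process of Eq. 1.22 is equally simple to simulate, and setting λ = 0 recovers the pure DDM update of Eq. 1.19; the parameter values below are placeholders:

```python
import random

def ou_step(x, lam, A, sigma, dt, rng):
    """One Euler-Maruyama step of dx = (lam*x + A) dt + sigma dW (Eq. 1.22)."""
    return x + (lam * x + A) * dt + sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)

def free_response(lam=0.0, A=0.2, sigma=1.0, z=1.0, dt=1e-3, seed=3):
    """Run the OU decision variable until it crosses +z or -z.
    lam = beta - gamma; lam = 0 is the balanced case, i.e. the pure DDM."""
    rng, x, t = random.Random(seed), 0.0, 0.0
    while -z < x < z:
        x, t = ou_step(x, lam, A, sigma, dt, rng), t + dt
    return ("upper" if x >= z else "lower"), t
```

Negative λ (leak dominates) pulls x back toward zero and forgets early evidence; positive λ (inhibition dominates) amplifies early evidence; λ = 0 integrates it perfectly.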
1.3.3 Working with the OU and DDM models
Here we present known expressions for error rates and reaction times which describe
overall OU (and DDM) model behavior. These quantities can then be compared with
experimental data, as is done in Ch. 2, or they can be used to compute reward rates,
as done in Ch. 5.
The first quantity we note is the probability distribution of solutions for Eq. 1.22,
which is governed by the forward Kolmogorov or Fokker-Planck equation [Gar85]:

    ∂p/∂t = −(∂/∂x)[(λx + A(t)) p] + (σ²/2) ∂²p/∂x² .                    (1.23)
A variety of insightful derivations of the Fokker-Planck equation can be found in
several good sets of lecture notes and textbooks [Gar85, Ris96, Fel66, Cro06]. Given
initial conditions, we can solve Eq. 1.23 analytically or numerically to compute the
probability distribution of solutions of Eq. 1.22. In Ch. 2 we study solutions of Eq.
1.23 for the interrogation paradigm.
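As a rough illustration of the numerical route, the sketch below (my own, not part of the thesis) integrates Eq. 1.23 with a naive explicit finite-difference scheme; the grid, time step, and the helper name `fokker_planck_step` are assumptions chosen for stability rather than taken from the text:

```python
import numpy as np

def fokker_planck_step(p, x, lam, A, sigma, dt):
    """One explicit finite-difference step of Eq. 1.23:
       p_t = -d/dx[(lam x + A) p] + (sigma^2 / 2) p_xx,
    with p held at ~0 on the (distant) grid boundaries."""
    dx = x[1] - x[0]
    flux = (lam * x + A) * p
    pn = p.copy()
    pn[1:-1] += dt * (-(flux[2:] - flux[:-2]) / (2 * dx)
                      + 0.5 * sigma**2 * (p[2:] - 2 * p[1:-1] + p[:-2]) / dx**2)
    return pn

# DDM limit (lam = 0): the mean of p(x, t) should advect at the drift rate A.
x = np.linspace(-10.0, 10.0, 801)
dx = x[1] - x[0]
p = np.exp(-(x**2) / 0.5) / np.sqrt(0.5 * np.pi)   # Gaussian: mean 0, variance 0.25
for _ in range(10000):                              # integrate to t = 1
    p = fokker_planck_step(p, x, lam=0.0, A=1.0, sigma=0.5, dt=1e-4)
mean_t1 = float(np.sum(x * p) * dx)                 # drifts toward A * t = 1
mass_t1 = float(np.sum(p) * dx)                     # probability is conserved
```

The explicit scheme requires a time step roughly below dx²/σ² for stability; for stiffer parameter regimes an implicit scheme would be the more robust choice.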
In [BBM+06], one may find closed form expressions for the mean error rate and
decision time for the OU model with constant drift rate A. The reader is referred
to Appendix A of [BBM+06] for these expressions; they are too cumbersome to be
included here and are not explicitly used in this thesis.
For the DDM, however, the expressions for the mean error rate and decision time
take on simpler forms which we state here for reference. Using techniques presented
in [BT92, BT93, Gar85, BBM+06] for computing first passage times, it can be shown
that the mean error rate 〈ER〉 and mean decision time 〈DT 〉 can be written as
\[
\langle ER\rangle = \frac{1}{1 + e^{2\tilde z \tilde a}} - \left(\frac{1 - e^{-2x_0\tilde a}}{e^{2\tilde z \tilde a} - e^{-2\tilde z \tilde a}}\right), \tag{1.24}
\]
\[
\langle DT\rangle = \tilde z\,\tanh(\tilde z \tilde a) + \left(\frac{2\tilde z\bigl(1 - e^{-2x_0\tilde a}\bigr)}{e^{2\tilde z \tilde a} - e^{-2\tilde z \tilde a}}\right) - x_0. \tag{1.25}
\]
These expressions will be used in chapters 2 and 5.
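For reference, a direct transcription of Eqs. 1.24-1.25 into code might look as follows (an illustrative sketch; the function name and parameter values are mine, and the variables follow the normalized notation of [BBM+06]):

```python
import numpy as np

def ddm_er_dt(A, sigma, z, x0=0.0):
    """Mean error rate and decision time of the pure DDM (Eqs. 1.24-1.25),
    in the normalized variables of [BBM+06]: a~ = (A/sigma)^2, z~ = z/A,
    and normalized starting point x0/A (x0 > 0 biases the start toward
    the correct boundary)."""
    a = (A / sigma) ** 2
    zt, xt = z / A, x0 / A
    denom = np.exp(2 * zt * a) - np.exp(-2 * zt * a)
    er = 1.0 / (1.0 + np.exp(2 * zt * a)) - (1.0 - np.exp(-2 * xt * a)) / denom
    dt = zt * np.tanh(zt * a) + 2 * zt * (1.0 - np.exp(-2 * xt * a)) / denom - xt
    return er, dt

er0, dt0 = ddm_er_dt(A=0.5, sigma=0.5, z=1.0)          # unbiased start
er1, dt1 = ddm_er_dt(A=0.5, sigma=0.5, z=1.0, x0=0.2)  # start nearer correct bound
```

With an unbiased start the bias terms vanish and the familiar expressions 1/(1 + e^{2z̃ã}) and z̃ tanh(z̃ã) are recovered; biasing the start toward the correct boundary lowers both the error rate and the decision time.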
1.4 Thesis overview
The OU Model and the particular case of the DDM are the crucial building blocks of
the modeling efforts in this thesis. In essence, this thesis is a further demonstration
of how these accumulator models are useful for studying a variety of psychological
phenomena including, but not limited to, perceptual decision making.
In Ch. 2, we adapt the OU model to fit a more generalized version of the motion
dots task where monkeys are now faced with biased rewards. By doing so, we demon-
strate that monkeys shift their behaviors in a systematic way, and that they do so in
a near optimal manner. We also show some fits of the OU model to electrophysiolog-
ical data and find that λ ≈ 0, giving some further evidence that the DDM is a good
model for both the behavior and neural activity related to perceptual choice.
In Ch. 3, we construct a multi-unit noisy connectionist model for the covert search
task mentioned in §1.1.2 [OSBG06]. We first reanalyze the data and discover some
new trends that were not fully established before, and then systematically construct a
model which explains the key behavioral and electrophysiological phenomena demon-
strated in the experimental data. Our model proposes that LIP plays more of an
attentional role in this covert search task – in fact LIP is not even necessary for the
task, although its inclusion aids performance. Our model is shown to come close
to the data in some cases, although work remains to be done.
In Ch. 4 we study a different generalization of the DDM, this time assuming that
the noise can have jumps in addition to Wiener increments. We demonstrate that
this simple change in the distribution of noise allows the pure DDM to reproduce fast
error trials given unbiased initial data, something which other models require more
parameters to reproduce. We fit our model to human subject data and compare it
with the extended DDM presented in §1.5.1.
In Ch. 5 we address the more general question of why humans, despite having a
prefrontal cortex with billions of neurons, can perform only a few tasks at
once. We construct an abstract model for studying capacity constraints on cognitive
control and use the DDM as a generalized model for a task. After studying various
aspects of the constructed model, large scale simulations demonstrate that a capacity
constraint does indeed arise out of the need for optimizing overall rewards.
We conclude the thesis with a short summary essay in Ch. 6.
1.5 Appendix: Two popular models for decision
making
In this section we discuss two models which deserve mention. First we review
a useful extension of the pure DDM, namely the extended DDM, first presented by
Ratcliff and Rouder in [RR98] as a model for 2AFC data. Then we summarize the
Linear Ballistic Accumulator Model of Brown et al., a model of rapid choice
that does not use stochastic accumulation like the DDM (and extended DDM).
1.5.1 The Extended DDM of Ratcliff
In many situations, the simple pure DDM as presented in the main text is not sufficient
to fit reaction times. For example, one key feature often seen in behavioral 2AFC data
is that error trials and correct trials demonstrate significantly different reaction time
distributions. In [RR98], Ratcliff and Rouder presented the extended DDM which adds 4
new parameters to the pure DDM in order to fit four different sets of behavioral data.
Recall that the pure DDM from above already contains 5 parameters: x0, Z, A, σ, and
T0. Over the years, many reports have established that the extended DDM is able to
fit behavioral 2AFC data well [RT02, RS04, RM07, SCB+09].
[Figure 1.4 schematic: a sample DDM path between boundaries −z and z with mean drift A, trial-to-trial variability in starting point (sx), drift rate (sA), and non-decision time (st), the standard deviation c√t of sample-path positions at time t, and the first-passage time density conditioned on an upper-boundary (z) crossing.]
Figure 1.4: Extended DDM illustration from [SCB+09]. The notation here differs from that used in the original extended DDM paper [RR98].
The four additional parameters to the pure DDM which were introduced in [RR98]
are variability in starting point σx, variability in drift rate σA, variability in non-
decision time σT , and a proportion of contaminant reaction times p0. For each trial,
the initial condition x(0) was then sampled from a uniform distribution,
U[x0 − σx, x0 + σx], and the drift rate A was sampled from a Gaussian distribution with
mean A and standard deviation σA. The non-decision time T0 was also sampled from
its own uniform distribution U [T0−σT , T0+σT ]. Finally, each trial had a probability
p0 of being a contaminant trial. On these contaminant trials, a random response was
made, and the reaction time was not generated by the diffusion process but instead
was drawn from a uniform distribution spanning the observed RT range from the
data. The extended DDM is illustrated in Fig. 1.4.
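The sampling scheme just described is easy to sketch in code. The fragment below is illustrative rather than Ratcliff and Rouder's implementation: the helper name, the simulation step size, and the stand-in `rt_range` for the observed RT range are all assumptions:

```python
import numpy as np

def extended_ddm_trial(A, sigma, z, x0, T0, sA, sx, sT, p0,
                       rt_range=(0.2, 1.5), dt=2e-3, rng=None):
    """One trial of the extended DDM: sample trial-level parameters as
    described above, then run a pure DDM to threshold. rt_range stands in
    for the observed RT range used on contaminant trials (an assumption)."""
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() < p0:                        # contaminant trial
        return int(rng.integers(2)), float(rng.uniform(*rt_range))
    x = rng.uniform(x0 - sx, x0 + sx)            # starting-point variability
    drift = rng.normal(A, sA)                    # drift-rate variability
    t0 = rng.uniform(T0 - sT, T0 + sT)           # non-decision-time variability
    t, sqdt = 0.0, np.sqrt(dt)
    while abs(x) < z:                            # pure DDM between -z and z
        x += drift * dt + sigma * sqdt * rng.standard_normal()
        t += dt
    return (1 if x >= z else 0), t0 + t

rng = np.random.default_rng(11)
sims = [extended_ddm_trial(A=1.0, sigma=0.5, z=1.0, x0=0.0, T0=0.3,
                           sA=0.2, sx=0.2, sT=0.05, p0=0.02, rng=rng)
        for _ in range(200)]
acc = np.mean([c for c, _ in sims])
```

On contaminant trials the response and RT are decoupled from the diffusion, which is what lets the extended DDM absorb outlier responses.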
1.5.2 The Linear Ballistic Accumulator Model
One recent model for response times is the Linear Ballistic Accumulator [BH05, BH08]
which has seen several applications as a model of rapid choice [DABH09, FDB+08,
HBS09, LFEG09]. The basic idea is shown in Fig. 1.5. Here, two accumulators are
used, one for response A and another for response B. The activity of an accumulator
is initialized to a value uniformly sampled from [0, A]. Then, the drift rate for the
first accumulator is drawn from a normal distribution with mean dA and standard
deviation σ. The drift rate for the second accumulator is also drawn from a normal
distribution with mean dB and standard deviation σ. Then for each accumulator,
evidence accumulates in a noiseless (ballistic) and linear fashion according to their
respective drift rates. The various accumulators race to their respective thresholds,
and whichever hits threshold first executes the corresponding response. Like the
diffusion models, a non-decision time T0 is typically included. In [DABH09], Donkin
et al. performed a detailed numerical comparison between the Extended DDM and
the LBA and concluded that inferences about psychological processes made from real
data are unlikely to depend on the model that is used. The LBA has the advantage
of having one less parameter when compared to the extended DDM, whereas the
extended DDM has the principled derivations as presented in sections 1.2 and 1.3
(although the LBA is derived in [BH05] from a simplified deterministic version of
Figure 1.5: Simplified representation of the Linear Ballistic Accumulator model of Brown and Heathcote. Reprinted from [BH08], with permission from Elsevier.
the LCA of [UM01]). Furthermore, Goldfarb and Caicedo (personal communication)
have preliminary evidence that the LBA has difficulty staying close to the optimal
performance curve, which represents the relationship between error rate and decision
time that holds under the optimal threshold set in the DDM.
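A minimal sketch of a single LBA trial, assuming only the mechanics described above (the function name and the handling of the rare case in which every sampled drift is negative are my own choices, not from [BH05, BH08]):

```python
import numpy as np

def lba_trial(b, A_upper, drifts, s, T0, rng=None):
    """One trial of the Linear Ballistic Accumulator.
    b: response threshold; A_upper: top of the uniform start-point range;
    drifts: mean drift rate per accumulator; s: drift-rate std. dev.
    The rare all-negative-drift case is left unhandled (it returns an
    infinite RT); applications typically guard against it."""
    rng = rng if rng is not None else np.random.default_rng()
    starts = rng.uniform(0.0, A_upper, size=len(drifts))
    rates = rng.normal(drifts, s)                 # sampled once per trial
    times = np.full(len(drifts), np.inf)
    pos = rates > 0                               # only rising accumulators finish
    times[pos] = (b - starts[pos]) / rates[pos]   # ballistic (noiseless) rise
    winner = int(np.argmin(times))                # first to threshold responds
    return winner, T0 + float(times[winner])

# With no start-point or drift variability the model is fully deterministic:
w, rt = lba_trial(b=1.0, A_upper=0.0, drifts=np.array([2.0, 1.0]), s=0.0, T0=0.3)
```

Because each accumulator rises deterministically within a trial, RT variability in the LBA arises entirely from the trial-to-trial start-point and drift-rate sampling.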
2 Can monkeys choose optimally when
faced with noisy stimuli and unequal
rewards? 1
This chapter focuses on modeling the behavior of monkeys performing an extended
version of the motion dots discrimination task in which the amount of reward for one of
two alternatives may be doubled (if correctly selected) on certain trials. The monkeys
are informed of a particular trial’s reward structure by a visual cue, illustrated in the
“Reward” column of Fig. 2.1 and described below. We propose extensions of the
OU process presented in §1.3.3 to account for the influence of unequal rewards, and
make use of the derived psychometric functions (PMFs) which link model predictions
to monkey behavior. The PMFs are characterized by two parameters: midpoint
slope, which quantifies the subject’s ability to extract signal from noise, and shift,
which measures the bias applied to account for unequal rewards. We fit these PMFs
to data collected from two adult rhesus monkeys and find that, when behavior is
averaged over multiple sessions, the monkeys shift their PMFs in a nearly optimal
manner; remarkably, the two monkeys garner greater than 98% and 99%, respectively,
of their maximum possible rewards. We also present a simple fit to the electrophysiological
data during the accumulation period of the task in order to estimate some of the OU
1This chapter shares its title with a paper written by Samuel Feng, Alan Rorie, Philip Holmes, and William T. Newsome [FHRN09]. Most notable is Alan Rorie, who was primarily responsible for performing these extensive and thorough monkey experiments.
parameters.
We describe the experimental task in §2.1. §2.2 briefly reviews material from
§1.3, reducing the LCA model to an OU process and demonstrating how the resulting
PMFs change with experimental parameters. In §2.3 we compute the optimal shifts
of these PMFs due to biased reward conditions and mixed coherences between trials.
In §2.4 we demonstrate that the predicted PMFs from the OU model fit the data if
averaged over multiple sessions, and we compute how close the monkeys are to optimal
performance. We also present some work with the individual session data. In §2.5
we present some fits of the OU model to the averaged firing rate data during the
accumulation period and find that our OU process is approximately a drift diffusion
process. The chapter concludes with some future problems and discussion in §2.6.
2.1 Unequal rewards in the motion dots task
Figure 2.1: Diagram indicating the time course of cues during the biased rewards motion dots task. See text for details.
The full details of the behavioral study including equipment used, training of
monkeys, setup of experimental apparatus, and recording of electrophysiological data
can be found in [FHRN09]. The experiment was performed at Stanford University
by Alan Rorie, under the supervision of W.T. Newsome. We are indebted to them
for sharing their data.
Here we focus on the details presented in Fig. 2.1, which illustrates the sequence
of events which formed a typical trial. First, a small, yellow dot appears and the
monkey is required to fixate upon it for 150 msec. Next, two saccade targets appear
(open gray circles) on opposite sides of the fixation point and aligned with the axis of
motion to be discriminated. By convention, target 1 (T1) corresponds to a positive
coherence stimulus (right-going motion), and target 2 (T2) to negative coherence
stimulus (left-going motion). After 250 msec the targets change color, indicating the
magnitude of reward available for correctly choosing either target. A blue target
indicates a low magnitude (L) reward of 1 unit (1 drop) of juice, while a red target
indicates a high magnitude (H) reward of 2 units of juice. We denote by r1 ∈ {1, 2}
the magnitude of reward available for correctly choosing T1, and similarly define r2.
This yields four reward conditions overall, shown in the 4 panels of the “Reward”
column of Fig. 2.1: (1) LL (r1 = r2 = 1), in which both targets are blue, (2) HH
(r1 = r2 = 2), in which both are red, (3) LH (r1 = 1, r2 = 2), in which T1 is blue
and T2 is red, and (4) HL (r1 = 2, r2 = 1), in which T1 is red and T2 is blue. These
colored targets are displayed 250 msec after the open gray circles appear, and remain
on throughout the trial. After another 250 msec, the motion stimulus begins.
The motion stimulus consists of a field of randomly moving dots, a certain per-
centage of which were coherently moving either towards T1 or T2. The proportion
of dots moving towards the correct target is called the coherence – a trial with −3%
coherence indicates that 3% of the dots coherently move towards T2, while all of the
others move randomly. The motion stimulus remains on for 500 msec. Following mo-
tion stimulus offset, the monkey is required to maintain fixation for a variable delay
period of 300-550 msec (varied uniformly across trials within each session). After this
the fixation point disappears, cueing the monkey to report his decision with a
saccade to the target corresponding to the perceived direction of motion. Monkeys must
respond within 1000 msec; otherwise the trial is discarded. If he chooses the correct
direction, he is rewarded according to the color of the chosen target. Until the Go!
period when the monkey can respond, fixation is enforced throughout the trial, and
breaks of fixation are penalized by aborting the trial and enforcing a time-out period
before the next trial.
Rorie et al. collected data from two monkeys which we call monkey A and monkey
T. Trials were presented in block-randomized order. For monkey A, they employed
13 possible signed coherences:
{0,±1.5%,±3%,±6%,±12%,±24%,±48%}
which, along with the four reward conditions, yields 52 conditions overall. For mon-
key T, the two lowest motion coherences ±1.5%,±3% were eliminated because this
animal’s psychophysical thresholds were somewhat higher than those of monkey A,
giving 36 conditions overall. The behavioral data analyzed here consists of 35 sessions
from monkey A (totaling 66933 trials) and 25 sessions from monkey T (totaling 32751
trials).
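The condition grids can be reconstructed directly from the counts above (an illustrative snippet; the variable names are mine):

```python
# Signed coherences and reward conditions as described above; monkey T
# lacks the two lowest coherence magnitudes.
cohs_A = [0.0] + [s * c for c in (1.5, 3.0, 6.0, 12.0, 24.0, 48.0)
                  for s in (+1.0, -1.0)]
cohs_T = [c for c in cohs_A if abs(c) not in (1.5, 3.0)]
rewards = {"LL": (1, 1), "HH": (2, 2), "LH": (1, 2), "HL": (2, 1)}  # (r1, r2)

conditions_A = [(c, r) for c in cohs_A for r in rewards]   # 13 x 4 = 52
conditions_T = [(c, r) for c in cohs_T for r in rewards]   #  9 x 4 = 36
```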
2.2 Predicting psychometric functions (PMFs) with
an accumulator model
We begin our model construction with the LCA from §1.3.1. More precisely, we
suppose that there are two states (x1(t), x2(t)) which represent short-term averaged
firing rates of two mutually-inhibiting pools of LIP neurons, which are sensitive to
alternatives 1 and 2, respectively. We understand that decisions are almost certainly
formulated through interactions among several oculomotor areas, but note that the
causal role of LIP has been demonstrated in [HDS06]. In the development here, each
population receives noisy sensory input from the stimulus along with input derived
from reward expectations. For clarity, we restate the LCA model equations here:
\[
dx_1 = \bigl[-\gamma x_1 - \beta f(x_2) + I_1(t)\bigr]dt + \sigma\,dW_1\,,
\qquad
dx_2 = \bigl[-\gamma x_2 - \beta f(x_1) + I_2(t)\bigr]dt + \sigma\,dW_2\,, \tag{2.1}
\]
where the meanings of parameters are stated in §1.3.1.
From here we reduce the linearized LCA to an OU process as presented in §1.3,
which yields a single scalar SDE for the activity difference x := x1 − x2:
\[
dx = \bigl[\lambda x + A(t)\bigr]dt + \sigma\,dW\,, \tag{2.2}
\]
where λ = β − γ is the difference between leak and inhibition, A(t) = I1(t) − I2(t),
and dW = dW1 − dW2 are independent white noise increments. To complete the
model formulation we observe that our experiment is performed according to the
interrogation protocol with interrogation time T , as the motion stimulus stays on for
a fixed duration, after which monkeys are required to make a response (an eye saccade
to T1 or T2) which reports the direction of motion2. Note from the definition of x that
evidence for alternative 1 manifests itself as positive drift or A = I1 − I2 > 0, and for
alternative 2 vice versa, and that we have a correct response if sign(x(T )) = sign(A)
(since coherence is fixed during a trial, A(t) will be assumed constant during the
stimulus period, although others have assumed variable drift rates for OU processes
[EHL+08]).
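As a sanity check on this interrogation-protocol setup, the following sketch (illustrative, not from the thesis) estimates the accuracy P(sign x(T) = sign A) for Eq. 2.2 by Monte Carlo; in the λ = 0 limit, x(T) is exactly Gaussian, so the estimate can be compared against Φ(A√T/σ):

```python
import numpy as np

def interrogation_accuracy(lam, A, sigma, T, x0=0.0, n=10000, dt=1e-3, seed=0):
    """Monte Carlo estimate of P(sign x(T) = sign A) for Eq. 2.2 under the
    interrogation protocol, via a vectorized Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    x = np.full(n, float(x0))
    sqdt = np.sqrt(dt)
    for _ in range(round(T / dt)):
        x += (lam * x + A) * dt + sigma * sqdt * rng.standard_normal(n)
    return float(np.mean(np.sign(x) == np.sign(A)))

# DDM limit (lam = 0, x0 = 0): x(T) ~ N(A*T, sigma^2 * T), so accuracy
# approaches Phi(A * sqrt(T) / sigma) ~ 0.760 for these values.
acc = interrogation_accuracy(lam=0.0, A=1.0, sigma=1.0, T=0.5)
```

For λ = 0 the Euler scheme is exact in distribution (the increments sum to a Gaussian), so any discrepancy here is pure Monte Carlo error.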
The discussion from §1.3.3 reminds us that if λ = 0, Eq. 2.2 describes a drift
diffusion (DD) process, which is a continuum limit of the sequential probability ratio
test (§1.2.3) and is optimal for 2AFC tasks in that, given a fixed decision time, it
maximizes accuracy. Since we have no reason for assuming that λ = 0, we elect
2We choose not to explicitly model the delay and response periods, instead electing to assume that the monkeys' responses are locked in as soon as the stimulus terminates at time T.
to use the Ornstein-Uhlenbeck (OU) model and proceed to derive expressions with
which we can study the monkeys’ behavior in our biased rewards 2AFC task. In
particular, we wish to study the monkeys’ psychometric functions (PMFs), which
represent how coherence affects accuracy for an individual. In doing so, we can
analyze the optimality of each monkey’s behavior and determine if the monkeys are
properly incorporating the additional biased reward information.
From §1.3.3 we know we can compute the probability of choosing alternative 1
under the interrogation protocol by computing the probability distribution of solu-
tions p(x, t) from the Fokker-Planck equation (1.23). If we suppose that the initial
data p(x, 0) are Gaussian with mean µ0 and variance ν0,
\[
p(x,0) = \frac{1}{\sqrt{2\pi\nu_0}}\,\exp\!\left[-\frac{(x-\mu_0)^2}{2\nu_0}\right], \tag{2.3}
\]
then the distribution of solutions of Eq. 2.2 is itself Gaussian as time evolves:
\[
p(x,t) = \frac{1}{\sqrt{2\pi\nu(t)}}\,\exp\!\left[-\frac{(x-\mu(t))^2}{2\nu(t)}\right], \tag{2.4}
\]
where
\[
\mu(t) = \mu_0 e^{\lambda t} + \int_0^t e^{\lambda(t-s)} A(s)\,ds
\quad\text{and}\quad
\nu(t) = \nu_0 e^{2\lambda t} + \frac{\sigma^2}{2\lambda}\bigl(e^{2\lambda t} - 1\bigr) \tag{2.5}
\]
define the integrated stimulus and integrated noise. Eq. 2.4 can be verified by directly
computing its partial derivatives. Eqs. 2.5 are now the central focus of our study,
as they are directly linked to the distribution of solutions p(x, t) through Eq. 2.4,
and, as we shall see below, are directly responsible for the shapes of the psychometric
functions. It is useful to note that in the DD limit of λ = 0, Eqs. 2.5 simplify to
µ(t) = µ0 +
∫ t
0