To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck,...

25
To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time data: Theory and implications. Psychological Review, 2, 208-218. doi:10.1037/rev0000019

Transcript of To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck,...

Page 1: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

To cite this article:

Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choicereaction time data: Theory and implications. Psychological Review, 2, 208-218.doi:10.1037/rev0000019

Page 2: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

Factoring out non-decision time in choice RT data: Theory andimplications

Stijn Verdonck and Francis TuerlinckxFaculty of Psychology and Educational Sciences

KU Leuven, University of Leuven

Choice RT experiments are an invaluable tool in psychology and neuroscience. A common as-sumption is that the total choice response time is the sum of a decision and a non-decision part(time spent on perceptual and motor processes). While the decision part is typically modeledvery carefully (commonly with diffusion models), a simple and ad hoc distribution (mostlyuniform) is assumed for the non-decision component. Nevertheless, it has been shown thatthe misspecification of the non-decision time can severely distort the decision model param-eter estimates. In this paper, we propose an alternative approach to the estimation of choiceRT models that elegantly bypasses the specification of the non-decision time distribution bymeans of an unconventional convolution of data and decision model distributions (hence calledthe D*M approach). Once the decision model parameters have been estimated, it is possibleto compute a non-parametric estimate of the non-decision time distribution. The techniqueis tested on simulated data, and is shown to systematically remove traditional estimation biasrelated to misspecified non-decision time, even for a relatively small number of observations.The shape of the actual underlying non-decision time distribution can also be recovered. Next,the D*M approach is applied to a selection of existing diffusion model application papers. Forall of these studies, substantial quantitative differences with the original analyses are found. Forone study, these differences radically alter its final conclusions, underlining the importance ofour approach. Additionally, we find that strongly right skewed non-decision time distributionsare not at all uncommon.

Keywords: choice RT, diffusion model, bias, non-decision time, misspecification, D*M

Introduction

In the last decade, the use of evidence accumulationmodels for choice RT experiments has revealed fundamen-tal insights into the process of elementary decision making(Basten, Biele, Heekeren, & Fiebach, 2010; DasGupta, Fer-reira, & Miesenböck, 2014; Forstmann et al., 2008; Krajbich& Rangel, 2011; Polanía, Krajbich, Grueschow, & Ruff,2014; Ratcliff & Dongen, 2011; Resulaj, Kiani, Wolpert, &Shadlen, 2009). A common assumption in all these stud-

The research leading to the results reported in this paperwas sponsored in part by Belgian Federal Science Policy withinthe framework of the Interuniversity Attraction Poles program(IAP/P7/06), by the grant GOA/15/003 from the KU Leuven andby the grant G.0806.13 from the Fund of Scientific ResearchFlanders. For the computations, we used the infrastructure ofthe VSC–Flemish Supercomputer Center, funded by the HerculesFoundation and the Flemish Government– Department of Economy,Science, and Innovation (EWI). We would also like to thank MartijnMulder, Gilles Dutilh and Florian Schmitz for kindly sharing theirdata and Irene Gijbels for advice.

ies, is that choice RT can be additively decomposed into adecision time (taken up by the process responsible for theactual choice) and a residual non-decision time. The latter isconsidered to be the contribution of encoding and motor re-sponse execution (Luce, 1986). In contrast to the intricate ev-idence accumulation or diffusion models for the decision partof choice RT (Brown & Heathcote, 2008; Ratcliff & Rouder,1998; Usher & McClelland, 2001; Verdonck & Tuerlinckx,2014; Wong & Wang, 2006), the models traditionally usedfor the non-decision time are surprisingly basic: adding asimple time constant, or, at best, adding a simple paramet-ric distribution (mostly uniform, see Ratcliff & Tuerlinckx,2002). However, choosing the wrong model for the non-decision time is known to be a cause for bias for the deci-sion model’s parameter estimates, especially when data orig-inating from a skewed non-decision time distribution are fit-ted under the assumption of a non-skewed non-decision timedistribution (Ratcliff, 2013). Seeing a (right) skewed non-decision time distribution is at least as plausible as the cur-rent non-skewed default. Therefore, today’s golden standardfor the non-decision time may be systematically deliveringbiased results for the decision process parameters.

In this paper, we develop a novel estimation approach that

1

Page 3: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

2 VERDONCK AND TUERLINCKX

bypasses the specification of the non-decision time distribu-tion and allows us to estimate the decision model’s param-eters without any non-decision time misspecification bias.Moreover, if desired, a non-parametric estimate of the resid-ual non-decision time can be estimated.

Attempts to disentangle non-decision from decision timehave been proposed before (Smith, 1990), but from a differ-ent perspective: the non-decision time distribution was as-sumed to coincide with the distribution of simple (i.e., one-choice) response times, observed in a comparable experi-mental setup. This distribution was then deconvolved fromthe distribution of total choice response times (from the orig-inal two-choice version of the experiment) in order to isolatethe decision part. The resulting distributions could then bemodeled with a decision model. The approach we proposein this paper, does not use information of secondary experi-ments nor does it presume any particular parametric distribu-tion (shape) for the non-decision time distribution.

In what follows we will first explain the method, then il-lustrate its performance based on simulated data, and finallyuse it to re-analyze data from three different diffusion modelapplication papers.

The D*M method

Assume a two-choice RT experiment with I different con-ditions (i = 1, ..., I). These conditions can be stimuli ofdifferent difficulty (e.g., percentages coherently moving dotsin a random dots motion task) but also other manipulations(e.g., prior expectations). During the course of the exper-iment, there are several trials within a condition, and eachtrial can results in either a correct (c = 1) or an error choiceresponse (c = 0). For the sake of simplicity, we assumean equal number of N trials. Let us start with focusing onthe theoretical probability density function (pdf) of the totalresponse time for condition i and choice response c. For con-venience, we introduce an index p = 2i − c, running over all2I condition-response pairs ic: correct choice responses haveodd p, incorrect choice responses have even p. This total re-sponse time pdf can be seen as a convolution of a decisionpdf (denoted as mp(t; θ0)) and a non-decision pdf (denoted asr(t)):

fp(t) = (mp ∗ r)(t) =

ˆ ∞0

mp(t − x; θ0)r(x)dx,

where ∗ stands for the convolution operator and θ0 is the trueparameter vector of the decision pdf. Obviously, also fp(t)depends on θ0, but we suppress this dependence for simplic-ity. In what follows, we will denote the convolution productbetween densities m(t; θ0) and r(t) as m(θ0) ∗ r. Note alsothat the integral over t for fp(t) is equal to the probabilityof condition-response pair p, thus fp(t) is a defective pdf.However, because a choice response for condition i is either

Table 1List of symbols used.

Variable DescriptionI number of conditions (e.g., stimuli)p index used to indicate condition-response pairs

(0 < p ≤ 2I)mp (parametric) decision model distributionr non-decision time distributionfp total choice response time distributiongp kernel density estimate of fp

correct or incorrect,´ ∞

0 f2i−1(t)dt +´ ∞

0 f2i(t)dt = 1. For con-venience, an overview of the notation used in this paper isprovided in Table 1.

Because the convolution operator is commutative, for anytwo non-identical condition-response pairs p and p′, it holdsexactly that

r ∗ mp(θ0) ∗ mp′ (θ0) = r ∗ mp′ (θ0) ∗ mp(θ0)fp ∗ mp′ (θ0) = fp′ ∗ mp(θ0). (1)

This convolution identity is the fundament of our method:The non-decision time distribution is factored out.

The idea is to use Equation 1 to find an estimate of θ0,given observed choice RT samples from fp and fp′ on the onehand, and well-specified decision model pdfs mp and mp′ onthe other. The data enter the equality by approximating fp

and fp′ through nonparametric estimates based on the ob-servations, denoted as gp and gp′ . The estimated densities gp

and gp′ are kernel density estimators, with gp based on the Np

observed RTs t jp ( j = 1 . . .Np) for condition-response pair p:

gp(t) =1

Np

Np∑j=1

K

t − t jp

h

.As a smoothing kernel function K(t), we use a uniform dis-tribution from 0 to 1. The bandwidth is chosen to be h = 1.

When replacing fp and fp′ by gp and gp′ , the equality ofEquation 1 is no longer exact. However, the discrepancy (orobjective function)

Dpp′ (θ) = d(gp ∗ mp′ (θ), gp′ ∗ mp(θ)

), (2)

where d(·, ·) is a distance defined on the space of defectivepdfs, should be small for θ = θ0. Note that in Equation 2,a convolution appears between the data from condition-response pair p′, represented by the smoothed estimated den-sity gp′ , and the model’s density for condition-response pairp, that is mp(θ). Therefore we refer to our technique asthe D*M method. To evaluate this crucial convolution bothmodel and data densities are discretized using an equally

Page 4: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 3

spaced grid and the convolution integral is then approximatedby a finite sum over this grid.

In this paper we opt for a chi-square type of distance be-tween the compared distributions

d(a, b) =

ˆ ∞0

[a(t) − b(t)]2

a(t) + b(t)dt. (3)

For a conventional chi-square distance, where typically anobserved distribution is compared to a model pdf, the in-tegrand’s denominator is the pdf of the model. Because inEquation 2, both compared distributions are a convolution ofa data and a model pdf, we take the sum of the two compareddistributions for the integrand’s denominator in Equation 3.This distance is called the triangular discrimination (Topsoe,2000) and can be considered as a symmetrized version of thechi-square distance (a “chi-square like distance”; Le Cam,1986).

It is obvious that for gp → K∗ fp and gp′ → K∗ fp′ , Dpp′ (θ)reaches its minimal value of zero, when θ = θ0 (because ofEquation 1). In Appendix A it is shown that if the decisionmodel does not contain a non-decision component of its ownand the total response time pdfs fp underlying the data arenot all equal across condition-response pairs, this minimumis unique, at least for the parameters pertaining to mp or mp′ .In this sense, Dpp′ (θ) can be used to estimate the parameterspertaining to mp and mp′ without explicitly solving the un-derlying inverse convolution problem leading to an estimateof r(t). Because we are interested in the parameters pertain-ing to all condition-response pairs, we use the total objectivefunction

T (θ) =

2I∑p=2

p−1∑p′=1

Dpp′ (θ)

, (4)

where the sum runs over all possible unique combinationsof two different condition-response pairs. The estimated pa-rameter vector θ is then found as: θ = argminT (θ).

If the true parameter vector θ0 is the absolute and uniqueminimum of all separate terms Dpp′ (at least for all param-eters pertaining to mp and mp′ ), the same will hold for thesum T (θ) (for all parameters). For θ0, the non-decision pdfr(t) clearly exists and can be retrieved as the deconvolutionof each condition-response pair p’s total response time anddecision pdfs:

r = fp ∗ mp(θ0)−1, (5)

where −1 in the exponent refers to the deconvolution opera-tion.

However, for an arbitrary parameter vector θ, it is not surethat these deconvolutions exist. Conditions for the existenceof a deconvolution are described in Appendix B and it isshown that, up to a practical degree of accuracy when dealing

with noisy data, they boil down to the following condition onthe variances σ2:

σ2fp≥ σ2

mp(θ), (6)

basically avoiding a negative variance for r(t). Because weuse a smoothing kernel K to estimate fp, the kernel needs tobe taken into account and effectively we will use the equiva-lent constraint:

σ2K∗ fp≥ σ2

K∗mp(θ) = σ2mp(θ) + σ2

K ,

where the last step follows from the additive property of vari-ances of independent random variables. In terms of the data,the constraint is:

σ2gp≥ σ2

K∗mp(θ)= σ2

mp(θ)+ σ2

K . (7)

This condition has to be met for all condition-responsepairs p. Although these conditions have to be met for allcondition-response pairs p, the validity of each of these con-straints is only as good as the estimate σ2

gp. For this reason,

densities gp(t) based on only a few data points should not beincluded. If these constraints are imposed while minimizingT (θ), the existence of a common r(t) for all pairs p is en-sured. For the final estimation of r(t), we use the followingexpression:

r = r ∗

1I

2I∑p=1

mp(θ0)

∗1

I

2I∑p=1

mp(θ0)

−1

=

1I

2I∑p=1

r ∗ mp(θ0)

∗1

I

2I∑p=1

mp(θ0)

−1

=

1I

2I∑p=1

fp

∗1

I

2I∑p=1

mp(θ0)

−1

, (8)

where both sums runs over all 2I pairs p. Having obtainedan estimate θ of the true parameter values θ0, this results inan estimator

r =

1I

2I∑p=1

gp

∗1

I

2I∑p=1

K ∗ mp(θ)

−1

, (9)

where the same smoothing kernel K is used for both factorsof the deconvolution. The constraints in Equation 7 ensurethe existence of this deconvolution. If the deconvolution ex-ists, then clearly for gp → K ∗ fp, and therefore θ → θ0,r(t) → r(t). Note that the smoothing kernel K used in thisfinal step can differ from the kernel used in the minimiza-tion procedure, but has to be applied in a recalculation ofgp(t) as well (for the deconvolution, we will use a uniformdistribution from 0 to 0.01 for K). In practice, we solve thedeconvolution problem in Equation 9 by defining a grid (we

Page 5: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

4 VERDONCK AND TUERLINCKX

take a grid spacing of 0.01s) and minimizing the triangulardiscrimination distance (see Equation 3) between the data-based distribution

(1I∑2I

p=1 gp

)and the model-based distribu-

tion r ∗(

1I∑2I

p=1 K ∗ mp(θ))

with respect to the probabilityweights assigned to the grid points. We use the same globaloptimizer as before to tackle this high dimensional minimiza-tion problem: for a grid spacing of 0.01s and r clipped at1.5s, r consists of 150 grid points whose weights have to beestimated. We again use the triangular discrimination dis-tance for consistency, but for this estimation other distancesor could be used as well.

In the theory developed above, we have assumed that all2I condition-response pairs share the same non-decision timedistribution r. Note however that the total objective func-tion T (θ) in Equation 4 can be easily changed to allow formultiple non-decision time distributions, each shared withina subset (with 2 elements or more) of condition-responsepairs. This setup is the analogue of a traditional diffusionmodel analysis in which the non-decision time parameter Ter

is allowed to vary across (some of the) conditions, while theother parameters are constrained to be equal. In order to im-plement such a situation, it suffices to limit the double sumin Equation 4 to terms produced within subsets of condition-response pairs with the same non-decision time distribution.Each non-decision time distribution can then be estimatedseparately by limiting the sums in Equation 9 to the subset ofcondition-response pairs corresponding to that non-decisiontime distribution.

Performance on simulated data

In this section we show how well the D*M techniquerecovers a standard parameter set of the Ratcliff diffusionmodel (Ratcliff & Tuerlinckx, 2002), in conjunction withdifferent non-decision time distributions. The quality of theestimates is systematically compared to the results obtainedwith a standard approach in which it is assumed that the non-decision time is uniformly distributed.

Data simulation

We simulated data from a typical Ratcliff diffusion modelparameter set (Table 2) in conjunction with three differentnon-decision pdfs (see the righthandside figures of the threepanels in Figure 1): one right skewed, one uniform and onebimodal distribution. For all three parameter configurations,we look at data sets of 300, 1,000 and 1,000,000 RTs perstimulus. For each combination of sample size and non-decision time distribution, we simulated 100 data sets.

Standard and D*M estimation procedures

For the standard estimation procedure we used both thestandard maximum likelihood method and the quantile likeli-hood method (Heathcote & Brown, 2004; Heathcote, Brown,

Table 2Ratcliff diffusion model parameters used for the decision partof the simulated data.

Parameter Description Valuea boundary separation 0.08η inter-trial variability 0.08

of drift ratezr relative bias 0.5sz uniform inter-trial 0.02

variability of bias z = a · zrvi (i = 1, . . . , 4) drift rates [0.4, 0.25, 0.1, 0]

& Mewhort, 2002). As the results are very comparable, weonly show results of the latter. As is customary, a uniformdistribution for the non-decision time distribution was usedand the location and spread of this uniform distribution wasestimated alongside the pure decision model parameters.

In the D*M procedure no non-decision pdf has to (norshould) be included. To evaluate the convolution integral, wechoose a grid ranging from 0 to 5s (well beyond any observedRT) with equally spaced nodes at every 0.01s.

For both methods (standard and D*M), the model densitym(θ) needs to be calculated. To do this, we use code fromthe fast-dm project (Voss & Voss, 2007). Also in both meth-ods, an objective function has to be minimized (the negativequantile likelihood for the standard method and Equation 4for D*M). To find this minimum, we use a global optimizer(i.e., differential evolution; Storn & Price, 1997) .

Results

Figure 1 shows the estimates for the boundary separationa (results for the other parameters are offered as supplemen-tal material), in conjunction with three different shapes ofnon-decision time densities. The traditional approach resultsin a systematic estimation bias, except for the uniform non-decision pdf, in which case the non-decision model is per-fectly specified for the data. This systematic bias is resolvedwith the D*M method, which gives better estimates (or com-parable in the unlikely case that the actual non-decision timedensity is a uniform distribution), even for a number of datapoints as low as 300 per condition. In addition, the aver-age recovered non-decision time densities match remarkablywell with the true non-decision time densities.

As an additional test, we have repeated this analysis fordifferent values of boundary separation a. Increasing valuesof a correspond to a larger proportion of the total responsetime (and its variance) being accounted for by the decisionmodel (keeping everything else constant). The results arepresented in Figure 2. For the right skewed non-decisiontime distribution, the D*M method always outperforms (ormatches) the traditional method, regardless of the value ofa or the number of observations. For the uniform and bi-

Page 6: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 5

0.07

0.08

0.09

0.1

a

N=106 N=103 N=300

RT (s)0.2 0.4 0.6

dens

ity0

2

4

6

8non-decision pdf

(a) right skewed non-decision pdf

0.075

0.08

0.085

0.09

0.095

a

N=106 N=103 N=300

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

15

non-decision pdf

(b) uniform non-decision pdf

0.075

0.08

0.085

0.09

a

N=106 N=103 N=300

RT (s)0 0.5

dens

ity

0

2

4

6

non-decision pdf

(c) bimodal non-decision pdfFigure 1. In the left panels, the recovery of the Ratcliff diffu-sion model’s boundary separation parameter a is shown, fordata simulated with either a right skewed non-decision pdf(a), a uniform non-decision pdf (b) or a bimodal combina-tion of both (c). Estimates are shown for data sets with sub-sequently N = 106, N = 103 and N = 300 observations percondition. The orange box plots show the distributions of theestimates obtained with the traditional method, the blue boxplots those obtained with the D*M method. (The recoveryresults for the other diffusion model’s parameters yield verysimilar conclusions and are offered as supplemental mate-rial.) In the right panels, we show for each non-decision timedensity, the average of the non-parametric estimates basedon the D*M decision model parameter estimates as given byEquation 9. The average is represented by solid lines varyingfrom light to dark with the increasing number of observationsand compared to the original shape (dashed black line).

a0.04 0.06 0.08 0.1 0.12

bias

0

0.02

0.04

(a) right skewed non-decision pdf

a0.04 0.06 0.08 0.1 0.12

bias

0

0.02

0.04

(b) uniform non-decision pdf

a0.04 0.06 0.08 0.1 0.12

bias

0

0.02

0.04

(c) bimodal non-decision pdfFigure 2. Estimation biases of boundary separation a, forvarying a. The same non-decision time distributions wereused as in Figure 1: a right skewed non-decision pdf (a), auniform non-decision pdf (b) or a bimodal combination ofboth (c). Orange lines correspond to traditional estimates,blue lines to D*M estimates. The estimates are based ondata sets with subsequently N = 106 (solid lines), N = 103

(dashed lines) and N = 300 (dotted-dashed lines) observa-tions per condition. If the number of trials is high (solidlines), the D*M method always outperforms (or matches) thetraditional method, no matter what the value is of a. For lessdata (dashed lines, dotted-dashed lines), D*M still performsbetter for normal and high values of a. For a small amount ofdata and small values of a, however, and if the non-decisiontime distribution is more in line with the traditional uniformassumption like it is in (b) and (c), the D*M method shows ahigher bias than the traditional method. Similar patterns arefound for the other parameters.

Page 7: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

6 VERDONCK AND TUERLINCKX

modal non-decision time distributions, D*M still performsbetter or equal for normal and high values of a, but can havehigher biases for small values of a, if there is a limited num-ber of observations. This is what one would expect: As theproportion of total response time variance generated by thenon-decision time process gets larger (the non-decision timedistribution proportions of total variance at a = 0.04 are0.91,0.47 and 0.87, for right-skewed, uniform and bimodal,respectively), the decision part becomes increasingly obfus-cated by the non-decision part. As the D*M method onlyuses a parametric model for the decision part, more obser-vations are needed to correctly disentangle the two compo-nents. Thus, if (1) the uniform distribution is a good enoughproxy for the actual non-decision time distribution, (2) theactual non-decision time distribution contributes enough tothe total response time and (3) there is not a lot of data, thetraditional method can outperform the D*M method. It hasto be noted however that, unless we have some prior infor-mation about the real shape of the non-decision time dis-tribution, the first condition can never be checked in a realworld problem. If one wants to avoid results that depend onany particular presupposed shape of the non-decision timedistribution, the D*M approach is by definition the betterchoice. Figure 3 shows the non-parametric estimates of thenon-decision time distributions following the D*M estimatesshown in Figure 2. Only for the smallest boundary sepa-ration a = 0.04 and a limited number of observations, thenon-parametric estimates of the non-decision time distribu-tion is somewhat biased, but even then the main features ofthe distributions are recovered.

For the estimation of the decision model parameters, therepeated calculation of the complete decision model pdfs ona reasonably detailed grid is by far the most computationallyexpensive part for both the traditional and the D*M method.In our implementation, a single evaluation of the D*M ortraditional objective function takes up a comparable amountof time. We opted for a global minimization routine to re-duce potential problems concerning local minima, but thisresulted in optimization times that are probably a lot slowerthan strictly necessary. Running single-threaded on an i7core clocked at 3.60GHz, both a single D*M estimate andtraditional estimate take around 5 minutes to complete. Theestimation of the non-decision time (once D*M parameterestimates have been obtained) takes about 30 minutes for agrid with 0.01s spacing but only about 15 seconds for a gridwith 0.05s spacing. To be confident we reached convergence,we repeated every estimation multiple times (with a differentpopulation of starting values). The D*M estimates seemedsomewhat more robust than the traditional estimates: for theD*M objective function almost every repetition resulted inthe same global minimum; for the traditional objective func-tion, ending up in a local minimum was a bit more frequent.By repeating the minimization procedure multiple times (5

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

N=300

RT (s)0.2 0.4 0.6

dens

ity

0

2

4

6

8

N=1000

(a) right skewed non-decision pdf

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

15

N=300

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

15

N=1000

(b) uniform non-decision pdf

RT (s)0 0.5

dens

ity

0

5

10

15N=300

RT (s)0 0.5

dens

ity0

5

10

N=1000

(c) bimodal non-decision pdfFigure 3. Recovery of the respective non-decision time dis-tributions based on data sets generated with varying bound-ary separation a, with N = 300 (left) and N = 1000 (right)observations per condition. Red lines indicate the non-decision time distributions obtained with the lowest value ofboundary separation a = 0.04; the rest of the non-decisiontime distributions (with a somewhere between 0.06 and 0.12)are in black.

times for D*M and 10 times for the traditional method), werepeatedly found a lowest minimum and were convinced ofconvergence. An alternative for the global optimizer may bea local optimizer in combination with a rational starting point(e.g., using EZ diffusion, Wagenmakers, Maas, & Grasman,2007). Such a routine may yield equally good results andwill be much faster. However, because of the novelty of theD*M method, we were more concerned with accuracy (i.e.,avoiding local optima) than with speed.

Page 8: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 7

Existing diffusion model analyses revisited

To illustrate how the D*M parameter estimation methodcan lead to fundamentally different findings than the tradi-tional method, we re-analyze the data of three choice RTstudies in which a diffusion model is used. These paperscan be seen as typical examples of today’s common practiceof using diffusion models parameters to explain differencesbetween (groups of) participants and/or experimental condi-tions. We only present the main results; more detailed resultsare offered as supplemental material.

Application 1: A diffusion model analysis of choice bias(Mulder, Wagenmakers, Ratcliff, Boekel, & Forstmann,2012)

Mulder et al. (2012) investigate, using a moving-dot per-ceptual decision making experiment, how different types ofpre-trial information biases people towards one of two choicealternatives. One type of pre-trial information concerns theelevated prior likelihood of the occurrence of one of twochoices, the other type involves a larger potential pay-off forone of the two choices. The diffusion model is used to studywhich decision process aspects are affected by each form ofpre-trial information (elevated prior likelihood or larger pay-off, each requiring a separate diffusion model analysis): Onepossibility is that the bias is caused by a shift ∆ν in the dif-fusion model’s drift rate parameter, the other possibility isthat the bias is caused by a shift ∆zr in the starting positionparameter. The results are shown in Figure 4.

The authors concluded that it is mainly the starting po-sition parameter that is responsible for the bias introducedby both elevated prior likelihood and larger potential pay-off

conditions. Upon re-analysis we come to the same quali-tative conclusion, for both the traditional and D*M proce-dure. Quantitatively, however, there are clear differences be-tween the D*M and the traditional estimates. From Figure 4,it can be seen that the boundary separation parameter a issystematically lower for the D*M estimates and the result-ing between-person variance (as indicated by the box plots)is considerably smaller than that of the traditional estimates.Clearly, the assumption of a uniform non-decision time den-sity is not valid; instead, the non-decision time density esti-mates indicate a strong right skew. However, in this case, thismisspecification does not yield a different qualitative conclu-sion.

Application 2: A diffusion model analysis of post-errorslowing (Dutilh et al., 2011)

A well documented observation in choice RT experimentsis that the response time increases on a trial immediatelyfollowing an incorrect choice. This phenomenon is calledpost-error slowing (PES). Dutilh et al. (2011) have attemptedto isolate the diffusion model parameters that can account

0.1

0.12

0.14

0.16

0.18

a

-0.1

0

0.1

0.2

0.3

∆zr∆v

(a) traditional

0.1

0.12

0.14

0.16

0.18

a

-0.1

0

0.1

0.2

0.3

∆zr ∆v

(b) D*M

RT (s)0 0.5 1 1.5

0.2

0

0.2 prior r(t)

reward r(t)

(c) non-decision pdfs

prior r(t) (s)0.5 1

rew

ard

r(t)

(s)

0.4

0.6

0.8

1

(d) Q-Q plot non-decision pdfsFigure 4. Application 1: A selection of parameter estimates(boundary separation a and effects of bias on starting point∆zr and drift rate ∆v ) of a diffusion model analysis of choicebias for 20 participants (for a complete parameter overviewplease see supplemental material). The first two panels showthe estimates of a selection of decision model parameters,obtained with either the traditional (a, in orange) or D*Mmethod (b, in blue). For each of the two panels, the darkerbox plots show the estimates for the elevated prior likelihoodcondition, the lighter plots for the larger potential pay-off

condition. If for the group of participants ∆zr or ∆v is signif-icantly different from 0 (two-sided sign test), this is indicatedwith a green (p < 0.001) or light green (p < 0.01) markingof the label. Panel (c) shows the non-decision time densitiesinferred from the D*M estimates in panel (b). The densitiesfor the elevated likelihood condition are shown in the up-per half of the plot and those for the larger potential pay-off

condition are shown, mirrored, in the lower half. The solidlines show the mean non-decision pdfs across participants,the lighter areas display the double standard error interval.Panel (d) is a quantile-quantile plot of the data in panel (c),and is better suited to look at the differences between thenon-decision pdfs from the two conditions. The grey arearepresents a 95% confidence interval of the mean quantile-quantile values (black crosses).

Page 9: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

8 VERDONCK AND TUERLINCKX

for the differences between post-correct and post-error tri-als of a lexical decision task. Participants have to classifya string of letters as a word or non-word. Besides the ob-vious word vs. non-word manipulation, there are six dif-ferent word frequency types. In their diffusion model anal-yses, the authors estimate a separate non-decision parame-ter Ter for every stimulus condition. Analogously, the D*Mmethod is set-up to allow for a separate non-decision timedistribution for each stimulus condition (grouping choice RTdistributions that share a non-decision time distribution) asexplained in the last paragraph of the theory section. Theresults are shown in Figure 5. Based on their analysis, whichinvolves the estimation of two separate diffusion models (onefor the post-error and one for the post-correct condition),the authors concluded that post-error slowing is very muchassociated with an increase in response caution. A similaranalysis with the D*M method, however, shows no associ-ation between PES and response caution, but reveals otherassociations, namely with ηW and most word drift rates vi

(i = 2, . . . , 6). More specifically, after an error, the driftrates for words become smaller (i.e., closer to zero) and thetrial-to-trial variability of word drift rates also decreases. Thereason for the difference between the original and the D*Manalysis lies with the specification of the non-decision pdf.The authors allow a different mean non-decision time for allstimuli, but assume an equal, uniform width. As can be seenin Figure 5, panel (d), the D*M method suggests both meanand variance of the non-decision time increase after an errortrial when judging a word stimulus (shown in black in panel(d) in Figure 5), but the non-decision time does not changeafter an error trial when judging a non-word stimulus (shownin red in panel (d) in Figure 5). These effects could not beaccommodated by the author’s particular specification of thenon-decision time distribution, so other parameters had tocompensate, with different results and conclusions as a con-sequence. Based on our analysis, we have to conclude that inthe context of this particular diffusion model, there is a fun-damental difference in the processing of word and non-wordstimuli following an error trial.

As for all our re-analyses, we rigorously implementedthe model assumptions of the original paper. However, it istempting to wonder if just allowing the width of the uniformnon-decision time distribution to vary across stimuli in thetraditional approach, would be sufficient to get traditional re-sults comparable to the D*M analysis. To answer this ques-tion, we performed the extra analysis and a similar picture asFigure 5, panel (a) was obtained (see supplemental material),showing that the absence of skew in the non-decision timedistribution specification is an essential component of themisspecification. As a possibly viable traditional alternativeto our method, one could now suggest to have mean, skewand width as parameters of some new non-decision specifi-cation, but this would mean estimating 35 parameters, (21

0

0.02

0.04

a

-0.2

0

0.2

0.4

ηWη

NW

-0.2

0

0.2

0.4

v1

v2

v3

v4

v5

v6

vNW

(a) traditional

0

0.05

0.1

a

-1

-0.5

0

0.5

1

ηWη

NW

-1

0

1

v1

v2

v3

v4

v5

v6v

NW

(b) D*M

RT (s)0 0.5 1 1.5

0.2

0

0.2

post-correct r(t)

post-error r(t)

(c) non-decision pdfs

post-correct r(t) (s)0.4 0.6 0.8

post

-err

or r

(t)

(s)

0.4

0.6

0.8

(d) Q-Q plot non-decision pdfsFigure 5. Application 2: A selection of estimation results of a dif-fusion model analysis of post-error slowing for 39 participants (fora complete parameter overview please see supplemental material).Panel (a) shows the within-person differences in traditional param-eter estimates between post-error and post-correct conditions forboundary separation a, inter-trial variability in drift rate for wordsand non-words (ηW and ηNW ), and the drift rates for six differentword types and non-words (v1, ..., v6 and vNW ). Statistically signif-icant effects for the differences (two-sided sign test) are indicatedwith a green/red (p < 0.001) or light green/light red (p < 0.01)marking of the label (green means a positive effect or a largervalue post-error compared to post-correct, red a negative effect ora smaller value post-error compared to post-correct). Panel (b) isthe D*M version of panel (a). Panel (c) shows the participant aver-aged non-decision time densities inferred from the D*M estimates,separately for all six different word types (black) and non-words(red). The non-decision time densities of the post-correct conditionare shown in the upper half, those of the post-error condition areshown, mirrored, in the lower half. Panel (d) is a quantile-quantileplot of the data in panel (c), and is better suited to look at the differ-ences between the non-decision time densities from the two condi-tions. (Because of the many non-decision time pdfs, no confidenceintervals are shown in panels (c) and (d).)

Page 10: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 9

non-decision parameters on top of the 14 decision model pa-rameters), while D*M can handle the problem with only the14 decision model parameters. Also, there is no guaranteethat this alternative parametric non-decision time distributionwill (always) suffice.

Application 3: A diffusion model analysis of task switch-ing costs (Schmitz & Voss, 2012)

Schmitz and Voss (2012) investigate which diffusionmodel parameters can best explain task-switching costs. Welimit our analyses to the first experiment in the paper, whichcompares task-switching and task-repeating trials from aclassical alternating runs paradigm to each other and to puretask trials (no task-switching within an experimental block).In these analyses, three separate diffusion models are esti-mated (with both methods): one for the pure task trials, onefor the task-repeating trails and one for the task-switchingtrials. Partial results are shown in Figure 6. The main qual-itative findings of the original paper pertaining to this ex-periment, namely a positive change in boundary separationa (or caution) and a negative change in drift rate betweentask-switching and pure task trials, were confirmed by ourre-analysis, for both the traditional and the D*M method.Participants are more cautious in the task switch conditionand at the same time, they process the information less well.Quantitatively, the parameter estimates again differ consider-ably from the original study. It can also be seen that the non-decision time distribution in the task switching condition hasa much more outspoken skew to the right (compared to thepure task non-decision time pdf).

Discussion

Traditional parameter estimates of decision models tochoice RT data, have been shown to be vulnerable to themisspecification of the extra non-decision component (Rat-cliff, 2013). In this paper, we have proposed a solution to thisproblem. By means of a handy convolution between data anddecision model distributions, hence called the D*M method,we were able to factor out the non-decision time distributionfrom the estimation procedure. Through theory and a sim-ulation study, the method was shown to remediate the tradi-tional bias related to the misspecification of the non-decisiontime distribution, even for a limited number of data points.Additionally, the actually simulated non-decision time distri-butions could systematically be recovered. To illustrate themethod’s relevance, we applied it to three existing diffusionmodel application papers. For all studies, we found substan-tial differences with the traditional parameter estimates; inone case, using the D*M method radically altered the con-clusions of the original paper, clearly demonstrating the ne-cessity of the method.

In two of the three applications we re-analyzed, non-parametric estimates of the non-decision time distributions

-0.1

0

0.1

0.2

a

-1

-0.5

0

v

(a) traditional

-0.1

0

0.1

0.2

a

-1

-0.5

0

v

(b) D*M

RT (s)0 0.5 1

0.4

0.2

0

0.2

0.4 pure task r(t)

task-swiching r(t)

(c) non-decision pdfs

pure task r(t) (s)0.4 0.6

task

-sw

itchi

ng r

(t)

(s)

0.3

0.4

0.5

0.6

(d) Q-Q plot non-decision pdfsFigure 6. Application 3: A selection of estimation resultsof a diffusion model analysis of task switching costs for 24participants (for a complete parameter overview please seesupplemental material). Panel (a) shows the within-persondifferences in traditional parameter estimates between task-switching trials and pure task blocks for boundary separationa and drift rate v. Statistically significant effects for the dif-ferences (two-sided sign test) are indicated with a green/red(p < 0.001) or light green/ light red (p < 0.01) indicator(green means a larger parameter value in the task switch con-dition compared a pure task condition, red means smallervalues). Panel (b) is the D*M version of panel (a). Panel(c) shows the participant averaged non-decision time densi-ties inferred from the D*M estimates in panel (b). The non-decision time densities of the pure task condition are shownin the upper half of the plot, those of the task-switching trialsare shown, mirrored, in the lower half. The solid lines showthe mean non-decision pdfs across participants, the lighterareas display the double standard error interval. Panel (d) isa quantile-quantile plot of the data in panel (c), and is bet-ter suited to look at the differences between the non-decisionpdfs from the two conditions. The grey area represents a95% confidence interval of the mean quantile-quantile values(black crosses).

Page 11: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

10 VERDONCK AND TUERLINCKX

revealed a clear right skew (application 1 and 2). In oursimulation study we have shown that ignoring such a rightskew may cause severe biases in the diffusion model param-eter estimates. Therefore, in the absence of a good model fornon-decision time, we strongly advise the use of the D*Mmethod for estimating choice RT models. Additionally, thenon-parametric estimates of the resulting non-decision timedistributions give a detailed picture of what non-decisiontime can look like (assuming the decision model itself iswell-specified that is), which could in turn be used as a guideor template for constructing more intricate models for non-decision time.

References

Basten, U., Biele, G., Heekeren, H. R., & Fiebach, C. J.(2010, December). How the brain integrates costsand benefits during decision making. Proceedings ofthe National Academy of Sciences, 107(50), 21767–21772. doi: 10.1073/pnas.0908104107

Brown, S. D., & Heathcote, A. (2008, November). The sim-plest complete model of choice response time: Linearballistic accumulation. Cognitive Psychology, 57(3),153–178. doi: 10.1016/j.cogpsych.2007.12.002

Cressie, N., & Holland, P. W. (1983, March). Char-acterizing the manifest probabilities of latent traitmodels. Psychometrika, 48(1), 129–141. doi:10.1007/BF02314681

DasGupta, S., Ferreira, C. H., & Miesenböck, G. (2014,May). FoxP influences the speed and accuracy of a per-ceptual decision in Drosophila. Science, 344(6186),901–904. doi: 10.1126/science.1252114

Dutilh, G., Vandekerckhove, J., Forstmann, B. U., Keuleers,E., Brysbaert, M., & Wagenmakers, E.-J. (2011,November). Testing theories of post-error slowing. At-tention, Perception, & Psychophysics, 74(2), 454–465.doi: 10.3758/s13414-011-0243-2

Feller, W. (1971). An Introduction to Probability Theory andIts Applications. John Wiley & Sons.

Forstmann, B. U., Dutilh, G., Brown, S., Neumann, J., Cra-mon, D. Y. v., Ridderinkhof, K. R., & Wagenmakers,E.-J. (2008, November). Striatum and pre-SMA facil-itate decision-making under time pressure. Proceed-ings of the National Academy of Sciences, 105(45),17538–17542. doi: 10.1073/pnas.0805903105

Heathcote, A., & Brown, S. (2004, June). Reply to Speck-man and Rouder: A theoretical basis for QML. Psy-chonomic Bulletin & Review, 11(3), 577–578. doi:10.3758/BF03196614

Heathcote, A., Brown, S., & Mewhort, D. J. K. (2002, June).Quantile maximum likelihood estimation of responsetime distributions. Psychonomic Bulletin & Review,9(2), 394–401. doi: 10.3758/BF03196299

Karlin, S., & Studden, W. J. (1966). Tchebycheff sys-tems: with applications in analysis and statistics. In-terscience Publishers.

Krajbich, I., & Rangel, A. (2011, August). Mul-tialternative drift-diffusion model predicts the re-lationship between visual fixations and choice invalue-based decisions. Proceedings of the NationalAcademy of Sciences, 108(33), 13852–13857. doi:10.1073/pnas.1101328108

Kreyszig, E. (2010). Advanced Engineering Mathematics.John Wiley & Sons.

Le Cam, L. (1986). Asymptotic Methods in Statistical Deci-sion Theory. New York, NY: Springer New York.

Luce, R. D. (1986). Response times: Their role in inferringelementary mental organization. New York: OxfordUniversity Press.

Mulder, M. J., Wagenmakers, E.-J., Ratcliff, R., Boekel, W.,& Forstmann, B. U. (2012, February). Bias in thebrain: A diffusion model analysis of prior probabil-ity and potential payoff. The Journal of Neuroscience,32(7), 2335–2343. doi: 10.1523/JNEUROSCI.4156-11.2012

Nobile, A. G., Ricciardi, L. M., & Sacerdote, L. (1985,September). Exponential Trends of First-Passage-Time Densities for a Class of Diffusion Processes withSteady-State Distribution. Journal of Applied Proba-bility, 22(3), 611–618. doi: 10.2307/3213864

Polanía, R., Krajbich, I., Grueschow, M., & Ruff, C. C.(2014, May). Neural oscillations and synchroniza-tion differentially support evidence accumulation inperceptual and value-based decision making. Neuron,82(3), 709–720. doi: 10.1016/j.neuron.2014.03.014

Ratcliff, R. (2013, January). Parameter variabil-ity and distributional assumptions in the diffusionmodel. Psychological Review, 120(1), 281–292. doi:10.1037/a0030775

Ratcliff, R., & Dongen, H. P. A. V. (2011, July). Diffu-sion model for one-choice reaction-time tasks and thecognitive effects of sleep deprivation. Proceedings ofthe National Academy of Sciences, 108(27), 11285–11290. doi: 10.1073/pnas.1100483108

Ratcliff, R., & Rouder, J. N. (1998). Modeling ResponseTimes for Two-Choice Decisions. Psychological Sci-ence, 9(5), 347 –356. doi: 10.1111/1467-9280.00067

Ratcliff, R., & Tuerlinckx, F. (2002, September). Estimat-ing parameters of the diffusion model: Approaches todealing with contaminant reaction times and parame-ter variability. Psychonomic Bulletin & Review, 9(3),438–481. doi: 10.3758/BF03196302

Resulaj, A., Kiani, R., Wolpert, D. M., & Shadlen,M. N. (2009, September). Changes of mind indecision-making. Nature, 461(7261), 263–266. doi:10.1038/nature08275

Page 12: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 11

Schmitz, F., & Voss, A. (2012). Decomposing task-switchingcosts with the diffusion model. Journal of Experimen-tal Psychology: Human Perception and Performance,38(1), 222–250. doi: 10.1037/a0026003

Smith, P. L. (1990). Obtaining meaningful results fromFourier deconvolution of reaction time data. Psycho-logical Bulletin, 108(3), 533–550. doi: 10.1037/0033-2909.108.3.533

Storn, R., & Price, K. (1997). Differential evolution–a sim-ple and efficient heuristic for global optimization overcontinuous spaces. Journal of Global Optimization,11(4), 341–359.

Topsoe, F. (2000, July). Some inequalities for informa-tion divergence and related measures of discrimina-tion. IEEE Transactions on Information Theory, 46(4),1602–1609. doi: 10.1109/18.850703

Usher, M., & McClelland, J. L. (2001). The time course ofperceptual choice: The leaky, competing accumulatormodel. Psychological Review, 108(3), 550–592. doi:10.1037/0033-295X.108.3.550

Verdonck, S., & Tuerlinckx, F. (2014). The Ising Deci-sion Maker: A binary stochastic network for choiceresponse time. Psychological Review, 121(3), 422–462. doi: 10.1037/a0037012

Voss, A., & Voss, J. (2007, November). Fast-dm: Afree program for efficient diffusion model analysis.Behavior Research Methods, 39(4), 767–775. doi:10.3758/BF03192967

Wagenmakers, E.-J., Maas, H. L. J. V. D., & Grasman,R. P. P. P. (2007, February). An EZ-diffusion modelfor response time and accuracy. Psychonomic Bulletin& Review, 14(1), 3–22. doi: 10.3758/BF03194023

Wong, K.-F., & Wang, X.-J. (2006, January). A RecurrentNetwork Mechanism of Time Integration in PerceptualDecisions. The Journal of Neuroscience, 26(4), 1314–1328. doi: 10.1523/JNEUROSCI.3733-05.2006

Appendix AUnique global optimum

In this appendix, we show under which conditions the ob-jective function Dpp′ (θ) of Equation 2, for gp → K ∗ fp andgp′ → K∗ fp′ , has a unique global optimum at the true param-eter vector θ0. We will start from the reasonable assumptionthat the decision model does not contain a non-decision com-ponent of its own at the true parameter vector θ0, or, morespecifically, that for the true parameter vector θ0, there existsno distribution u(t) and parameter vector θu , θ0 such thatfor every p:

mp(θ0) = mp(θu) ∗ u. (10)

Imagine for a moment that such a u(t) and θu , θ0 do exist.The fundamental Equation 1 may then be transformed as (for

every p, p′):

fp ∗ mp′ (θ0) = fp′ ∗ mp(θ0)fp ∗ mp′ (θu) ∗ u = fp′ ∗ mp(θu) ∗ u

fp ∗ mp′ (θu) = fp′ ∗ mp(θu)mp(θ0) ∗ r ∗ mp′ (θu) = mp′ (θ0) ∗ r ∗ mp(θu)

mp(θ0) ∗ mp′ (θu) = mp′ (θ0) ∗ mp(θu),

where we make use of the fact convolution is a commutativeoperator and that all factors have a Laplace transform (seebelow in section A.2 for more information) such that theycan be safely deconvolved from both sides. Thus, the finalidentity is a direct consequence of the presence of a non-decision component inside the decision model at θ0. Notethat another way this final identity can be reached, is whenfor all p, p′: mp(θ0) = mp′ (θ0) and, as a consequence ofEquation 10, mp(θu) = mp′ (θu). This would mean that alltotal response time pdfs fp would be equal across condition-response pairs. Now we can move to the actual proposition.

Proposition

If the decision model itself does not contain a com-mon non-decision component that can be factored out, ormore specifically, if there does not exist a parameter θu , θ0such that for every p, p′:

mp(θ0) ∗ mp′ (θu) = mp′ (θ0) ∗ mp(θu), (11)

the minimum of

Dpp′ (θ) = d(gp ∗ mp′ (θ), gp′ ∗ mp(θ)

),

for gp → K ∗ fp and gp′ → K ∗ fp′ at θ = θ0 is unique.

Proof

Assume there does exist another global minimum atθu , θ0 for which

Dpp′ (θu) = d(gp ∗ mp′ (θu), gp′ ∗ mp(θu)

)= 0

for gp → K ∗ fp and gp′ → K ∗ fp′ .Then, for every p, p′

K ∗ fp ∗ mp′ (θu) = K ∗ fp′ ∗ mp(θu)K ∗ r ∗ mp(θ0) ∗ mp′ (θu) = K ∗ r ∗ mp′ (θ0) ∗ mp(θu)

Because every factor of this equation has a Laplace trans-form (see below in Appendix B for more information), wecan safely deconvolve K and r from both sides:

mp(θ0) ∗ mp′ (θu) = mp′ (θ0) ∗ mp(θu) (12)

This is in contradiction with the initial assumption ofEquation 11, which states that there is no such parameterθu , θ0. This proves the proposition.

Page 13: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

12 VERDONCK AND TUERLINCKX

Appendix BConditions for the existence of a deconvolution

In order to formulate conditions for the existence of thedeconvolution in Equation 5, we make use of cumulant-generating functions. The cumulant-generating function[G(d)](s) of a probability density function d(t) is defined as

[G(d)](s) = log

0

estd(t)dt

.Thus, the cumulant-generating function is the logarithm ofthe moment generating function. Because the range of in-tegration is the positive half of the real line, we can writethe cumulant-generating function also as the logarithm of theLaplace transform Feller (1971) evaluated at −s. Because ofthis link with the Laplace transform, we can use the existenceconditions for the Laplace transform (Kreyszig, 2010). It fol-lows that for a probability density function d(t) that is piece-wise continuous and of exponential order, i.e. d(t) ≤ Me−kt

with M and k real positive numbers, the cumulant generatingfunction [G(d)](s) exists, at least for s < k.1 When we donot need the argument s, we will also write the cumulant-generating function in short as G(d).

The cumulant-generating functionG(d) is determinedby the coefficients κd

n (n > 0) of its MacLaurin expansion, andthese are called the cumulants:

[G(d)](s) =

∞∑n=1

κdn

sn

n!.

Note that κd2 is the variance of d(t), denoted as σ2

d in the maintext.

The cumulant-generating function is a convenienttool when dealing with convolutions because it treats themadditively. Consider another pdf e(t), with cumulant generat-ing function G(e), then

G(d ∗ e) = G(d) + G(e).

Obviously, the additivity of the cumulant-generating func-tion is transferred to the cumulant coefficients:

κd∗en = κd

n + κen,

which holds for all n (n > 0).In our approach, we assume that for both the model

mp(t; θ0) and the non-decision time distribution r(t), thecorresponding cumulant-generating function exists. Mostchoice RT models are diffusion models and have an exponen-tial decay (Nobile, Ricciardi, & Sacerdote, 1985), therebysatisfying the exponential order condition. In these cases,mp(t; θ0) always has a corresponding cumulant-generatingfunction [G(m)](s; θ0) or G(m; θ0) in short. For the unknownnon-decision time distribution r(t) we will simply assumethe existence of a cumulant generating function G(r) imply-ing the existence of a cumulant-generating function G( fp) aswell.

Because the cumulants, and therefore also thecumulant-generating function, can be expressed in termsof moments, sufficient conditions for a valid cumulant-generating function can be seen as sufficient conditions fora set of moments to lead to a valid pdf. The latter condi-tions, for a non-negative random variable, are a well knownset of inequalities (Cressie & Holland, 1983; Karlin & Stud-den, 1966). More specifically, given a set of J + 1 numbersµ = (µ0, µ1, . . . , µJ), if

∆2k = det

µ0 µ1 · · · µk

µ1 µ2 · · · µk+1...

......

µk µk+1 · · · µ2k

≥ 0

(for 2k ≤ J) and

∆2k+1 = det

µ1 µ2 · · · µk+1µ2 µ3 · · · µk+2...

......

µk+1 µk+2 · · · µ2k+1

≥ 0

(for 2k + 1 ≤ J), there exists a density function d(t), so that´ ∞0 t jd(t)dt = µ j for j = 0, .., J.

For the first three inequalities (i.e., J = 2) this means:

∆0 = µ0 ≥ 0

∆1 = µ1 ≥ 0

∆2 = µ0µ2 − µ1µ1 = µ2 − µ21 = κ2 ≥ 0.

The first two inequalities are trivial and always fulfilled inthis context. The third inequality reduces to a constraint onthe second order cumulants that are being subtracted:

κr2 = κ

fp

2 − κmp

2 ≥ 0,

or

κfp

2 ≥ κmp

2 .

In terms of variances σ2 (the terminology and notation usedin the main text), this inequality becomes:

σ2fp≥ σ2

mp.

1The importance of the exponential order condition canbe shown as follows: [G(d)](s) = log

(´ ∞0 estd(t)dt

)≤

log(´ ∞

0 est Me−ktdt)

= log(M´ ∞

0 e−(k−s)tdt)

= log M − log(k − s),which shows that the integral exists for all s < k.

Page 14: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 13

More intricate constraints apply to higher moments and cu-mulants. For our purposes, however, the higher order cri-teria are ignored during the minimization process, as theirestimates become increasingly volatile for a finite number ofdata points.

Page 15: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

Factoring out non-decision time in choice RT data: supplementalinformation

Stijn Verdonck and Francis TuerlinckxFaculty of Psychology and Educational Sciences

KU Leuven, University of Leuven

Page 16: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 2

Simulation study

For the details on the simulation study, we refer to the main text. In Figures 1,2,3 below we present therecovery results of all diffusion model parameters as listed in Table 1. In each figure we show the results for a singlenon-decision pdf scenario’s, and three different sample sizes.

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2

η sz

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0.2 0.4 0.6

dens

ity

0

2

4

6

(a) recovery diffusion model parameters for 106 observations per condition

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2

η sz

4

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0.2 0.4 0.6

dens

ity

0

2

4

6

(b) recovery diffusion model parameters for 1000 observations per condition

0.07

0.08

0.09

0.1 1

a1 0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2

η sz

8

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

(c) recovery diffusion model parameters for 300 observations per conditionFigure 1 . Recovery results for a right skewed non-decision pdf scenario. The three panels (a), (b), and (c) refer tothree different sample sizes: 106, 1000 and 300 per condition. Each panel shows the estimates of the diffusion modelparameters common to both the classical analysis (orange) and the D*M analysis (blue): boundary separation a,relative bias zr, drift rate variability η, starting point variability sz and the four drift rates v1, v2, v3, v4. In addition,the estimated Ter and st (mean and width of the uniform non-decision time distribution) for the classical analysisare shown. The rightmost figure contains the true (dashed line) and estimated (solid line) non-decision time pdfbased on the D*M analysis.

Page 17: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 3

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2

η sz

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

15

(a) recovery diffusion model parameters for 106 observations per condition

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2

η sz

3

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

15

(b) recovery diffusion model parameters for 1000 observations per condition

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2 10

η sz

7

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0.2 0.4 0.6

dens

ity

0

5

10

15

(c) recovery diffusion model parameters for 300 observations per conditionFigure 2 . Recovery results for a uniform non-decision pdf scenario. The three panels (a), (b), and (c) refer to threedifferent sample sizes: 106, 1000 and 300 per condition. Each panel shows the estimates of the diffusion modelparameters common to both the classical analysis (orange) and the D*M analysis (blue): boundary separation a,relative bias zr, drift rate variability η, starting point variability sz and the four drift rates v1, v2, v3, v4. In addition,the estimated Ter and st (mean and width of the uniform non-decision time distribution) for the classical analysisare shown. The rightmost figure contains the true (dashed line) and estimated (solid line) non-decision time pdfbased on the D*M analysis.

Page 18: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 4

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2

η sz

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0 0.5

dens

ity

0

2

4

6

(a) recovery diffusion model parameters for 106 observations per condition

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2

η sz

4

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0 0.5

dens

ity

0

2

4

6

8

(b) recovery diffusion model parameters for 1000 observations per condition

0.07

0.08

0.09

0.1

a0.4

0.45

0.5

0.55

0.6

zr0

0.05

0.1

0.15

0.2 3

η sz

7

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

0

0.1

0.2

0.3

0.4

Ter st

RT (s)0 0.5

dens

ity

0

2

4

6

8

(c) recovery diffusion model parameters for 300 observations per conditionFigure 3 . Recovery results for a bimodal non-decision pdf scenario. The three panels (a), (b), and (c) refer to threedifferent sample sizes: 106, 1000 and 300 per condition. Each panel shows the estimates of the diffusion modelparameters common to both the classical analysis (orange) and the D*M analysis (blue): boundary separation a,relative bias zr, drift rate variability η, starting point variability sz and the four drift rates v1, v2, v3, v4. In addition,the estimated Ter and st (mean and width of the uniform non-decision time distribution) for the classical analysisare shown. The rightmost figure contains the true (dashed line) and estimated (solid line) non-decision time pdfbased on the D*M analysis.

Page 19: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 5

Application 1: A diffusion application of choice bias

In Figure 4 we present the full set of results (for all parameters) of the traditional and D*M diffusion modelfits to the data from Mulder, Wagenmakers, Ratcliff, Boekel, and Forstmann (2012).

0.1

0.12

0.14

0.16

0.18

a

0

0.2

0.4

0.6

sz η v

-0.1

0

0.1

0.2

0.3

∆zr ∆v

0.2

0.4

0.6

0.8

Ter st

(a) traditional estimates

0.1

0.12

0.14

0.16

0.18

a

0

0.2

0.4

0.6

sz η v

-0.1

0

0.1

0.2

0.3

∆zr ∆v

(b) D*M estimates

RT (s)0 0.5 1 1.5

0.2

0

0.2 prior r(t)

reward r(t)

(c) non-decision pdfsprior r(t) (s)

0.5 1

rew

ard

r(t)

(s)

0.4

0.6

0.8

1

(d) Q-Q plot non-decision pdfsFigure 4 . Application 1: Parameter estimates of a diffusion model analysis of choice bias for 20 participants. Thefirst two panels show the estimates of all decision model parameters (boundary separation a, inter-trial variabilityof bias sz, inter-trial variability of drift rate η and effects of bias on starting point ∆zr and drift rate ∆v ), obtainedwith either the traditional (a, in orange, with the additional non-decision time parameters mean non-decision timeTer and inter-trial variability of non-decision time st) or D*M method (b, in blue). For each of the two panels,the darker box plots show the estimates for the elevated prior likelihood condition, the lighter plots for the largerpotential pay-off condition. If for the group of participants ∆zr or ∆v is significantly different from 0 (two-sidedsign test), this is indicated with a green (p < 0.001) or light green (p < 0.01) marking of the label. Panel (c)shows the non-decision time densities inferred from the D*M estimates in panel (b). The densities for the elevatedlikelihood condition are shown in the upper half of the plot and those for the larger potential pay-off condition areshown, mirrored, in the lower half. The solid lines show the mean non-decision pdfs across participants, the lighterareas display the double standard error interval. Panel (d) is a quantile-quantile plot of the data in panel (c), andis better suited to look at the differences between the non-decision pdfs from the two conditions. The grey arearepresents a 95% confidence interval of the mean quantile-quantile values (black crosses).

Page 20: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 6

Application 2: A diffusion application of post-error slowing

In the figures below, we present the full set of results (for all parameters) of the traditional (Figures 5,7) andD*M (Figure 6) diffusion model fits to the data from Dutilh et al. (2011).

0.1

0.15

0.2

a

0.2

0.4

0.6

0.8

zrpW

zrpNW

0

0.05

0.1

0.15

szpW

szpNW

0.2

0.4

0.6

ηW

ηNW

-0.5

0

0.5

1

v1

v2

v3

v4

v5

v6

vNW

(a) decision model estimates

0

0.02

0.04

a

-0.05

0

0.05

0.1

0.15

zrpW

zrpNW

-0.05

0

0.05

szpW

szpNW

-0.2

0

0.2

0.4

ηW

ηNW

-0.2

0

0.2

0.4

v1

v2

v3

v4

v5

v6

vNW

(b) decision model differences

0.4

0.5

0.6

Ter1

Ter2

Ter3

Ter4

Ter5

Ter6

TerNW

0

0.1

0.2

0.3

0.4

st

(c) non-decision estimates

-0.04

-0.02

0

0.02

0.04

Ter1

Ter2

Ter3

Ter4

Ter5

Ter6

TerNW

-0.1

-0.05

0

0.05

st

(d) non-decision differencesFigure 5 . Application 2: Estimation results of a traditional diffusion model analysis of post-error slowing for 39participants. Panel (a) shows the traditional estimates for boundary separation a, relative bias after a word ornon-word error trial (zrpW and zrpNW ), inter-trial variability of bias after a word or non-word error trial (szpW

and szpNW ), inter-trial variability of drift rate for words and non-words (ηW and ηNW ), and the drift rates for sixdifferent word types and non-words (v1, ..., v6 and vNW ), for both the post-correct condition (darker box plots) andthe post-error condition (lighter box plots). Panel (b) shows the within-person differences between post-error andpost-correct conditions of the parameters in panel (a). Statistically significant effects for the differences (two-sidedsign test) are indicated with a green/red (p < 0.001) or light green/light red (p < 0.01) marking of the label (greenmeans a positive effect or a larger value post-error compared to post-correct, red a negative effect or a smaller valuepost-error compared to post-correct). Panel (c) shows the estimates of the uniform non-decision time distributions:the mean per stimulus type (Ter1, Ter2, Ter3, Ter4, Ter5, Ter6 and TerNW ) and the common width st. Panel (d) showsthe within-person differences between post-error and post-correct conditions of the parameters in panel (c).

Page 21: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 7

0.1

0.15

0.2

a

0.2

0.4

0.6

0.8

zrpW

zrpNW

0

0.05

0.1

0.15

szpW

szpNW

0.20.40.60.8

11.21.4

ηW

ηNW

-1

0

1

v1

v2

v3

v4

v5

v6

vNW

(a) decision model estimates

0

0.05

0.1

a

-0.1

0

0.1

0.2

0.3

zrpW

zrpNW

-0.1

0

0.1

0.2

szpW

szpNW

-1

-0.5

0

0.5

1

ηW

ηNW

-1

0

1

v1

v2

v3

v4

v5

v6

vNW

(b) decision model differences

RT (s)0 0.5 1 1.5

0.2

0

0.2

post-correct r(t)

post-error r(t)

(c) non-decision pdfspost-correct r(t) (s)0.4 0.6 0.8

post

-err

or r

(t)

(s)

0.4

0.6

0.8

(d) Q-Q plot non-decision pdfsFigure 6 . Application 2: Estimation results of a D*M diffusion model analysis of post-error slowing for 39 par-ticipants. Panel (a) shows the D*M estimates for boundary separation a, relative bias after a word or non-worderror trial (zrpW and zrpNW ), inter-trial variability of bias after a word or non-word error trial (szpW and szpNW ),inter-trial variability of drift rate for words and non-words (ηW and ηNW ), and the drift rates for six different wordtypes and non-words (v1, ..., v6 and vNW ), for both the post-correct condition (darker box plots) and the post-errorcondition (lighter box plots). Panel (b) shows the within-person differences between post-error and post-correctconditions of the parameters in panel (a). Statistically significant effects for the differences (two-sided sign test)are indicated with a green/red (p < 0.001) or light green/light red (p < 0.01) marking of the label (green meansa positive effect or a larger value post-error compared to post-correct, red a negative effect or a smaller valuepost-error compared to post-correct). Panel (c) shows the participant averaged non-decision time densities inferredfrom the D*M estimates, separately for all six different word types (black) and non-words (red). The non-decisiontime densities of the post-correct condition are shown in the upper half, those of the post-error condition are shown,mirorred, in the lower half. Panel (d) is a quantile-quantile plot of the data in panel (c), and is better suited tolook at the differences between the non-decision time densities from the two conditions. (Because of the manynon-decision time pdfs, no confidence intervals are shown in panels (c) and (d).)

Page 22: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 8

0.1

0.15

0.2

a

0.2

0.4

0.6

0.8

zrpW

zrpNW

0

0.05

0.1

0.15

szpW

szpNW

0.1

0.2

0.3

0.4

ηW

ηNW

-0.4

-0.2

0

0.2

0.4

0.6

0.8

v1

v2

v3

v4

v5

v6

vNW

(a) decision model estimates

0

0.02

0.04

0.06

a

-0.05

0

0.05

0.1

0.15

zrpW

zrpNW

-0.1

0

0.1

szpW

szpNW

-0.1

0

0.1

ηW

ηNW

-0.2

0

0.2

v1

v2

v3

v4

v5

v6

vNW

(b) decision model differences

0.4

0.5

0.6

Ter1

Ter2

Ter3

Ter4

Ter5

Ter6

TerNW

0

0.1

0.2

0.3

0.4

st1

st2

st3

st4

st5

st6

stNW

(c) non-decision model estimates

-0.05

0

0.05

Ter1

Ter2

Ter3

Ter4

Ter5

Ter6

TerNW

-0.1

0

0.1

st1

st2

st3

st4

st5

st6

stNW

(d) non-decision model differencesFigure 7 . Application 2: Estimation results of a traditional diffusion model analysis of post-error slowing for 39participants, with separate inter-trial variabilities of non-decision time for different stimulus types. Panel (a) showsthe traditional estimates for boundary separation a, relative bias after a word or non-word error trial (zrpW andzrpNW ), inter-trial variability of bias after a word or non-word error trial (szpW and szpNW ), inter-trial variabilityof drift rate for words and non-words (ηW and ηNW ), and the drift rates for six different word types and non-words(v1, ..., v6 and vNW ), for both the post-correct condition (darker box plots) and the post-error condition (lighterbox plots). Panel (b) shows the within-person differences between post-error and post-correct conditions of theparameters in panel (a). Statistically significant effects for the differences (two-sided signtest) are indicated witha green/red (p < 0.001) or light green/light red (p < 0.01) marking of the label (green means a positive effect ora larger value post-error compared to post-correct, red a negative effect or a smaller value post-error comparedto post-correct). Panel (c) shows the estimates of the uniform non-decision time distributions: the mean perstimulus type (Ter1, Ter2, Ter3, Ter4, Ter5, Ter6 and TerNW ) and the width per stimulus type (st1, st2, st3, st4, st5, st6and stNW ). Panel (d) shows the within-person differences between post-error and post-correct conditions of theparameters in panel (c).

Page 23: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 9

Application 3: A diffusion application of task switching costs

In the figures below, we present the full set of results (for all parameters) of the traditional and D*M diffusionmodel fits to the data from Schmitz and Voss (2012). In Figure 8, task-switching trials are compared to pure tasktrials; in Figure 8, task-repeating trials are compared to pure task trials.

0.1

0.2

0.3

a

0

0.2

0.4

0.6

η sz

0

0.5

1

1.5

v

0

0.2

0.4

0.6

Ter st

(a) traditional estimates

0

0.2

0.4

0.6

a

-0.5

0

0.5

η sz

-1

-0.5

0

0.5

1

v

-0.2

0

0.2

0.4

0.6

Ter st

(b) traditional differences

0.1

0.2

0.3

a

0

0.2

0.4

0.6

η sz

0

0.5

1

1.5

v

(c) D*M estimates

0

0.05

0.1

a

-0.2

0

0.2

η sz

-0.2

0

0.2

v

(d) D*M differences

RT (s)0 0.5 1

0.4

0.2

0

0.2

0.4 pure task r(t)

task-swiching r(t)

(e) non-decision pdfspure task r(t) (s)

0.4 0.6

task

-sw

itchi

ng r

(t)

(s)

0.3

0.4

0.5

0.6

(f) Q-Q plot non-decision pdfsFigure 8 . Application 3: Estimation results of a diffusion model analysis of task switching costs for 24 participants. Panel(a) shows the traditional estimates of all diffusion model parameters (boundary separation a, inter-trial variability of driftrate η, inter-trial variability of bias sz, drift rate v, uniform non-decision time distribution mean Ter and width st), for boththe task-switching trials (lighter box plots) and the pure task trials (darker box plots). Panel (b) shows the within-persondifferences between task-switching and pure task conditions of the parameters in panel (a). Statistically significant effects forthe differences (two-sided sign test) are indicated with a green/red (p < 0.001) or light green/ light red (p < 0.01) indicator(green means a larger parameter value in the task switch condition compared a pure task condition, red means smallervalues). Panels (c) and (d) are the respective D*M versions of panels (a) and (b). Panel (e) shows the participant averagednon-decision time densities inferred from the D*M estimates. The non-decision time densities of the pure task condition areshown in the upper half of the plot, those of the task-switching trials are shown, mirrored, in the lower half. The solid linesshow the mean non-decision pdfs across participants, the lighter areas display the double standard error interval. Panel (f)is a quantile-quantile plot of the data in panel (e), and is better suited to look at the differences between the non-decisionpdfs from the two conditions. The grey area represents a 95% confidence interval of the mean quantile-quantile values (blackcrosses).

Page 24: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 10

0.1

0.2

0.3

a

0

0.2

0.4

0.6

η sz

0

0.5

1

1.5

v

0

0.2

0.4

0.6

Ter st

(a) traditional estimates

0

0.2

0.4

a

0

0.5

1

η sz

-0.5

0

0.5

1

1.5

v

-0.2

-0.1

0

0.1

0.2

Ter st

(b) traditional differences

0.1

0.2

0.3

a

0

0.2

0.4

0.6

η sz

0

0.5

1

1.5

v

(c) D*M estimates

-0.02

0

0.02

0.04

0.06

0.08

a

0

0.2

0.4

η sz

0

0.2

0.4

0.6

v

(d) D*M differences

RT (s)0 0.5 1

0.4

0.2

0

0.2

0.4 pure task r(t)

task-repeating r(t)

(e) non-decision pdfspure task r(t) (s)0.3 0.4 0.5

task

-rep

eatin

g r(

t) (

s)

0.3

0.4

0.5

(f) Q-Q plot non-decision pdfsFigure 9 . Application 3: Estimation results of a diffusion model analysis of task switching costs for 24 participants.Panel (a) shows the traditional estimates of all diffusion model parameters (boundary separation a, inter-trialvariability of drift rate η, inter-trial variability of bias sz, drift rate v, uniform non-decision time distribution meanTer and width st), for both the task-repeating trials (lighter box plots) and the pure task trials (darker box plots).Panel (b) shows the within-person differences between task-switching and pure task conditions of the parametersin panel (a). Statistically significant effects for the differences (two-sided sign test) are indicated with a green/red(p < 0.001) or light green/ light red (p < 0.01) indicator (green means a larger parameter value in the task switchcondition compared a pure task condition, red means smaller values). Panels (c) and (d) are the respective D*Mversions of panels (a) and (b). Panel (e) shows the participant averaged non-decision time densities inferred from theD*M estimates. The non-decision time densities of the pure task condition are shown in the upper half of the plot,those of the task-switching trials are shown, mirrored, in the lower half. The solid lines show the mean non-decisionpdfs across participants, the lighter areas display the double standard error interval. Panel (f) is a quantile-quantileplot of the data in panel (e), and is better suited to look at the differences between the non-decision pdfs fromthe two conditions. The grey area represents a 95% confidence interval of the mean quantile-quantile values (blackcrosses).

Page 25: To cite this article: Verdonck, S., & Tuerlinckx, F. (2016 ... · To cite this article: Verdonck, S., & Tuerlinckx, F. (2016). Factoring out nondecision time in choice reaction time

FACTORING OUT NON-DECISION TIME IN CHOICE RT DATA 11

References

Dutilh, G., Vandekerckhove, J., Forstmann, B. U., Keuleers, E., Brysbaert, M., & Wagenmakers, E.-J. (2011,November). Testing theories of post-error slowing. Attention, Perception, & Psychophysics, 74 (2), 454–465.doi: 10.3758/s13414-011-0243-2

Mulder, M. J., Wagenmakers, E.-J., Ratcliff, R., Boekel, W., & Forstmann, B. U. (2012, February). Bias in thebrain: A diffusion model analysis of prior probability and potential payoff. The Journal of Neuroscience,32 (7), 2335–2343. doi: 10.1523/JNEUROSCI.4156-11.2012

Schmitz, F., & Voss, A. (2012). Decomposing task-switching costs with the diffusion model. Journal of ExperimentalPsychology: Human Perception and Performance, 38 (1), 222–250. doi: 10.1037/a0026003