Murray Et Al Mol Psych 2007

download Murray Et Al Mol Psych 2007

of 10

description

NEURO

Transcript of Murray Et Al Mol Psych 2007

  • ORIGINAL ARTICLE

    Substantia nigra/ventral tegmental reward prediction errordisruption in psychosisGK Murray1,2,3, PR Corlett1,3, L Clark3, M Pessiglione4, AD Blackwell1, G Honey1,3, PB Jones1,2,

    ET Bullmore1,2,3, TW Robbins3 and PC Fletcher1,3

    1Brain Mapping Unit, Department of Psychiatry, Addenbrookes Hospital, University of Cambridge, Cambridge, UK; 2CAMEO,Cambridgeshire and Peterborough Mental Health Partnership NHS Trust, Cambridge, UK; 3Behavioural and ClinicalNeuroscience Institute, Cambridge, UK and 4Hopital de la Salpetrie`re, Pavillon Claude Bernard, 47 Bd de lHopital,Paris, France

    While dopamine systems have been implicated in the pathophysiology of schizophrenia andpsychosis for many years, how dopamine dysfunction generates psychotic symptoms remainsunknown. Recent theoretical interest has been directed at relating the known role of midbraindopamine neurons in reinforcement learning, motivational salience and prediction error toexplain the abnormal mental experience of psychosis. However, this theoretical model has yetto be explored empirically. To examine a link between psychotic experience, reward learningand dysfunction of the dopaminergic midbrain and associated target regions, we asked agroup of first episode psychosis patients suffering from active positive symptoms and a groupof healthy control participants to perform an instrumental reward conditioning experiment. Wecharacterized neural responses using functional magnetic resonance imaging. We observedthat patients with psychosis exhibit abnormal physiological responses associated with rewardprediction error in the dopaminergic midbrain, striatum and limbic system, and wedemonstrated subtle abnormalities in the ability of psychosis patients to discriminate betweenmotivationally salient and neutral stimuli. This study provides the first evidence linkingabnormal mesolimbic activity, reward learning and psychosis.Molecular Psychiatry advance online publication, 7 August 2007; doi:10.1038/sj.mp.4002058

    Keywords: fMRI; schizophrenia; reinforcement learning; reward; dopamine; incentive salience

    Introduction

    Why does a biochemical disturbance in brain dopa-mine systems lead to delusional ideas and otherphenomena of psychosis? Psychotic symptoms arethought to be caused by disturbance in the function ofthe mesolimbic dopamine system:1,2 it is establishedthat administration of dopaminergic drugs can causepsychosis in healthy individuals,3,4 that patients withschizophrenia show abnormal striatal dopaminergicresponses to amphetamine challenge,5,6 and thatdopamine D2 receptor blockade is critical in reducingpsychotic experiences such as delusions and halluci-nations.7 Yet there remains an explanatory gapbetween what we understand about the neurobiologyof psychosis and what we understand about itssubjective experience.

    There have been attempts to bridge this gap,811

    although until recently the normal function of themesolimbic dopamine system may have been insuffi-

    ciently understood to explain the psychologicalconsequences of its dysfunction. However, recentevidence has demonstrated that dopamine neuronsthat extend from the tegmental midbrain to theventral striatum code reward prediction error andthus serve as an important teaching signal by whichanimals can learn about stimulus-outcome associa-tions.12,13 Further evidence indicates that subcorticaldopamine contributes causally to the attribution ofincentive salience, the process by which a stimulusgrabs attention and motivates goal-directed behaviourbecause of associations with reward or punish-ment.1417 Given that theories of delusion formationemphasize the emergence of abnormal associations asthe progenitors of irrational beliefs,18 this work hasprovided a new theoretical framework within whichto consider the neurobiology of psychosis. It has beenproposed that dysregulated midbrain dopamine neu-ron firing could result in an individual maladaptivelyattributing importance to innocuous stimuli orevents, that is experiencing abnormal referentialideas.10,11,19,20 At present, this conceptualization ofpsychosis remains largely theoretical, yet it implies anumber of predictions that can be tested empirically.In particular, it predicts that patients with psychosiswould show impaired ability to distinguish, both in

    Received 21 February 2007; revised 23 May 2007; accepted 18June 2007

    Correspondence: Dr GK Murray, Brain Mapping Unit, Departmentof Psychiatry, University of Cambridge, Addenbrookes Hospital,Box 255, Cambridge CB2 2QQ, UK.E-mail: [email protected]

    Molecular Psychiatry (2007), 110& 2007 Nature Publishing Group All rights reserved 1359-4184/07 $30.00

    www.nature.com/mp

  • terms of their neurophysiological responses in themidbrain and ventral striatum and in their overtbehaviour, between stimuli that high and low inmotivational salience.

    To determine whether psychotic experiences occurin the context of dysfunction of the dopaminergicmidbrain, and to establish a link between psychoticexperiences, the mesolimbic system and rewardprocessing, we asked a group of patients experiencingactive psychotic symptoms and a group of healthycontrol participants to perform an instrumentalreward conditioning experiment (similar to ODoh-erty et al.21). We characterized mesolimbic responsesusing fMRI; we applied a standard action-valuelearning computational model to subjects behaviour-al choices22 and used the ensuing values of rewardprediction errors over the course of the experiment asindividual-specific regressors in the image analysis.23

    In doing so, we were able to establish the relationshipbetween reward prediction error and mesolimbicactivity in healthy and psychotic individuals. Wepredicted that behavioural data would demonstrateimpaired ability of psychosis patients to discriminatebetween rewarding and neutral stimuli, and that theirmidbrain and ventral striatal physiological responsesassociated with reward prediction errors would becorrespondingly disturbed.

    Materials and methods

    SubjectsThe study was approved by the local research ethicscommittee. Thirteen individuals (nine men) withcurrent positive psychotic symptoms were recruitedfrom the Cambridge first-episode psychosis service,CAMEO. Study inclusion criteria were (1) agebetween 17 and 35 years and (2) current psychoticsymptoms as reflected by the presence of delusions orhallucinations. Twelve healthy volunteers (nine men)were recruited as control subjects, matched in age,gender, handedness and estimated premorbid IQ asmeasured using the National Adult Reading Test.24

    After complete description of the study to theparticipants, written informed consent was obtained.Telephone screening interview followed by interviewin person ascertained that control subjects werewithout a history of psychiatric illness, physicalillness, head injury, drug or alcohol dependence.Both patient and control subjects were withoutcontraindications for fMRI scanning. Five of the 13patients were not taking antipsychotic medication;the other 8 were taking atypical antipsychoticmedication (of these 8, the median duration oftreatment was 2 months, and the mean chlorproma-zine equivalent dose was 181770 mg/day,25). Themean ages were 26 years (s.d. 3 years) for both groups;mean NART scores were 116 (5) for controls, 113 (11)for patients. Twelve months following data collectiona psychiatrist (GM) assigned DSM-IV diagnoses topatients using all available clinical information,including case-note review and structured clinical

    interview for DSM-IV: one patient met criteria forbipolar disorder, one psychosis not otherwise speci-fied and the other eleven schizophrenia. Patients hadpredominantly positive symptoms compared to nega-tive symptoms at the time of scanning; the mean scoreof Brief Psychiatric Rating Scale (BPRS)26 hallucina-tions, unusual thought content and suspiciousnesswas 3.9 (moderate severity), while the mean score ofBPRS self-neglect, blunted affect and emotionalwithdrawal was 1.9 (very mild severity).

    Reward learning task

    Subjects performed an instrumental learning taskinvolving monetary gains that required choosingbetween two visual stimuli displayed on a computerscreen, so as to maximize payoffs (see Figure 1,Supplementary Figure 1 and Participant Instructionsin Supplementary material). On each trial, theparticipant chose one of the two stimuli on thescreen, and feedback was either provided or not in aprobabilistic manner. The 160 trials were divided intotwo trial types, randomly interspersed: reward andneutral, each involving a different pair of stimuli. Thereward stimulus pair was potentially associated withrewarding feedback (20 pence or no feedback),whereas the neutral stimulus pair was associatedwith no financial outcomes (there would either befeedback of a neutral image about the same size as a20 pence coin or no feedback). The feedback wasprobabilistic: each trial type had a high probabilitystimulus (which gave feedback on 60% of occasions)and a low probability stimulus (feedback on 30% ofoccasions). Therefore, to win money participants hadto learn, by trial and error, to select the stimulus thatwas more likely to produce a reward (see participant

    Figure 1 Experimental task. Subjects select either of twovisual stimuli presented on a display screen, and subse-quently observe the outcomeeither a financial reward of20 pence (shown on the top right of the figure), or neutralfeedback (not shown here but shown in SupplementaryFigure 1), or nothing.

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    2

    Molecular Psychiatry

  • instructions, Supplementary material). Participantswere not explicitly informed that one pair of stimulisignalled the potential for a reward, and that the othersignalled the potential for neutral feedback; ratherthey learnt this over the course of the experiment.Participants were also unaware of the fact that on anygiven trial, the probability of their receiving feedbackif they chose the high probability stimulus (60%) wasindependent of the probability of their receivingfeedback if they chose the low probability stimulus(30%). Stimuli were variously coloured blocks; therelationship of a given block to feedback was counter-balanced across subjects. Stimulus selection was bybutton press (left or right). Participants were informedthat any money they won in the experiment would bepaid to them in cash at the end of the experiment.

    Behavioural analysisA mixed model analysis of variance (ANOVA) wasused to assess effects of Valence (Reward or Neutral),and Diagnosis (Psychosis or Control) on the propor-tion of high-probability stimuli selected (after arcsinetransformations to enable parametric analysis). Pre-vious studies have indicated that, on trials wherethere is a potential for reward, reaction times arefaster than in trials where there will be no re-ward,21,23,27 reflecting increased motivation to obtainrewards. We therefore performed a further ANOVA,this time using mean reaction time as the dependentvariable.

    Rating scalesA psychiatrist (GM) interviewed participants directlyfollowing the scanning session, and rated psycho-pathology on the Brief Psychiatric Rating Scale.26 Toapproximate the value placed on the reward byparticipants, we asked participants to rate the amountof money they earned on a scale of 15 as an amountin relation to the amount of time spent, and on aseparate scale as an absolute amount (also 15). Thesescores were then summed to create an overall valuemeasure. In addition, we asked, using a visualanalogue scale: if you see 20 pence lying on thestreet, how likely are you to pick it up?

    Computational modelWe fitted a standard reinforcement learning algorithmto each subjects sequence of choices. We used a basicQ learning algorithm, which has been shown pre-viously to offer a good account of instrumental choicein both humans and primates.22 For each pair ofstimuli A and B, the model estimates the expectedvalues of choosing A(Qa) and choosing B(Qb), on thebasis of individual sequences of choices and outcomes.This value, termed a Q value, is essentially theexpected reward obtained by taking that particularaction. These Q values were set at zero before learning,and after every trial t > 0 the value of the chosenstimulus (say A) was updated according to the rule

    Qat 1 Qat a dt

    The prediction error was

    dt Rt Qatwhere R(t) is defined as the reinforcement obtained asan outcome of choosing A at trial t. In other words,the prediction error d (t) is the difference betweenthe expected outcome (that is, Q(t)) and the actualoutcome (that is, R(t)). The reinforcement magnitudeR was 1 for feedback and 0 for nothing outcomes.Given the Q values, the associated probability ofselecting each action was estimated by implementingthe softmax rule, for example, for choosing A,

    Pat eQat=b

    eQat=b eQbt=b

    This is a standard stochastic decision rule thatcalculates the probability of taking one of a set ofactions according to their associated values. Theconstants a (learning rate) and b (temperature) wereadjusted to maximize the probability (or likelihood) ofthe actual choices under the model. To compare theaccuracy of fit between diagnoses and conditions, weused negative log likelihood, which can be summedacross trials, sessions and subjects. The learningmodel was fitted with a single set of parametersacross all subjects in both groups, since for ourimaging analysis we test the null hypothesis thatthere is no difference between groups.23 It was thenused to create a statistical regressor corresponding tothe modelled outcome prediction error in the imagingdata. For additional (purely behavioural) analysis, weestimated the model parameters a and b for eachindividual participant, and tested whether thesediffered across groups.

    fMRI Data Acquisition and analysisA Bruker MedSpec 30/100 (Ettlingen, Germany)operating at 3 T was used to collect imaging data.Gradient-echo echo planar T2*-weighted echo planarimages depicting BOLD contrast were acquired from21 non-contiguous near axial planes: TR = 1.1 s,TE = 27.5 ms, flip angle = 661, in-plane resolu-tion = 3.1 3.1 mm, matrix size 64 64, field of view2020 cm, bandwidth 100 kHz. A total of 750volumes per subject were acquired (21 slices each of4 mm thickness, interslice gap 1 mm). The first sixvolumes were discarded to allow for T1 equilibrationeffects. fMRI data were analysed using statisticalparametric mapping in the SPM2 programme (Well-come Department of Cognitive Neurology, London,UK). Images were realigned, spatially normalized to astandard template and spatially smoothed with aGaussian kernel (6 mm at full-width half-maximum).The time series in each session were high-passfiltered (to a maximum of 1/120 Hz) and serialautocorrelations were estimated using an AR(1)model.

    We used a single statistical linear regression modelfor all our analyses as follows. Each trial was

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    3

    Molecular Psychiatry

  • modelled as a delta function set at the time of thefeedback display. Separate regressors were created forreward and neutral trials. Prediction errors generatedby the Q learning model were then used as parametricmodulators of these regressors. All regressors ofinterest were convolved with a canonical haemody-namic response function with a temporal derivative.28

    Linear contrasts of regression coefficients were com-puted at the individual subject level and then taken toa group level random effects analysis of variance.We carried out the following contrasts:

    1. Main effect of reward prediction error, irrespectiveof diagnosis (prediction error on reward trialsversus prediction error on neutral trials). Thisindicated regions where, in the group as a whole(controls plus patients), there was a significantrelationship between prediction error and event-related brain response to reward trials compared toneutral trials.

    2. Within controls analysis. Prediction error onreward trials versus prediction error on neutraltrials.

    3. Within patient group analysis: prediction error onreward trials versus prediction error on neutraltrials.

    4. Between group analysis: prediction error on rewardtrials versus prediction error on neutral trials. Thisanalysis indicated regions in which there was ananomalous relationship between prediction errorand brain response (in reward compared to neutraltrials) in the patient group, with either exaggeratedor diminished effects in patients.

    5. Between-group comparison. Unmedicated patientsversus controls and effects of medication. To showthat differences in (4) were not secondary tomedication effects, we repeated the casecontrolcomparison having excluded the eight medicatedpatients (leaving 12 controls and 5 patients). Wealso examined, within medicated patients, thecorrelation between brain activation (reward pre-diction error versus neutral prediction error) andmedication dose in chlorpromazine equivalents ata relaxed threshold of P < 0.1 false discovery rate(FDR)-corrected.

    6. Between-group comparison. Patients taking anti-psychotic medication versus controls. To show thatthe differences in (4) were not solely driven byunmedicated patients, we performed a comparisonbetween the eight patients taking antipsychoticmedication and the controls.

    7. Finally, we investigated whether midbrain fMRIparameter estimates (reward prediction error ver-sus neutral predication error) were correlated withBPRS-positive symptom score (sum of BPRShallucinations, unusual thought content and sus-piciousness).

    We performed these analyses in an a priorihypothesized region of interest, and in the wholebrain. Significance level for activation was set at aFDR of P < 0.05.29 For the a priori region of interest,

    activations were considered significant at P < 0.05corrected using appropriate small volume correctionsfor the location of predicted peaks. The region ofinterest comprised the union of a midbrain andventral striatal region (see Figure 3D). The midbrainregion was a sphere of radius 15 mm centred at MNIcoordinates 0, 15, 9 [x, y, z], and encompassed theentire midbrain, including substantia nigra, ventraltegmental area (VTA) and other structures.30 Theventral striatal region was hand drawn in MRIcro31

    following the definition of ventral striatum byLaruelle et al.32 For the whole brain analyses, inaddition to the FDR threshold of P < 0.05, westipulated a further threshold of cluster size greaterthan 100 voxels. We have also reported results atlower thresholds in Supplementary Tables.

    Results

    Behavioural resultsThe ANOVA of behavioural choice showed a sig-nificant main effect of Valence: subjects chose thehigh probability stimulus more frequently on rewardtrials than neutral trials (F(1,23) = 22. 2, P < 0.001, seeFigure 2a). While controls chose the high probabilitystimulus on reward trails more frequently thanpatients, this difference was not significant: therewas no significant main effect of Diagnosis(F(1,23) = 1.04, P = 0.3) or Diagnosis by Valence Inter-action (F(1,23) = 1.6, P = 0.22). The ANOVA of re-sponse latency also confirmed a significant effect ofValence (F(1,23) = 41, P < 0.001) with faster reactiontimes on reward trials than on neutral trials (seeFigure 2b). In addition, there was a significantDiagnosis by Valence interaction (F(1,23) = 7.1,P = 0.014), as the difference between reward andneutral trials was less in patients compared tocontrols (t(23) = 2.6, P = 0.014), and the patients weresignificantly faster than controls on the neutral trials(t(23) = 3.3, P = 0.003). Response latencies stratified byhigh/low probability stimulus choice for each groupare presented in Supplementary Figures 2 (rewardtrials) and 3 (neutral trials).

    Patients and controls did not differ on financialratings (P = 0.32 on visual analogue rating of like-lihood of picking up a 20 pence coin in the street,P = 0.11 on experiment earning rating). When thecomputational model constants a (learning rate) and b(temperature) were adjusted to maximize the prob-ability (or likelihood) of the actual choices under themodel, we found a= 0.04, and b= 0.2 (see Supple-mentary Figure 4). There was no significant differencebetween patients and controls in goodness of fit ofthe computational model to behavioural choices(t(23) = 1.4, P = 0.17).

    In additional analysis of behavioural data, weestimated individual a and b parameters foreach participant (Supplementary Figures 5, 6);these did not differ significantly across groups (a:MannWhitney U = 77, P = 0.96; b: MannWhitneyU = 54, P = 0.15).

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    4

    Molecular Psychiatry

  • Imaging results

    Entire sample. When both groups were analysedtogether, reward prediction error was associated withincreased activity, compared to neutral predictionerror, in the ventral striatum on whole brain analysis(P < 0.000001 uncorrected) and in the ventral striatumand midbrain on region of interest analysis (P < 0.05FDR-corrected). See Table 1, Figure 3 and Supple-mentary Table 1.

    Control subjects. In the control subjects, rewardprediction error was associated with activity in themidbrain, approximately localized to ventraltegmental and substantia nigra areas of dopamineneuron origin, in addition to several target regions ofdopamine neuron output: the striatum, cingulate andtemporal cortex (see Table 2, Supplementary Figure 7).

    Psychosis patients. In the psychosis patient group,no reward prediction error activations survivedcorrection for multiple comparison. However, ata reduced threshold (P < 0.005, uncorrected), we

    observed a small cluster of 12 voxels in the ventralstriatum and 11 voxels in the anterior cingulate cortexthat were active in the patient group for the contrastof prediction error: reward versus neutral (seeSupplementary Table 2).

    Casecontrol comparison. There were significantdifferences between cases and controls in bilateralmidbrain and right ventral striatum (Z = 2.76 at 22,20,10 [x, y, z]) on region of interest analysis (Figures4a and b). The differing midbrain activations betweenthe two groups were driven by a combination ofattenuated response to reward prediction error inpsychosis together with an augmented response toneutral prediction error in psychosis (see Figure 4c,and Supplementary Figure 9). In addition, on wholebrain analysis there were casecontrol differencesin bilateral midbrain and a number of limbicregions including hippocampus, insula and cingulatecortex in addition to putamen and ventral pal-lidum (P < 0.05, FDR-corrected. Table 3, Figure 4d,Supplementary Figure 8). The statistics we present arefrom two-tailed tests (that is, greater activity in

    80

    70

    60

    50

    40

    30

    Num

    ber o

    f Hig

    h Pr

    obab

    ility

    Choi

    ces

    Mea

    n R

    eact

    ion

    Tim

    e

    20

    10

    Controls Reward Controls Neutral Cases NeutralCases Reward

    Controls Reward Controls Neutral Cases NeutralCases Reward

    0

    1200

    1000

    800

    600

    400

    200

    0

    * *

    *

    *

    *

    Figure 2 Behavioural results. (a) Choice behaviour. Each group learnt to choose the high probability stimulus on rewardtrials, but there were no significant differences between groups. Error bars denote standard error of the mean and stars denotesignificant differences (P < 0.05). (b) Reaction time. The difference between reward and neutral trial latencies was less inpsychosis patients compared to controls (Diagnosis by Valence interaction: F = 7.1, d.f. = 1, 23, P = 0.014), and patientsresponded more rapidly than control subjects to neutral stimuli (T = 3.3, P = 0.003). Error bars denote standard error of themean and stars denote significant differences (P < 0.05).

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    5

    Molecular Psychiatry

  • patients compared to controls or controls compared topatients), but we note there were no regions withgreater activation for these contrasts in psychosis.

    Casecontrol comparison with patients onantipsychotic medication excluded. To exclude thepossibility that the difference between patients andcontrols were secondary to medication effects, werepeated the casecontrol comparison with themedicated patients excluded. There were stillsignificant differences between cases and controls inbilateral midbrain on region of interest analysis, evenafter adjustment for multiple comparisons (Z = 4.64 at8, 20, 6 [x, y, z]; Z = 3.37 at 12, 22, 4 [x, y, z]).In the patients who were taking medication, there wasno relationship between brain reward predictionerrors and medication dose (chlorpromazine

    equivalents), either in the whole brain analysis orregion of interest at the relaxed threshold of P = 0.1(FDR-corrected).

    Casecontrol comparison with patients onantipsychotic medication only. Having establishedthat midbrain group differences were not secondaryto medication, we went on to test whether the groupdifferences were solely driven by unmedicatedpatients by comparing controls against patientstaking antipsychotics. On whole brain analysis,there were still bilateral midbrain significantdifferences, robust to correction for multiplecomparison, in addition to differences in variouslimbic regions (see Supplementary Table 3).

    Having established group differences in midbrainactivation between groups, we went on to examinewhether, within patients, the fMRI midbrain para-meter estimates correlated with the level of psychoticsymptoms. There was no significant correlation(r =0.23, P = 0.5).

    Discussion

    Our findings demonstrate abnormal responses toreward prediction error in the midbrain and keytarget regions (striatum, hippocampus, cingulate,insula) in patients with psychosis. They providedirect empirical support for a model of psychosis,which invokes abnormal dopamine-dependent moti-vational salience as a key underlying disturbance.While patients successfully learnt the required con-tingencies, suggesting that their abnormal brainresponses were not secondary to impaired taskperformance, these disrupted neural responses wereaccompanied by significant behavioural differences,notably, a tendency to show rapid reaction times evento stimuli that predicted neutral feedback. Previousreinforcement learning experiments using paradigmssimilar to ours have reported faster reaction times inresponse to rewarding stimuli than neutral stimuli:this phenomenon has been termed reinforcementrelated speeding.21,23,27 Such reinforcement relatedspeeding is attributed to the anticipation of apotential reward on such trials leading to enhancedmotivation and hence faster responding. In our study,both patients and controls were significantly faster on

    Figure 3 fMRI results for analysis in entire sample. (ac)Results of the contrast of prediction error on reward versusneutral trials in region of interest analysis. Effects signifi-cant at P < 0.05 FDR-corrected for multiple comparisons areshown in yellow and orange. (d) The a priori definedmesolimbic region of interest was composed of the union ofthe midbrain and ventral striatum, shown here in amaximum intensity projection.

    Table 1 Prediction error on reward trials contrasted with prediction error on neutral trials in entire sample

    Region Peak z-score X, Y, Z No. of voxels

    Left ventral striatum 4.09 16, 6, 12 279Right ventral striatum 3.76 12, 10, 12 266Right midbrain VTA/nigra 3.16 10, 8, 6 262Left midbrainVTA/nigra 2.79 4, 16, 6

    Abbreviation: VTA, ventral tegmental area.Significant activations in the midbrain/ventral striatal region of interest are shown; threshold P < 0.05, false discovery ratecorrected for multiple comparisons.

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    6

    Molecular Psychiatry

  • reward trials than neutral trials, in accordance withprevious data, but the difference between latencies onreward and neutral trials was attenuated in patients.Patients were significantly faster than controls onneutral trials, consistent with the theory that theyfound such trials inappropriately motivationallysignificant. It is not unprecedented that psychosispatients perform rapidly on cognitive testsit hasbeen previously been shown that deluded patients arefaster than controls when making decisions duringprobabilistic reasoning tasks.33

    Our results suggest that, at the behavioural level,psychotic patients are failing to make the distinctionbetween events that are motivationally salient (that is,in this case, signalling a potential for reward) andthose that are not. This maladaptive behaviour isconsistent with their abnormal midbrain activations.Here, patients failed to show the normal differentialresponse to rewarding and neutral prediction errorrelated activity. In controls, the distinction wasreflected in the responses to a number of regionsmidbrain, striatum, cingulate, insulathat have beenpreviously implicated in reward processing in bothhuman30,34,35 and animal studies.13 Furthermore,reward processing/reward prediction error aremediated by dopamine in both humans23,36,37 andanimals.38 We suggest that the midbrain activations incontrols, and its aberration in individuals withpsychosis, is related to dopamine activity, thoughwe acknowledge that this experimental design onlyprovides indirect evidence in this regard.

    While the results from the neuroimaging analysisshow very striking differences between groups, thebehavioural differences were more subtle; this may

    Figure 4 Group differences. Regions in which there weregroup differences in the relationship between predictionerror and brain response (in reward compared to neutraltrials) are shown in yellow and red. (a) and (b) Region ofinterest analysis results in sagittal (a) and axial (b) sections,P < 0.05 FDR-corrected; (d) whole brain analysis results incoronal section (P < 0.05 FDR-corrected, cluster level 100).(c) Parameter estimates at 8, 22, 8 (right midbrain). Thediffering midbrain activations between the two groupsappeared to be driven by a combination of patientsattenuated responses to prediction error in reward trialstogether with patients augmented responses to predictionerror in neutral trials. Error bars denote standard error of themean.

    Table 2 Control group results

    Region Peak z-score X, Y, Z No. of voxels

    Left midbrain VTA/nigra 5.33 8, 20, 8 380Right midbrain VTA/nigra 4.41 14, 16, 6 300Right ventral striatum 4.56 16, 18, 10 337Left ventral striatum 4.23 20, 10, 12 332Left ventral pallidum 4.70 16, 6, 2 166Left insula 4.61 36, 10, 4 642Cingulate posterior 5.42 10, 42, 44 641Cingulate middle 4.15 12, 10, 36 480Right middle temporal gyrus 3.39 42, 76, 24 365Right precentral gyrus 5.24 24, 26, 72 111Left precentral gyrus 3.71 20, 16, 72 117Right medial frontal cortex 3.80 10, 46, 8 124Left medial frontal gyrus 3.27 10, 54, 4 167Right middle frontal gyrus 4.10 28, 38, 28 188Cerebellum 4.23 8, 54, 22 267Right visual cortex 4.46 32, 98, 10 730Left visual cortex 3.32 28, 98, 12 704

    Abbreviation: VTA, ventral tegmental area.Regions with significant activations on whole brain analysis for the contrast of prediction error on reward trials versusprediction error on neutral trials, P < 0.05 false discovery rate corrected for multiple comparisons, cluster level = 100 (seeSupplementary Figure 7).

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    7

    Molecular Psychiatry

  • reflect the increased sensitivity of functional MRIcompared with behavioural analysis. In fact, controlschose the high probability stimulus more often thanpatients (this difference was not statistically signifi-cant). Perhaps, on a more difficult reward learningtest, there would have been more pronouncedbehavioural differences between groups in choicebehaviour; this area demands further empiricalinvestigation in future studies.

    Some of the patients were taking atypical antipsy-chotic dopamine receptor anatagonist medication.However, there are several reasons why the groupdifferences we observed are unlikely to be secondaryto medication: the midbrain VTA/substantia nigragroup differences remained significant when theanalysis was restricted to unmedicated patients; ouranalysis did not reveal any effect of medication onbrain activity in patients taking antipsychotics, and aprevious study by Juckel and colleagues39 providedevidence that atypical antipsychotics, rather thaninducing abnormal brain responses, in fact normalizephysiological responses to reward expectation inschizophrenia.

    Although several previous authors have hypothe-sized that dysfunctional dopamine-mediated reinfor-cement processing is implicated in the pathology ofpsychotic illnesses,10,11,19,4043 few empirical studieshave addressed the issue. To our knowledge, this isthe first study to examine brain reward prediction

    error in any psychiatric or neurological disorder. In areward anticipation task that robustly elicits ventralstriatal signal change, patients with schizophreniadisplayed abnormal ventral striatal activation com-pared with controls, though this study did not studylearning or examine prediction error.44 Previousbehavioural studies have demonstrated disturbancesin the classic dopamine-dependent associative learn-ing processes of Kamin blocking and latent inhibitionin early psychosis.45 More recent evidence for a modelof disrupted error-dependent learning in psychosiscomes from Corlett and colleagues,46 who showedthat right prefrontal prediction error signal duringcausal learning predicts subsequent vulnerability tothe psychotogenic effects of ketamine in healthyvolunteers. Our study provides subtle behaviouraland more prominent physiological evidence of re-inforcement learning abnormality in psychosis, apsychological process that, it is theorised, is impor-tant in both the positive and negative symptoms inschizophrenia and other psychotic disorders.

    Acknowledgments

    Graham Murray was supported by a Department ofHealth Research Capacity Development Award. PaulFletcher is supported by the Wellcome Trust. Thework was completed within the University of Cam-bridge Behavioural and Clinical Neuroscience Insti-

    Table 3 Group differences

    Region Peak z-score X, Y, Z No. of voxels

    Right midbrain VTA/nigra 5.34 8, 22, 8 300Left midbrain 3.68 10, 22, 2 380Left ventral pallidum/putamen 4.25 18, 2, 6 139Thalamus 4.59 14, 24, 4 447Right insula 4.66 52, 2, 8 603Left insula 4.41 36, 12, 2 721Cingulate middle 3.69 2, 22, 32 453Cingulate posterior 4.67 10, 34, 42 1207Right hippocampus/parahippocampal gyrus 4.55 36, 8, 18 170Left hippocampus 3.94 36, 10, 14 194Left superior temporal gyrus 4.76 48, 26, 6 1330Right superior temporal gyrus 4.61 50, 16, 0 1497Left middle temporal gyrus 4.29 58, 6, 18 113Right middle temporal gyrus 3.87 52, 58, 4 1256Orbito-frontal cortex 4.00 26, 14, 20 114Medial frontal lobe 3.52 10, 46, 30 109R middle frontal gyrus 3.73 36, 44, 14 183Left middle frontal gyrus 3.42 26, 40, 28 101Right inferior parietal lobule 4.11 64, 34, 38 289Right parietal lobe 3.35 14, 84, 48 187Cerebellum 3.79 6, 48, 20 383Calcalrine sulcus 3.77 20, 64, 10 621Right occipital lobe 4.06 30, 94, 14 1256Left occipital lobe 4.18 44, 80, 10 856

    Abbreviations: FDR, false discovery; VTA, ventral tegmental area.Reward prediction error contrasted with neutral prediction error: regions with significantly different activations betweengroups on whole brain analysis, P < 0.05, FDR-corrected, cluster level 100 (See Supplementary Figure 8).

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    8

    Molecular Psychiatry

  • tute, supported by a joint award from the WellcomeTrust and Medical Research Council. CAMEO re-ceived pump priming funding from the StanleyMedical Research Institute and GlaxoSmithKline,and now receives support from the UK NationalHealth Service. We are grateful to staff from CAMEOand the Wolfson Brain Imaging Centre for their helpwith recruitment and data collection, and to theparticipants.

    References

    1 Crow TJ. Positive and negative schizophrenia symptoms and therole of dopamine. Br J Psychiatry 1981; 139: 251254.

    2 Laruelle M, Kegeles LS, Abi-Dargham A. Glutamate, dopamine,and schizophrenia: from pathophysiology to treatment. Ann N YAcad Sci 2003; 1003: 138158.

    3 Angrist B, Sathananthan G, Wilk S, Gershon S. Amphetaminepsychosis: behavioral and biochemical aspects. J Psychiatr Res1974; 11: 1323.

    4 Krystal JH, Perry Jr EB, Gueorguieva R, Belger A, Madonick SH,Abi-Dargham A et al. Comparative and interactive humanpsychopharmacologic effects of ketamine and amphetamine:implications for glutamatergic and dopaminergic model psychosesand cognitive function. Arch Gen Psychiatry 2005; 62: 985994.

    5 Laruelle M, Abi-Dargham A, van Dyck CH, Gil R, DSouza CD,Erdos J et al. Single photon emission computerized tomographyimaging of amphetamine-induced dopamine release in drug-freeschizophrenic subjects. Proc Natl Acad Sci USA 1996; 93:92359240.

    6 Breier A, Su TP, Saunders R, Carson RE, Kolachana BS, deBartolomeis A et al. Schizophrenia is associated with elevatedamphetamine-induced synaptic dopamine concentrations: evi-dence from a novel positron emission tomography method. ProcNatl Acad Sci USA 1997; 94: 25692574.

    7 Kapur S, Remington G. Dopamine D(2) receptors and their role inatypical antipsychotic action: still necessary and may even besufficient. Biol Psychiatry 2001; 50: 873883.

    8 Gray JA. Integrating schizophrenia. Schizophr Bull 1998; 24:249266.

    9 Crow TJ. Catecholamine reward pathways and schizophrenia: themechanism of the antipsychotic effect and the site of the primarydisturbance. Fed Proc 1979; 38: 24622467.

    10 Miller R. Schizophrenic psychology, associative learning and therole of forebrain dopamine. Med Hypotheses 1976; 2: 203211.

    11 Gray JA, Feldon J, Rawlins JNP, Smith AD. The neuropsychologyof schizophrenia. Behav Brain Sci 1991; 14: 119.

    12 Schultz W, Dayan P, Montague PR. A neural substrate of predictionand reward. Science 1997; 275: 15931599.

    13 Schultz W, Dickinson A. Neuronal coding of prediction errors.Annu Rev Neurosci 2000; 23: 473500.

    14 Crow TJ. Catecholamine-containing neurones and electrical self-stimulation. 2. A theoretical interpretation and some psychiatricimplications. Psychol Med 1973; 3: 6673.

    15 Berridge KC. The debate over dopamines role in reward: the casefor incentive salience. Psychopharmacology (Berl) 2007; 191:391431.

    16 Berridge KC, Robinson TE. What is the role of dopamine inreward: hedonic impact, reward learning, or incentive salience?Brain Res Brain Res Rev 1998; 28: 309369.

    17 Robbins TW, Everitt BJ. A role for mesencephalic dopamine inactivation: commentary on Berridge (2006). Psychopharmacology(Berl) 2006.

    18 Bleuler E. Dementia Praecox or the Group of Schizophrenias.International University Press: New York, 1911/1950.

    19 Kapur S. Psychosis as a state of aberrant salience: a frameworklinking biology, phenomenology, and pharmacology in schizo-phrenia. Am J Psychiatry 2003; 160: 1323.

    20 Beninger RJ. The slow therapeutic action of antipsychotic drugs. Apossible mechanism involving the role of dopamine in incentivelearning. In: Simon P, Soubrie P, Widlocher D (eds). Selected

    Models of Anxiety, Depression and Psychosis. Basel: Karger, 1988,pp 3651.

    21 ODoherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ.Dissociable roles of ventral and dorsal striatum in instrumentalconditioning. Science 2004; 304: 452454.

    22 Sutton RS, Barto AG. Reinforcement Learning. MIT Press:Cambridge, MA, 1998.

    23 Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD.Dopamine-dependent prediction errors underpin reward-seekingbehaviour in humans. Nature 2006; 442: 10421045.

    24 Nelson HE. The National Adult Reading Test (NART). NFER-Nelson: Windsor, 1982.

    25 Woods SW. Chlorpromazine equivalent doses for the neweratypical antipsychotics. J Clin Psychiatry 2003; 64: 663667.

    26 Ventura J, Green MF, Shaner A, Lieberman RP. Trainingand quality assurance with the Brief Psychiatric Rating Scale:The Drift Buster. Int J Methods Psychiatric Res 1993; 3:221224.

    27 Cools R, Blackwell A, Clark L, Menzies L, Cox S, Robbins TW.Tryptophan depletion disrupts the motivational guidance of goal-directed behavior as a function of trait impulsivity. Neuropsycho-pharmacology 2005; 30: 13621373.

    28 Henson R. Analysis of fMRI timeseries: Linear Time-Invariantmodels, event-related fMRI and optimal experimental design. In:Frackowiak R, Friston K, Frith C, Dolan R, Price C (eds). HumanBrain Function. Elsevier: London, 2004, pp 793822.

    29 Genovese CR, Lazar NA, Nichols T. Thresholding of statisticalmaps in functional neuroimaging using the false discovery rate.Neuroimage 2002; 15: 870878.

    30 Aron AR, Shohamy D, Clark J, Myers C, Gluck MA, Poldrack RA.Human midbrain sensitivity to cognitive feedback and uncertaintyduring classification learning. J Neurophysiol 2004; 92:11441152.

    31 Rorden C, Brett M. Stereotaxic display of brain lesions. BehavNeurol 2000; 12: 191200.

    32 Martinez D, Slifstein M, Broft A, Mawlawi O, Hwang DR, Huang Yet al. Imaging human mesolimbic dopamine transmission withpositron emission tomography. Part II: amphetamine-induceddopamine release in the functional subdivisions of the striatum.J Cereb Blood Flow Metab 2003; 23: 285300.

    33 Fine C, Gardner M, Craigie J, Gold I. Hopping, skipping or jumpingto conclusions? Clarifying the role of the JTC bias in delusions.Cogn Neuropsychiatry 2007; 12: 4677.

    34 Breiter HC, Aharon I, Kahneman D, Dale A, Shizgal P. Functionalimaging of neural responses to expectancy and experience ofmonetary gains and losses. Neuron 2001; 30: 619639.

    35 ODoherty JP, Deichmann R, Critchley HD, Dolan RJ. Neuralresponses during anticipation of a primary taste reward. Neuron2002; 33: 815826.

    36 Knutson B, Bjork JM, Fong GW, Hommer D, Mattay VS, WeinbergerDR. Amphetamine modulates human incentive processing.Neuron 2004; 43: 261269.

    37 Breiter HC, Gollub RL, Weisskoff RM, Kennedy DN, Makris N,Berke JD et al. Acute effects of cocaine on human brain activityand emotion. Neuron 1997; 19: 591611.

    38 Robbins TW, Everitt BJ. Neurobehavioural mechanisms of rewardand motivation. Curr Opin Neurobiol 1996; 6: 228236.

    39 Juckel G, Schlagenhauf F, Koslowski M, Filonov D, Wustenberg T,Villringer A et al. Dysfunction of ventral striatal reward predictionin schizophrenic patients treated with typical, not atypical,neuroleptics. Psychopharmacology (Berl) 2006; 187: 222228.

    40 McKenna PJ. Pathology, phenomenology and the dopaminehypothesis of schizophrenia. Br J Psychiatry 1987; 151: 288301.

    41 Beninger RJ. The role of dopamine in locomotor activity andlearning. Brain Res 1983; 287: 173196.

    42 Robbins TW. Relationship between reward-enhancing and stereo-typical effects of psychomotor stimulant drugs. Nature 1976; 264:5759.

    43 Robbins TW. Cognitive deficits in schizophrenia and parkinsonsdisease: neural basis and the role of dopamine. In: Willner P,Scheel-Kruger J (eds). The Mesolimbic Dopamine System: FromMotivation to Action. Wiley: Chichester, UK, 1991, pp 497528.

    44 Juckel G, Schlagenhauf F, Koslowski M, Wustenberg T,Villringer A, Knutson B et al. Dysfunction of ventral striatal

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    9

    Molecular Psychiatry

  • reward prediction in schizophrenia. Neuroimage 2006; 29:409416.

    45 Jones SH, Gray JA, Hemsley DR. Loss of the Kamin blocking effectin acute but not chronic schizophrenics. Biol Psychiatry 1992; 32:739755.

    46 Corlett PR, Honey GD, Aitken MR, Dickinson A, Shanks DR,Absalom AR et al. Frontal responses during learning predictvulnerability to the psychotogenic effects of ketamine: linkingcognition, brain activity, and psychosis. Arch Gen Psychiatry2006; 63: 611621.

    Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)

    Midbrain reward prediction error disruption in psychosisGK Murray et al

    10

    Molecular Psychiatry

    Substantia nigrasolventral tegmental reward prediction error disruption in psychosisIntroductionMaterials and methodsSubjectsReward learning taskBehavioural analysisRating scalesComputational modelfMRI Data Acquisition and analysis

    ResultsBehavioural resultsImaging resultsEntire sampleControl subjectsPsychosis patientsCase-control comparisonCase-control comparison with patients on antipsychotic medication excludedCase-control comparison with patients on antipsychotic medication only

    DiscussionFigure 1 Experimental task.Figure 2 Behavioural results.Figure 3 fMRI results for analysis in entire sample.Figure 4 Group differences.Table 1 Prediction error on reward trials contrasted with prediction error on neutral trials in entire sampleTable 2 Control group resultsTable 3 Group differencesAcknowledgmentsReferences