Surprise Signals: Optogenetics Links Dopamine to Prediction Errors
10 June 2013. Rewards can be powerful teachers, but it is actually the surprising moments, when an anticipated reward doesn’t come or a reward arrives unexpectedly, that drive the brain to learn. Dopamine is thought to signal discrepancies between what is expected and what actually happens, called prediction errors, and a study published online May 26 in Nature Neuroscience bolsters this idea in transgenic rats using optogenetics. Led by Patricia Janak at the University of California, San Francisco, the study finds that optogenetically stimulating dopamine neurons at certain times could alter learning of an association between a cue and a reward. The results highlight a role for precisely timed dopamine release in prediction errors, which have been proposed to fuel some of schizophrenia’s symptoms.
Anomalies in the brain’s reward system turn up in schizophrenia, including signs of disrupted prediction error coding (Murray et al., 2008). The overactive dopamine system characteristic of schizophrenia (see SRF Hypothesis) could muck with the precise timing needed to correctly mark prediction errors. The “aberrant salience” hypothesis proposes that wayward dopamine signals mistakenly flag innocuous events as important, providing grist for delusions and hallucinations (Kapur, 2003).
Though the idea that dopamine signals encode prediction errors has been around for some time, the evidence has been correlative, based on recordings of dopamine neurons while animals learn to associate a stimulus with a reward (Schultz, 1998). For example, when an animal learns to associate a sound with a reward, dopamine neurons fire in the first trials because the reward arrives unexpectedly; conversely, if a reward doesn’t come when the animal expects one, dopamine neurons pause their firing. When a reward arrives as expected, dopamine neurons don’t seem to notice, maintaining their background activity.
The new study goes beyond these correlations by directly manipulating dopamine neuron firing with the precise timing afforded by optogenetics, to make the case that their activity encodes prediction errors. The researchers deliberately set up learning situations in which an optogenetically introduced dopamine signal changes a naturally occurring prediction error.
First authors Elizabeth Steinberg and Ronald Keiflin studied transgenic rats expressing Cre recombinase only in dopamine-making neurons. A Cre-dependent viral vector was used to deliver the light-sensitive channelrhodopsin-2 (ChR2) to the ventral tegmental area (VTA), home to many of the brain’s dopamine-containing neurons. An optical fiber was then positioned within the VTA on one side of the brain, allowing unilateral stimulation of dopamine neurons while the rats learned.
The researchers explored the influence of dopamine signals in two kinds of reward learning: blocking and extinction. In the blocking paradigm, a cue that is already associated with a reward can block later learning of a new association between a second cue delivered at the same time as the first cue and the reward. The failure to learn that the second, redundant cue promises a reward may be due to a lack of a prediction error signal—because of the previously learned association with the first cue. If so, introducing a dopamine signal during the reward should unblock learning about the second cue.
Rats were initially trained to obtain a drink of sugar water when they heard a sound. Then, the sound was repeatedly presented together with a light before reward delivery. In some rats, reward was paired with a one-second-long train of phasic stimulation to the VTA, whereas in others the VTA stimulation occurred outside of the time of reward delivery. Afterwards, when tested with the light alone, control rats, which either did not get dopamine neuron stimulation (because they did not carry Cre) or received VTA stimulation after reward delivery, spent little time at the sugar water port, indicating the block on learning. But rats receiving VTA stimulation paired with reward during the sound-plus-light trials did learn that light also predicted reward, as reflected in the greater amount of time they spent at the sugar water port when the light appeared on its own.
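The logic of the blocking experiment can be illustrated with a toy simulation. The sketch below is not the authors' analysis: it assumes the textbook Rescorla-Wagner learning rule (Rescorla and Wagner, 1972), and it models VTA stimulation as a fixed amount added to the prediction error, which is a simplifying assumption. The function names, learning rate, trial counts, and the size of the injected error are all hypothetical.

```python
def train(weights, cues, reward, n_trials, alpha=0.2, injected_error=0.0):
    """Apply Rescorla-Wagner updates to the cues present on each trial.

    injected_error is a hypothetical stand-in for optogenetic stimulation
    delivered at the time of reward; it is an assumption of this sketch.
    """
    for _ in range(n_trials):
        prediction = sum(weights[c] for c in cues)
        error = (reward - prediction) + injected_error
        for c in cues:
            weights[c] += alpha * error
    return weights


def blocking_experiment(injected_error):
    """Return the associative strength of the light after both phases."""
    weights = {"sound": 0.0, "light": 0.0}
    # Phase 1: the sound alone predicts the reward.
    train(weights, ["sound"], reward=1.0, n_trials=50)
    # Phase 2: sound and light are presented together, same reward.
    train(weights, ["sound", "light"], reward=1.0, n_trials=20,
          injected_error=injected_error)
    return weights["light"]


control = blocking_experiment(injected_error=0.0)
stimulated = blocking_experiment(injected_error=0.5)
print(f"light strength, control:    {control:.4f}")
print(f"light strength, stimulated: {stimulated:.4f}")
```

In this sketch the control run leaves the light with almost no associative strength, because the sound already predicts the reward and the error term is near zero. Adding an artificial error during the compound trials lets the light acquire strength, which mirrors the unblocking result in spirit only.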
Staving off extinction
The researchers also tried an extinction paradigm, in which a learned association between a cue and a reward is undone by withholding the reward. Dopamine neurons pause their firing when an expected reward doesn’t come, but firing resumes as an animal learns not to expect a reward and stops its reward-seeking response to the cue. But what if a dopamine signal filled in this pause? If the signal encodes a prediction error, this would indicate that the outcome—the lack of a reward—was expected or better than expected, and preserve reward-seeking behavior.
Indeed, the researchers found that VTA stimulation during extinction trials could stall the typical reduction in reward seeking. Rats initially learned to associate a sound with a sugar water reward. Then, the next trials consisted of the sound preceding delivery of only water (a downgrade from sugar), which was paired with VTA stimulation. This VTA stimulation during water reward slowed the extinction process—when hearing the cue sound, the VTA-stimulated rats continued to seek their reward, going to the reward spigot more quickly than control rats did and spending more time around the reward spigot once they were there. This also happened when there was no reward at all—not even water. The rats did eventually tone down their reward seeking, which suggests that the unilateral VTA stimulation did not fully counteract the naturally arising prediction error.
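A similar toy model shows why filling in the dopamine pause would be expected to slow extinction. As a hedged sketch under the same Rescorla-Wagner assumption, the code below tracks the learned value of a single cue when the reward is withheld, with and without a hypothetical positive term added to the prediction error. The parameter values are illustrative and are not taken from the study.

```python
def extinction_curve(n_trials, alpha=0.2, injected_error=0.0, start=1.0):
    """Return the cue's learned value after each unrewarded trial.

    injected_error is a hypothetical stand-in for VTA stimulation at the
    time the reward would have arrived; it is an assumption of this sketch.
    """
    value = start
    curve = []
    for _ in range(n_trials):
        # The reward is omitted, so the natural error is negative.
        error = (0.0 - value) + injected_error
        value += alpha * error
        curve.append(value)
    return curve


control = extinction_curve(10)
stimulated = extinction_curve(10, injected_error=0.5)
for trial, (c, s) in enumerate(zip(control, stimulated), start=1):
    print(f"trial {trial:2d}: control {c:.3f}  stimulated {s:.3f}")
```

In this sketch the stimulated value still declines, but more slowly and toward a floor set by the injected term, because the artificial signal only partly offsets the negative error. That floor is a property of the model and should not be read as a claim about the rats' behavior.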
Further experiments ruled out the possibilities that dopamine neuron stimulation worked by enhancing the value of a reward or inducing a conditioned place preference for the reward spigot. Though other dopaminergic circuitry within the VTA could still contribute to the representation of reward value (see SRF conference story)—something found to be amiss in schizophrenia (see SRF related news story)—the new findings argue that at least some VTA neurons are integral to prediction error. Although the malleability of these signals makes them challenging to work with, they may be linchpins of schizophrenia's diverse symptoms.—Michele Solis.
Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, Janak PH. A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci. 2013 May 26.
Comments on News and Primary Papers
Primary Papers: A causal link between prediction errors, dopamine neurons and learning.
Comment by: Phil Corlett
Submitted 8 July 2013
Posted 8 July 2013
Researchers across our field (even those relatively less interested in the brain) are deeply concerned with causality—from those geneticists or epidemiologists assessing the relationships between genes or cannabis exposure and illness onset to those phenomenologists concerned with how patients describe their thoughts and actions as lacking causal agency. For the most part, all of our observations are correlational. Anything more causal, with a few exceptions (Corlett et al., 2009), would entail ethical concerns. Causality is particularly problematic for those of us concerned with the neuronal mechanisms of symptom generation. Are the neural signals we observe with functional neuroimaging of patients with psychotic symptoms, for example, a cause of those symptoms or a consequence of having distressing and distracting experiences in the scanner?
In a 1979 issue of Scientific American, Francis Crick (of DNA fame) wished for a method to gain control over some neurons whilst "leaving the others more or less unaltered" (Crick, 1979). If we can recreate a pattern of firing that is thought to be necessary and sufficient for a particular cognitive process, we can be more certain that that particular signal is causing that process.
One increasingly influential model of delusion formation—the aberrant prediction error account (Corlett et al., 2007; Gray et al., 1991; Hemsley, 1994)—recently received a preclinical boost in this regard. Using optogenetic techniques (see below), Steinberg, Janak, and colleagues demonstrated that a specific pattern of neural firing, prediction error, at a particular time (coincident with irrelevant cues), was causally related to behavioral learning (about those irrelevant, predictively redundant cues) (Steinberg et al., 2013). This is exactly the pattern of neural signaling and behavior that portends delusions, according to the theory (Corlett et al., 2007; Gray et al., 1991; Hemsley, 1994). Prediction errors and delusions have been associated previously with correlative studies (Corlett et al., 2007; Romaniuk et al., 2010; Schlagenhauf et al., 2009). These new data show that, if one engenders prediction errors artificially, aberrant learning results.
Steinberg, Janak, and their colleagues achieved this feat by first identifying a cell population of interest—dopamine cells in the midbrain previously implicated in the pathophysiology of schizophrenia (Carlsson and Carlsson, 1990). Next, they chose a firing pattern—phasic firing—which has been correlated with Pavlovian and instrumental learning in now classical, albeit correlative, work from Wolfram Schultz (Schultz and Dickinson, 2000; Tobler et al., 2006; Waelti et al., 2001). Finally, they homed in on an informative behavioral paradigm—blocking.
Blocking calls into question Hebb’s adage that in learning, things that fire together wire together (Hebb, 1949). We do, of course, learn from contiguity and statistical regularity, but blocking teaches us that there is more to learning than mere correlation (McLaren and Dickinson, 1990). In a blocking paradigm, we first train a cue (such as a light) as a predictor of a salient outcome (such as a food reward). Next, we add a novel cue (such as a tone) to the first cue. This compound of cues (tone and light together) also predicts the outcome. Prior learning about the first cue blocks learning about the second cue (Kamin, 1969). That is, despite being contiguous with the outcome, an association between the second, blocked cue and the outcome is not learned. This is because the outcome was already predicted.
Blocking led to the invention of learning theories that focus on prediction error—the mismatch between what we expect (based on what we have already learned) and what we experience (Rescorla and Wagner, 1972). We learn most when prediction errors are largest, and we don’t learn when there is no prediction error, as in the blocking case. Prediction errors have been implicated in the formation of causal beliefs—associations between cause and effect (Dickinson, 2001; Fletcher and Henson, 2001)—in social inferences, such as attributions of worker productivity (Cramer et al., 2002) and in the formation of trusting relationships (Behrens et al., 2008). Furthermore, aberrant prediction errors—signaled independent of cue and context—have been associated with delusion formation in psychotic illnesses (Corlett et al., 2007; Romaniuk et al., 2010; Schlagenhauf et al., 2009) and model psychoses such as amphetamine and ketamine administration in human subjects (Bernacer et al., 2013; Corlett et al., 2006).
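For readers who want the formalism, the prediction error idea is commonly written as the Rescorla-Wagner update rule. The notation below is the conventional textbook form and is not taken from this comment: V_i is the associative strength of cue i, lambda is the maximum strength the outcome supports, and alpha_i and beta are learning-rate parameters for the cue and the outcome.

```latex
\Delta V_i \;=\; \alpha_i \, \beta \left( \lambda - \sum_{j \,\in\, \text{cues present}} V_j \right)
```

In blocking, the first cue has already reached a strength close to lambda, so the term in parentheses is close to zero on compound trials and the added cue gains almost nothing.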
However, all of these important observations lack a definitive demonstration of causal association between prediction error, learning, belief, reputation, trust, or delusion. Janak, Steinberg, and their colleagues provide some of the first evidence for a causal association among prediction error, dopamine firing, and learning.
They made an elegant genetic manipulation in rats. First, they targeted dopamine neurons specifically by driving Cre recombinase expression with a tyrosine hydroxylase promoter. Tyrosine hydroxylase is an enzyme that is crucial in the production of dopamine and is present in dopamine cells. Cre recombinase subsequently triggered the expression of an ion channel that is sensitive to light—when this channel is stimulated with light of a particular wavelength, the channel opens and the cells expressing the channel depolarize and fire action potentials. The genetic technique colocalizes the channel and the dopamine. The targeted application of light (hence, optogenetic) ensures specificity to cells that have previously shown prediction error-like signaling. Steinberg et al. applied the light following blocking trials (when there should be no prediction error—confirmed with neural recordings in monkeys and with fMRI in humans; see Tobler et al., 2006; Waelti et al., 2001). The causal proof comes if the rats learned about the blocked cue. They did. They formed a predictive association between the blocked cue and the reward.
Intriguingly, this is exactly what we think happens in patients with psychotic illness. Their prediction error signaling neurons are inappropriately activated, driving them to attend to and learn about things they ought not to, things that non-psychotic people would ignore. This is the kernel for delusion formation. Paul Fletcher and I recently showed that in non-psychotic individuals with odd beliefs (such as telekinesis or alien abduction), there are prediction error brain signals during blocking, and the magnitude of these signals correlates with weaker blocking and with the severity of odd beliefs (Corlett and Fletcher, 2012).
Whilst it is heartening to have our theories supported by preclinical data, what might be the clinical application of this new result? We cannot make optogenetic manipulations in human subjects. However, we can make manipulations of cortical neural activity using transcranial magnetic stimulation. Such studies are underway in my lab at Yale. We are using stimulation protocols that engender neural inhibition and long-term depression (Hoffman and Cavus, 2002) in an attempt to cancel aberrant prediction errors in psychotic patients. The data of Steinberg et al. suggest that this approach might ultimately help us to control prediction error signaling and curtail psychotic symptoms.
Corlett PR, Frith CD, Fletcher PC. From drugs to deprivation: a Bayesian framework for understanding models of psychosis. Psychopharmacology (Berl). 2009 Nov; 206(4):515-30.
Crick FH. Thinking about the brain. Sci Am. 1979 Sep; 241(3):219-32.
Corlett PR, Honey GD, Fletcher PC. From prediction error to psychosis: ketamine as a pharmacological model of delusions. J Psychopharmacol. 2007 May; 21(3):238-52.
Gray JA, Feldon J, Rawlins JNP, Hemsley D, Smith AD. The neuropsychology of schizophrenia. Behav Brain Sci. 1991; 14:1-84.
Hemsley DR. In: David AS, Cutting JC, eds. The Neuropsychology of Schizophrenia. Lawrence Erlbaum Associates; 1994. p. 97-118.
Steinberg EE, Janak PH. Establishing causality for dopamine in neural function and behavior with optogenetics. Brain Res. 2013 May 20; 1511:46-64.
Corlett PR, Murray GK, Honey GD, Aitken MR, Shanks DR, Robbins TW, Bullmore ET, Dickinson A, Fletcher PC. Disrupted prediction-error signal in psychosis: evidence for an associative account of delusions. Brain. 2007 Sep; 130(Pt 9):2387-400.
Romaniuk L, Honey GD, King JR, Whalley HC, McIntosh AM, Levita L, Hughes M, Johnstone EC, Day M, Lawrie SM, Hall J. Midbrain activation during Pavlovian conditioning and delusional symptoms in schizophrenia. Arch Gen Psychiatry. 2010 Dec; 67(12):1246-54.
Schlagenhauf F, Sterzer P, Schmack K, Ballmaier M, Rapp M, Wrase J, Juckel G, Gallinat J, Heinz A. Reward feedback alterations in unmedicated schizophrenia patients: relevance for delusions. Biol Psychiatry. 2009 Jun 15; 65(12):1032-9.
Carlsson M, Carlsson A. Schizophrenia: a subcortical neurotransmitter imbalance syndrome? Schizophr Bull. 1990; 16(3):425-32.
Schultz W, Dickinson A. Neuronal coding of prediction errors. Annu Rev Neurosci. 2000; 23:473-500.
Tobler PN, O'Doherty JP, Dolan RJ, Schultz W. Human neural learning depends on reward prediction errors in the blocking paradigm. J Neurophysiol. 2006 Jan; 95(1):301-10.
Waelti P, Dickinson A, Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001 Jul 5; 412(6842):43-8.
Hebb DO. The Organization of Behavior. John Wiley; 1949.
McLaren IP, Dickinson A. The conditioning connection. Philos Trans R Soc Lond B Biol Sci. 1990 Aug 29; 329(1253):179-86.
Kamin L. In: Campbell BA, Church RM, eds. Punishment and Aversive Behavior. Appleton-Century-Crofts; 1969.
Rescorla RA, Wagner AR. In: Black AH, Prokasy WF, eds. Classical Conditioning II: Current Research and Theory. Appleton-Century-Crofts; 1972.
Dickinson A. The 28th Bartlett Memorial Lecture. Causal learning: an associative analysis. Q J Exp Psychol B. 2001 Feb; 54(1):3-25.
Fletcher PC, Henson RN. Frontal lobes and human memory: insights from functional neuroimaging. Brain. 2001 May; 124(Pt 5):849-81.
Cramer RE, Weiss RF, William R, Reid S, Nieri L, Manning-Ryan B. Human agency and associative learning: Pavlovian principles govern social process in causal relationship detection. Q J Exp Psychol B. 2002 Jul; 55(3):241-66.
Behrens TE, Hunt LT, Woolrich MW, Rushworth MF. Associative learning of social value. Nature. 2008 Nov 13; 456(7219):245-9.
Bernacer J, Corlett PR, Ramachandra P, McFarlane B, Turner DC, Clark L, Robbins TW, Fletcher PC, Murray GK. Methamphetamine-Induced Disruption of Frontostriatal Reward Learning Signals: Relation to Psychotic Symptoms. Am J Psychiatry. 2013 Jun 4.
Corlett PR, Honey GD, Aitken MR, Dickinson A, Shanks DR, Absalom AR, Lee M, Pomarol-Clotet E, Murray GK, McKenna PJ, Robbins TW, Bullmore ET, Fletcher PC. Frontal responses during learning predict vulnerability to the psychotogenic effects of ketamine: linking cognition, brain activity, and psychosis. Arch Gen Psychiatry. 2006 Jun; 63(6):611-21.
Corlett PR, Fletcher PC. The neurobiology of schizotypy: fronto-striatal prediction error signal correlates with delusion-like beliefs in healthy people. Neuropsychologia. 2012 Dec; 50(14):3612-20.
Hoffman RE, Cavus I. Slow transcranial magnetic stimulation, long-term depotentiation, and brain hyperexcitability disorders. Am J Psychiatry. 2002 Jul; 159(7):1093-102.
Primary Papers: A causal link between prediction errors, dopamine neurons and learning.
Comment by: Anna Ermakova
Submitted 22 July 2013
Posted 22 July 2013
Highly replicated correlational studies, beginning with electrophysiological recordings in primates and rodents and followed up with similar studies using PET and MRI in humans, established a strong correlation between dopamine neuronal firing and learning about rewards. This process appeared to be driven by the mismatch between expected and actual outcome, called prediction error. Steinberg et al., in their recent article, take a crucial next step in studying the interactions among prediction errors, dopamine, and reinforcement learning: They demonstrate a causal link between phasic dopamine prediction error signaling in the midbrain and learning stimulus-reward associations. In their elegant experiments they used two classical learning paradigms: associative blocking and extinction. They mimicked prediction error signaling by inducing precisely timed dopamine firing with optogenetics to slow down extinction and to drive learning. This is an important first step for moving away from correlational studies to direct manipulation, and I am sure it will be followed by many others to advance understanding of the precise neural mechanisms of reinforcement learning in health and disease.
Comments on Related News
Related News: Deconstructing Negative Symptoms in Schizophrenia
Comment by: Laurie Kimmel
Submitted 25 October 2012
Posted 26 October 2012
As a clinician, I find this research encouraging.