10 June 2013. Rewards can be powerful teachers, but it’s actually the surprising moments—when an anticipated reward doesn’t come or arrives unexpectedly—that drive the brain to learn. Dopamine is thought to signal discrepancies between the expected and what actually happens, called prediction errors, and a study published online May 26 in Nature Neuroscience bolsters this idea in transgenic rats using optogenetics. Led by Patricia Janak at the University of California, San Francisco, the study finds that optogenetically stimulating dopamine neurons at certain times could alter learning of an association between a cue and a reward. The results highlight a role for precisely timed dopamine release in prediction errors, which have been proposed to fuel some of schizophrenia’s symptoms.
Anomalies in the brain’s reward system turn up in schizophrenia, including signs of disrupted prediction error coding (Murray et al., 2008). The overactive dopamine system characteristic of schizophrenia (see SRF Hypothesis) could muck with the precise timing needed to correctly mark prediction errors. The “aberrant salience” hypothesis proposes that wayward dopamine signals mistakenly flag innocuous events as important, providing grist for delusions and hallucinations (Kapur, 2003).
Though the idea that dopamine signals encode prediction errors has been around for some time, the evidence has been correlative, based on recordings of dopamine neurons while animals learn to associate a stimulus with a reward (Schultz, 1998). For example, when an animal learns to associate a sound with a reward, dopamine neurons fire in the first trials because the reward arrives unexpectedly; conversely, if a reward doesn’t come when the animal expects one, dopamine neurons pause their firing. When a reward arrives as expected, dopamine neurons don’t seem to notice, maintaining their background activity.
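The firing pattern described above is what the classic Rescorla-Wagner learning rule predicts, in which learning is driven by the error between the reward received and the reward expected. The sketch below is an illustrative toy model, not the study's own analysis; the learning rate `alpha` and reward value are arbitrary choices.

```python
# Minimal Rescorla-Wagner sketch: the prediction error
# delta = reward - expectation drives learning, mirroring the dopamine
# firing patterns described above.

def train(n_trials, reward, alpha=0.3):
    """Return per-trial prediction errors as a cue is paired with `reward`."""
    v = 0.0                     # learned reward expectation for the cue
    errors = []
    for _ in range(n_trials):
        delta = reward - v      # prediction error on this trial
        v += alpha * delta      # update the expectation
        errors.append(delta)
    return errors, v

errors, v = train(n_trials=30, reward=1.0)
# Early trials: large positive error (unexpected reward -> dopamine burst).
# Late trials: error near zero (expected reward -> baseline firing).
# If the reward were then omitted, delta = 0 - v would be negative,
# corresponding to the pause in dopamine firing.
```

The three firing regimes in the paragraph above fall out of the single quantity `delta`: positive when the reward is a surprise, near zero once it is expected, and negative when an expected reward is withheld.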
The new study goes beyond these correlations by directly manipulating dopamine neuron firing with the precise timing afforded by optogenetics, to make the case that their activity encodes prediction errors. The researchers deliberately set up learning situations in which an optogenetically introduced dopamine signal changes a naturally occurring prediction error.
First authors Elizabeth Steinberg and Ronald Keiflin studied transgenic rats expressing Cre recombinase only in dopamine-making neurons. A Cre-dependent viral vector was used to deliver the light-sensitive channelrhodopsin-2 (ChR2) to the ventral tegmental area (VTA), home to many of the brain’s dopamine-containing neurons. An optical fiber was then positioned within the VTA on one side of the brain, allowing unilateral stimulation of dopamine neurons while the rats learned.
The researchers explored the influence of dopamine signals in two kinds of reward learning: blocking and extinction. In the blocking paradigm, a cue already associated with a reward can block later learning about a second cue delivered at the same time: the animal fails to learn that this second, redundant cue also predicts the reward. This failure may stem from the absence of a prediction error signal, because the first cue already makes the reward fully expected. If so, introducing a dopamine signal at the time of the reward should unblock learning about the second cue.
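Blocking follows naturally if the prediction error is shared across all cues present. The toy simulation below, again using the Rescorla-Wagner rule rather than anything from the paper, adds a hypothetical `stim` term to crudely mimic an extra dopamine signal at reward time; its value and the learning parameters are assumptions for illustration.

```python
# Toy blocking simulation: the prediction error is computed against the
# combined expectation from all cues present (delta = reward - sum of V).
# `stim` adds an artificial positive error at reward time, loosely mimicking
# the study's optogenetic dopamine stimulation (an assumption, not their model).

def blocking(n_trials=50, alpha=0.2, stim=0.0):
    v_sound, v_light = 0.0, 0.0
    # Stage 1: sound alone predicts the reward.
    for _ in range(n_trials):
        delta = 1.0 - v_sound
        v_sound += alpha * delta
    # Stage 2: sound + light together precede the same reward.
    for _ in range(n_trials):
        delta = 1.0 - (v_sound + v_light) + stim
        v_sound += alpha * delta
        v_light += alpha * delta
    return v_light           # how much the light came to predict reward

print(blocking(stim=0.0))    # near zero: learning about the light is blocked
print(blocking(stim=0.5))    # clearly positive: added error unblocks learning
```

With `stim=0.0`, the sound already accounts for the reward, the error is near zero, and the light learns nothing; injecting an artificial error restores learning about the light, which is the logic behind the unblocking experiment.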
Rats were initially trained to obtain a drink of sugar water when they heard a sound. Then the sound was repeatedly presented together with a light before reward delivery. In some rats, the reward was paired with a one-second train of phasic stimulation of the VTA; in others, VTA stimulation occurred outside the time of reward delivery. Afterwards, when tested with the light alone, control rats that either did not receive dopamine neuron stimulation (because they did not carry Cre) or received VTA stimulation outside the time of reward delivery spent little time at the sugar water port, indicating the block on learning. But rats that received VTA stimulation paired with the reward during the sound-plus-light trials did learn that the light also predicted reward, as reflected in the greater amount of time they spent at the sugar water port when the light appeared on its own.
Staving off extinction
The researchers also tried an extinction paradigm, in which a learned association between a cue and a reward is undone by withholding the reward. Dopamine neurons pause their firing when an expected reward doesn't come, but firing resumes as the animal learns not to expect a reward and stops its reward-seeking response to the cue. But what if a dopamine signal filled in this pause? If the signal encodes a prediction error, it would tell the animal that the outcome, even without a reward, was as good as or better than expected, preserving reward-seeking behavior.
Indeed, the researchers found that VTA stimulation during extinction trials could stall the typical decline in reward seeking. Rats first learned to associate a sound with a sugar water reward. In subsequent trials, the sound preceded delivery of plain water (a downgrade from sugar), paired with VTA stimulation. This stimulation during the water reward slowed extinction: on hearing the cue sound, the VTA-stimulated rats continued to seek their reward, reaching the reward spigot more quickly than control rats and lingering there longer. The same held when no reward at all was delivered, not even water. The rats did eventually scale back their reward seeking, which suggests that the unilateral VTA stimulation did not fully counteract the naturally arising prediction error.
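The extinction logic can be sketched with the same illustrative learning rule: withholding the reward produces a negative prediction error that erodes the learned expectation, and a hypothetical dopamine-like `stim` term at the omitted reward time fills in that dip. As before, this is a toy assumption for illustration, not the study's quantitative model.

```python
# Toy extinction simulation under the Rescorla-Wagner rule. `stim` adds an
# artificial positive signal at the time the reward would have arrived,
# loosely mimicking optogenetic VTA stimulation (an illustrative assumption).

def extinguish(n_trials=30, alpha=0.2, stim=0.0):
    v = 1.0                      # fully learned expectation from prior training
    for _ in range(n_trials):
        delta = 0.0 - v + stim   # no reward delivered; stim fills the dip
        v += alpha * delta
    return v                     # remaining reward expectation

print(extinguish(stim=0.0))      # decays toward zero: normal extinction
print(extinguish(stim=0.6))      # settles near the stim level: extinction stalled
```

Because the unilateral stimulation in the study only partially offset the natural negative error, it corresponds to a `stim` smaller than the original reward value: reward seeking declines more slowly but is not preserved indefinitely, consistent with the behavior described above.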
Further experiments ruled out the possibilities that dopamine neuron stimulation worked by enhancing the value of a reward or inducing a conditioned place preference for the reward spigot. Though other dopaminergic circuitry within the VTA could still contribute to the representation of reward value (see SRF conference story)—something found to be amiss in schizophrenia (see SRF related news story)—the new findings argue that at least some VTA neurons are integral to prediction error. Although the malleability of these signals makes them challenging to work with, they may be linchpins of schizophrenia's diverse symptoms.—Michele Solis.
Steinberg EE, Keiflin R, Boivin JR, Witten IB, Deisseroth K, Janak PH. A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci. 2013 May 26. Abstract