Metaplasticity Can Adjust Learning According To Reward Uncertainty
The brain can self-adjust its learning rates using a synaptic mechanism called metaplasticity, a Dartmouth-led study reveals. Each time we get feedback, the brain is hard at work updating its knowledge and behavior in response to changes in the environment; yet, if there’s uncertainty or volatility in the environment, the entire process must be adjusted.
These findings controvert the theory that the brain always behaves optimally. How the brain adjusts learning has long been thought to be driven either by the brain’s reward system, with its goal of optimizing rewards obtained from the environment, or by a more cognitive system responsible for learning the structure of the environment.
Synapses are the connections between neurons in the brain and are responsible for transferring information from one neuron to the next. When it comes to choice in evaluating potential rewards, your learned value of a particular option, reflecting how much you like something, is stored in certain synapses.
If you get positive feedback after choosing a particular option, the brain increases the value of that option by making the associated synapses stronger. In contrast, if the feedback is negative, those synapses become weaker.
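As a rough illustration of this kind of feedback-driven update, the sketch below treats the stored value of an option as a single number between 0 and 1 that moves toward 1 after reward and toward 0 otherwise. The function name, the starting value of 0.5, and the learning rate of 0.1 are illustrative assumptions, not details from the study.

```python
def update_value(value, rewarded, learning_rate=0.1):
    """Nudge a stored value toward 1 after reward and toward 0 after no reward.

    The learning rate sets how far one piece of feedback moves the value.
    All names and numbers here are hypothetical, chosen for illustration.
    """
    target = 1.0 if rewarded else 0.0
    return value + learning_rate * (target - value)


value = 0.5
after_reward = update_value(value, rewarded=True)      # value strengthened
after_no_reward = update_value(value, rewarded=False)  # value weakened
```

Here positive feedback raises the value from 0.5 to 0.55 and negative feedback lowers it to 0.45, a simplified analogue of synapses becoming stronger or weaker.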
Through a process called metaplasticity, however, synapses can also undergo modifications that do not change how they transmit information.
Previous studies have suggested that the brain relies on a dedicated system for monitoring the uncertainty in the environment to adjust its rate of learning. The authors of this study found, however, that metaplasticity alone is sufficient to fine-tune learning according to the uncertainty about reward in a given environment.
Alireza Soltani, assistant professor of psychological and brain sciences at Dartmouth, says:
“One of the most complex problems in learning is how to adjust to uncertainty and the rapid changes that take place in the environment. It is very exciting to find that synapses, the simplest computational elements in the brain, can provide a robust solution for such challenges. Of course, such simple elements may not provide an optimal solution but we found that a model based on metaplasticity can explain real behaviors better than models that are based on optimality.”
To understand the neural mechanisms for adjusting learning, and more specifically how learning and choice are affected by reward uncertainty and volatility in an environment, the researchers created a model based on metaplasticity. They tested this model against a behavioral dataset from a recent Yale University study of non-human primates in which the probabilities of obtaining a reward were switched to create environments with different levels of volatility.
Learning Rate Tradeoffs
When things change frequently, a large learning rate is required, but this reduces precision, whereas a stable environment requires a small learning rate, which improves precision. The study illustrates how metaplasticity can mitigate the tradeoff between adaptability and precision in learning.
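The tradeoff can be seen in a small simulation with a fixed learning rate. The sketch below is an assumption-laden toy, not the study's model or task: the reward probabilities (0.8, then 0.2), the trial counts, the two learning rates, and the 0.5 threshold for "adapted" are all chosen only for illustration.

```python
import random


def track(outcomes, learning_rate, start=0.5):
    """Return the running value estimate after each outcome (1 = reward, 0 = none)."""
    estimates, value = [], start
    for outcome in outcomes:
        value += learning_rate * (outcome - value)
        estimates.append(value)
    return estimates


rng = random.Random(0)
# Hypothetical environment: 200 trials at reward probability 0.8, then 200 at 0.2.
outcomes = [int(rng.random() < 0.8) for _ in range(200)]
outcomes += [int(rng.random() < 0.2) for _ in range(200)]


def summarize(learning_rate):
    estimates = track(outcomes, learning_rate)
    # Precision: mean absolute error from the true probability while it is stable.
    stable = estimates[100:200]
    error = sum(abs(v - 0.8) for v in stable) / len(stable)
    # Adaptability: trials after the switch until the estimate drops below 0.5.
    trials_to_adapt = 1 + next(i for i, v in enumerate(estimates[200:]) if v < 0.5)
    return error, trials_to_adapt


fast_error, fast_trials = summarize(0.5)   # large learning rate
slow_error, slow_trials = summarize(0.05)  # small learning rate
print(f"large rate: error {fast_error:.3f}, adapts in {fast_trials} trials")
print(f"small rate: error {slow_error:.3f}, adapts in {slow_trials} trials")
```

The large learning rate tracks the switch within a few trials but gives a noisier estimate during the stable period, while the small learning rate is more precise but slower to adapt.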
The metaplasticity model also demonstrates how the learning rate might be different for each choice or option. If a particular choice continues to give reward for a while, the learning rate on that option becomes larger for rewarding outcomes and smaller for non-rewarding outcomes.
In other words, if the environment does not change, the synapses needed for changing the preferences become less sensitive to feedback in the opposite direction. The model also predicts that different options or actions could maintain their own learning rates.
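One way to picture option-specific, self-adjusting learning rates is the toy sketch below. It is not the authors' synaptic model: the class name, the separate "up" and "down" rates, their bounds, and the `meta_step` parameter are hypothetical stand-ins that only mimic the qualitative behavior described above.

```python
MIN_RATE, MAX_RATE = 0.05, 0.5  # hypothetical bounds on the learning rates


class MetaplasticOption:
    """Toy option whose learning rates drift with its own reward history."""

    def __init__(self, value=0.5, rate_up=0.2, rate_down=0.2, meta_step=0.1):
        self.value = value          # learned value of this option
        self.rate_up = rate_up      # learning rate applied after reward
        self.rate_down = rate_down  # learning rate applied after no reward
        self.meta_step = meta_step  # how quickly the rates themselves change

    def update(self, rewarded):
        if rewarded:
            self.value += self.rate_up * (1.0 - self.value)
            # Repeated reward: more sensitive to reward, less to its absence.
            self.rate_up += self.meta_step * (MAX_RATE - self.rate_up)
            self.rate_down += self.meta_step * (MIN_RATE - self.rate_down)
        else:
            self.value -= self.rate_down * self.value
            # Repeated non-reward: the opposite shift.
            self.rate_down += self.meta_step * (MAX_RATE - self.rate_down)
            self.rate_up += self.meta_step * (MIN_RATE - self.rate_up)


reliable = MetaplasticOption()
fresh = MetaplasticOption()
for _ in range(20):
    reliable.update(rewarded=True)  # a long run of reward on one option

fresh.value = reliable.value  # same stored value, but no reward history
reliable.update(rewarded=False)
fresh.update(rewarded=False)
print(reliable.value, fresh.value)
```

After a long run of reward, a single non-rewarded trial lowers the value of the `reliable` option less than it lowers the `fresh` one, and each option carries its own pair of rates.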
This study shows that learning can be self-adjusted and does not require explicit optimization or complete knowledge of the environment. The authors propose potential practical implications of their findings.
An inability of the brain to modify its behavior may in some cases be attributed to a slowing of plasticity caused by metaplasticity, which can occur in a highly stable environment. For behavioral anomalies such as addiction, where the synapses might not adapt flexibly, more carefully designed feedback may be required to make the system plastic again, illustrating how metaplasticity may have broader relevance.