Behavioral Training of Reward Learning Increases Reinforcement Learning Parameters and Decreases Depression Symptoms Across Repeated Sessions

Virginia Tech

Background: Disrupted reward learning has been suggested to contribute to the etiology and maintenance of depression. If deficits in reward learning are core to depression, we would expect that improving reward learning would decrease depression symptoms across time. Whereas previous studies have shown that reward learning can be changed within a single study session, effecting clinically meaningful change in learning requires this change to endure beyond task completion and transfer to real-world environments. With a longitudinal design, we investigate the potential for repeated sessions of behavioral training to create change in reward learning and decrease depression symptoms across time. Methods: 929 online participants (497 depression-present; 432 depression-absent) recruited from Amazon’s Mechanical Turk platform completed a behavioral training paradigm and clinical self-report measures for up to eight total study visits. Participants were randomly assigned to one of 12 arms of the behavioral training paradigm, in which they completed a probabilistic reward learning task interspersed with queries about a feature of the task environment (11 learning arms) or a control query (1 control arm). Learning queries trained participants on one of four computationally based learning targets known to affect reinforcement learning (probability, average outcome values, extreme outcome values, and value comparison processes). A reinforcement learning model previously shown to distinguish depression-related differences in learning was fit to behavioral responses using hierarchical Bayesian estimation to provide estimates of reward sensitivity and learning rate for each participant on each visit. Reward sensitivity captured how strongly participants distinguished high from low outcome values, while learning rate captured how much participants learned from previously experienced outcomes.
Mixed linear models assessed relationships between model-agnostic task performance, computational model-derived reinforcement learning parameters, depression symptoms, and study progression. Results: Across time, learning queries increased reward sensitivities in depression-absent participants (β = 0.036, p < 0.001, 95% CI (0.022, 0.049)). In contrast, control queries did not change reward sensitivities in depression-absent participants across time (β = 0.016, p = 0.303, 95% CI (-0.015, 0.048)). Learning rates did not change across time for participants receiving learning queries (β = 0.001, p = 0.418, 95% CI (-0.002, 0.004)) or control queries (β = 0.002, p = 0.558, 95% CI (-0.005, 0.009)). Of the learning queries, those targeting value comparison processes improved depression symptoms (β = -0.509, p = 0.015, 95% CI (-0.912, -0.106)) and increased reward sensitivities across time (β = 0.052, p < 0.001, 95% CI (0.030, 0.075)) in depression-present participants. Increased reward sensitivities were related to decreased depression symptoms across time in these participants (β = -2.905, p = 0.002, 95% CI (-4.75, -1.114)). Conclusions: Multiple sessions of targeted behavioral training improved reward learning for participants with a range of depression symptoms. Improved behavioral reward learning was associated with improved clinical symptoms over time, possibly because learning transferred to real-world scenarios. These results support disrupted reward learning as a mechanism contributing to the etiology and maintenance of depression and suggest the potential of repeated behavioral training to target deficits in reward learning.
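To illustrate how reward sensitivity and learning rate can enter a reinforcement learning model, the following is a minimal Python sketch. It is not the model used in this study, whose exact form is not given in the abstract. It assumes a common delta-rule formulation in which reward sensitivity scales the experienced outcome and learning rate scales the prediction error, with a softmax choice rule at a fixed temperature of 1. All function and parameter names are hypothetical, and the hierarchical Bayesian fitting step is omitted.

```python
import math


def update_value(q, reward, learning_rate, reward_sensitivity):
    """One delta-rule update (assumed form, not taken from the study).

    The outcome is scaled by reward_sensitivity, and the estimate q
    moves toward it by a fraction given by learning_rate.
    """
    prediction_error = reward_sensitivity * reward - q
    return q + learning_rate * prediction_error


def choice_probabilities(q_values):
    """Softmax over option values, temperature fixed at 1 (assumption)."""
    largest = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(q - largest) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_values(choices, rewards, learning_rate, reward_sensitivity,
                    n_options=2):
    """Replay a sequence of choices and outcomes; return final values."""
    q = [0.0] * n_options
    for choice, reward in zip(choices, rewards):
        q[choice] = update_value(q[choice], reward,
                                 learning_rate, reward_sensitivity)
    return q


if __name__ == "__main__":
    # Hypothetical toy data: option 0 rewarded twice, option 1 unrewarded.
    final_q = simulate_values([0, 0, 1], [1.0, 1.0, 0.0],
                              learning_rate=0.5, reward_sensitivity=2.0)
    print(final_q, choice_probabilities(final_q))
```

Under this assumed formulation, a larger reward sensitivity widens the gap between the learned values of high- and low-outcome options, whereas a larger learning rate makes the values track the most recent outcomes more closely.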

Reinforcement learning, Depression, Computational psychiatry, Reward learning