Neural encoding of rapidly adapting risk preferences

Poster No:

892 

Submission Type:

Abstract Submission 

Authors:

Simon Steinkamp1, David Meder1, Oliver Hulme1,2,3

Institutions:

1Danish Research Centre for Magnetic Resonance, Copenhagen University Hospital - Amager and Hvidovre, Copenhagen, Denmark, 2London Mathematical Laboratory, London, United Kingdom, 3Department of Psychology, University of Copenhagen, Copenhagen, Denmark

First Author:

Simon Steinkamp  
Danish Research Centre for Magnetic Resonance, Copenhagen University Hospital - Amager and Hvidovre
Copenhagen, Denmark

Co-Author(s):

David Meder  
Danish Research Centre for Magnetic Resonance, Copenhagen University Hospital - Amager and Hvidovre
Copenhagen, Denmark
Oliver Hulme  
Danish Research Centre for Magnetic Resonance, Copenhagen University Hospital - Amager and Hvidovre|London Mathematical Laboratory|Department of Psychology, University of Copenhagen
Copenhagen, Denmark|London, United Kingdom|Copenhagen, Denmark

Introduction:

Economic theories of decision-making typically assume that risk preferences are trait-like: they vary across people but remain stable over the lifetime. A recent theory inspired by physics instead predicts that risk preferences should be determined by the type of environmental dynamics that people face (Peters, 2019). Meder et al. (2021) provided behavioral evidence for this theory, showing that participants were more risk-averse when deciding between multiplicative as opposed to additive gambles. In that study, participants first completed a learning task in which they implicitly learned the value of nine images from the images' effects on their wealth. Participants then decided between gambles composed of these images. In one session the effect of the images was additive (fixed changes in wealth); in the other it was multiplicative (percentage changes of current wealth). Here, we analyze fMRI data collected together with the behavioral data in Meder et al. (2021), aiming to infer how risk preferences are encoded in the brain and whether this encoding reflects the changes that were observed behaviorally. We do this by fitting reward learning models to the learning task that use changes in either the linear or the log-transformed wealth trajectories (Fig. 1B) as the reward signal, thus integrating models of risk taking into reinforcement learning models.
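The difference between the two reward signals can be illustrated with a minimal Python sketch. The increment (200) and growth factor (1.2) are hypothetical values chosen for illustration and are not taken from the study. Under additive dynamics, the linear change in wealth caused by an image is the same at any wealth level; under multiplicative dynamics, it is the change in log wealth that is constant.

```python
import math

def additive_update(wealth, increment):
    """Additive dynamics: the image adds a fixed amount to wealth."""
    return wealth + increment

def multiplicative_update(wealth, growth_factor):
    """Multiplicative dynamics: the image scales current wealth."""
    return wealth * growth_factor

# Hypothetical image effects, evaluated at two different wealth levels.
for start in (1000.0, 5000.0):
    linear_change = additive_update(start, 200.0) - start
    log_change = math.log(multiplicative_update(start, 1.2)) - math.log(start)
    # linear_change is independent of `start` in the additive case;
    # log_change is independent of `start` in the multiplicative case.
    print(start, linear_change, round(log_change, 6))
```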
Supporting Image: ohbm2024-fig1.png
   ·A) The learning task; B) Wealth trajectories in the additive and multiplicative learning task; C) Trial representation used for the TD model and RPE and value predicted by the model.
 

Methods:

We analyzed fMRI data from 14 of the 19 participants in Meder et al. (2021) who completed the learning task (5 were excluded). Each session consisted of two runs of 168 trials each, resulting in up to 60 min of fMRI data per session.
To estimate the reward prediction errors (RPEs) in the experiment, we used two types of temporal-difference (TD) learning models. One used linear wealth changes as the reward signal (TDlin); the other used changes in logarithmic wealth (TDlog). For each trial, the RPE was estimated at the onset of the response cue, the image, and the wealth update (Fig. 1A&C). The RPE signal was then used as a parametric modulator in first-level GLMs.
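As an illustration of the general idea, a tabular TD(0) learner with three within-trial events could be sketched as follows. This is not the model fitted in this work: the state representation, the learning rate `alpha`, the discount factor `gamma`, the zero-valued inter-trial baseline, and the function name `td_rpes` are all assumptions made for the sketch.

```python
import math

def td_rpes(wealth, images, alpha=0.1, gamma=1.0, log_utility=False):
    """Return one (cue, image, outcome) RPE triple per trial.

    wealth: wealth before each trial plus the final wealth (n_trials + 1 values)
    images: index of the image shown on each trial (n_trials values)
    log_utility: False -> linear wealth changes (TDlin-like),
                 True  -> changes in log wealth (TDlog-like)
    """
    utility = math.log if log_utility else float
    v_cue = 0.0      # value of the response-cue state (shared across trials)
    v_image = {}     # value of each image state
    rpes = []
    for t, img in enumerate(images):
        # Reward delivered at the wealth update: change in (log) wealth.
        reward = utility(wealth[t + 1]) - utility(wealth[t])
        v_img = v_image.get(img, 0.0)
        # Assumption: the inter-trial baseline has a fixed value of 0.
        delta_cue = gamma * v_cue
        delta_image = gamma * v_img - v_cue
        delta_outcome = reward - v_img      # trial ends after the update
        # TD(0) updates of the states that were just left.
        v_cue += alpha * delta_image
        v_image[img] = v_img + alpha * delta_outcome
        rpes.append((delta_cue, delta_image, delta_outcome))
    return rpes

# Toy multiplicative trajectory: wealth doubles on every trial.
toy_wealth = [1000.0, 2000.0, 4000.0]
rpes_lin = td_rpes(toy_wealth, [0, 0], log_utility=False)
rpes_log = td_rpes(toy_wealth, [0, 0], log_utility=True)
```

In this toy example the log-based outcome RPE shrinks from trial to trial because the log reward is constant, whereas the linear reward keeps growing with wealth, so the linear learner keeps being surprised.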
First, we tested TDlin against TDlog within each session by comparing the cross-validated log model evidence of the GLMs using the MACS toolbox for SPM (Soch & Allefeld, 2018). We also examined the group effect of the parametric modulator for TDlin in the additive session and for TDlog in the multiplicative session.
Second, we tested changes in risk preferences directly by including both sessions in the first-level GLMs. The first GLM had parametric modulators of TDlin for the additive session and TDlog for the multiplicative session; the second and third GLMs used TDlin or TDlog, respectively, for both sessions. We used differences in the Bayesian information criterion (BIC) to compare the models at the group level.
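For readers unfamiliar with BIC differences, the following sketch shows the textbook computation for a linear model with Gaussian residuals, applied to a single voxel and summed over participants. It is a generic illustration, not the SPM-based pipeline used here; the function names and the choice to sum differences across participants are assumptions.

```python
import math

def gaussian_bic(rss, n_obs, n_params):
    """BIC of a Gaussian linear model, up to an additive constant.

    rss: residual sum of squares, n_obs: number of scans,
    n_params: number of estimated regressors.
    """
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)

def group_delta_bic(rss_a, rss_b, n_obs, k_a, k_b):
    """Sum over participants of BIC(model b) - BIC(model a).

    Positive values favor model a (lower BIC is better).
    """
    return sum(
        gaussian_bic(rb, n_obs, k_b) - gaussian_bic(ra, n_obs, k_a)
        for ra, rb in zip(rss_a, rss_b)
    )
```

Because the models compared here differ only in which RPE signal serves as the parametric modulator, they would have the same number of regressors in this sketch, so the complexity penalty cancels and the difference reflects only the fit.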

Results:

We found group-level evidence for a main effect of linear RPE in striatal regions under the additive condition, and for a main effect of logarithmic RPE in the ventromedial prefrontal cortex (VMPFC) under the multiplicative condition (both p < 0.001, uncorrected; Fig. 2A).
When evaluating shifts in risk preferences (Fig. 2B&C), we found strong evidence in the VMPFC for the TDlin model over the TDlog model under additive conditions, and strong evidence for the TDlog model over the TDlin model under multiplicative conditions. This pattern of activity appears to correspond to the RPE encoding that one would expect given the behaviorally observed changes in risk preferences.
Supporting Image: ohbm_2024_fig2.png
   ·A) group-level results of the main effect of the RPE (parametric modulator), reported at p < 0.001 unc. B) log BF maps of the cross-validated model evidence. C) Voxel-wise group differences in BIC.
 

Conclusions:

Our preliminary results suggest that the VMPFC shows an RPE signal that is sensitive to the dynamics of the environment, indicating that reward encoding changes from a linear utility function in an additive environment to a logarithmic utility function in a multiplicative one. The weak main effect of RPE, however, suggests that our TD model may not be optimal. Considering the behavioral results of Meder et al. (2021), it is also likely that, while participants gravitated towards logarithmic or linear risk preferences, there is both substantial individual variability and a bias towards risk aversion, which may also affect the model fit. In later analyses, we will attempt to map individual risk preferences directly onto brain data.

Higher Cognitive Functions:

Decision Making 1

Modeling and Analysis Methods:

Activation (eg. BOLD task-fMRI) 2
Bayesian Modeling

Keywords:

Computational Neuroscience
FUNCTIONAL MRI
Learning
Other - decision-making; reinforcement learning; risk-preferences;

1|2 Indicates the priority used for review

References:

Meder, D., Rabe, F., Morville, T., Madsen, K. H., Koudahl, M. T., et al. (2021). Ergodicity-breaking reveals time optimal decision making in humans. PLOS Computational Biology, 17(9), e1009217. https://doi.org/10.1371/journal.pcbi.1009217

Peters, O. (2019). The ergodicity problem in economics. Nature Physics, 15(12), 1216–1221. https://doi.org/10.1038/s41567-019-0732-0

Soch, J., & Allefeld, C. (2018). MACS – a new SPM toolbox for model assessment, comparison and selection. Journal of Neuroscience Methods, 306, 19–31. https://doi.org/10.1016/j.jneumeth.2018.05.017