Aversion to option loss in a restless bandit task
Date created: | Last Updated:
: DOI | ARK
Creating DOI. Please wait...
Description: In everyday life people need to make choices without full information about the environment, which poses an explore-exploit dilemma in which one needs to balance the need to learn about the world and the need to obtain rewards from it. The explore-exploit dilemma is often studied using the multi-armed restless bandit task, in which people repeatedly select from multiple options, and human behaviour is modelled as a form of reinforcement learning via Kalman filters. Inspired by work in the judgment and decision-making literature, we present two experiments using multi- armed bandit tasks in both static and dynamic environments, in situations where options can become unviable and vanish if they are not pursued. A Kalman filter model using Thompson sampling provides an excellent account of human learning in a standard restless bandit task, but there are systematic departures in the vanishing bandit task. We estimate the structure of this loss aversion signal and consider theoretical explanations for the results.