Sensitivity to the Instrumental Value of Choice Increases Across Development

Across development, people tend to demonstrate a preference for contexts in which they have the opportunity to make choices. However, it is not clear how children, adolescents, and adults learn to calibrate this preference based on the costs and benefits of agentic choice. Here, in both a primary, in-person, reinforcement-learning experiment (N = 92; age range = 10–25 years) and a preregistered online replication study (N = 150; age range = 8–25 years), we found that participants overvalued agentic choice but also calibrated their agency decisions to the reward structure of the environment, increasingly selecting agentic choice when choice had greater instrumental value. Regression analyses and computational modeling of participant choices revealed that participants’ bias toward agentic choice, which reflects its intrinsic value, remained consistent across age, whereas sensitivity to the instrumental value of agentic choice increased from childhood to early adulthood.

Two distinct and age-varying motivations may underlie people's valuation of agentic choice: its instrumental and intrinsic value. In controllable environments, in which choices influence experienced reward outcomes, agentic choice has instrumental value because making choices may yield greater reward than forgoing opportunities to choose (Katzman & Hartley, 2020; Ly et al., 2019; Moscarello & Hartley, 2017). Across real-world environments the instrumental value of choice varies, and people must learn the extent to which their choices influence the rewards they experience. They can then leverage this knowledge to determine whether the benefits of agentic choice outweigh the potential time and effort costs of choosing (Boureau et al., 2015; Shenhav et al., 2017; Westbrook & Braver, 2015). Multiple aspects of these reward-learning and choice processes may change with age; children, adolescents, and adults may vary in the extent to which they update their beliefs about how rewarding an action is following wins and losses (Habicht et al., 2022; Nussenbaum et al., 2022; G. M. Rosenbaum et al., 2022) or following outcomes that were or were not elicited by their own choices (Cockburn et al., 2014). This in turn may lead to systematic developmental differences in estimates of the value of agentic choice. In addition, the use of learned values to guide decisions about whether to seek or forgo agentic choice may require prospective simulation of potential outcomes, which itself may engage cognitive control and working memory processes that improve through adolescence (Luna et al., 2015). Thus, the influence of instrumental value on agentic choice may be supported by multiple cognitive processes that undergo marked change across development.
People may also seek opportunities to make choices because agentic choice is intrinsically rewarding, meaning choice is valued in and of itself beyond its efficacy in promoting the acquisition of reward (Blain & Sharot, 2021; Bown et al., 2003; Cockburn et al., 2014; Leotti & Delgado, 2011; Leotti et al., 2010). Even when one's choices do not affect the rewards one experiences, opportunities to choose may still be valued because they enhance beliefs that one's actions have causal influence (Ly et al., 2019). Across ages, people report greater feelings of well-being with higher levels of perceived control (Bandura et al., 2003; Véronneau et al., 2005; Weinberg et al., 1979; Weinstein & Mermelstein, 2007) and prefer environments in which they have more opportunities to choose (Bown et al., 2003; Katzman & Hartley, 2020; Leotti & Delgado, 2011; Sran & Borrero, 2010; Suzuki, 1997), even when their choices do not influence the reward outcomes they experience.
The intrinsic value of choice may also vary across development, with prior research in related areas suggesting multiple potential patterns of age-related change. Across species, sensitivity to diverse types of rewards, including extrinsic reinforcers such as money, sucrose, and drugs (Cauffman et al., 2010; Doremus et al., 2005; Galvan et al., 2006; Galván & McGlennen, 2013; Smith et al., 2012), and intrinsic, cognitive rewards such as novelty and social interaction (Douglas et al., 2003, 2004), varies nonlinearly with age, with the greatest sensitivity often occurring in adolescence. The intrinsic reward that people experience when they make their own choices may exhibit a similar pattern of developmental change, peaking earlier in adolescence and declining through adulthood. Another possibility is that the intrinsic value of choice decreases monotonically across development. Indeed, relative to adults, children tend to seize opportunities to effect consequences in the world, showing stronger biases toward actions that have greater causal influence (Liquin & Gopnik, 2022; McCormack et al., 2016; Meng et al., 2018; Nussenbaum, Cohen, et al., 2020; Raab & Hartley, 2019). It may be that increasing autonomy from childhood to adulthood confers more opportunity for agentic choice, leading to a decline in its intrinsic value. Finally, a third competing possibility is that the intrinsic value of choice increases over developmental time: As people learn to make better choices that more consistently lead to positive outcomes across contexts, the act of choosing itself may acquire greater value.
The present study investigates the relative contributions of intrinsic and instrumental value to agentic choice across development. Across two experiments, participants aged 8 to 25 years completed a novel task that enabled us to measure their preferences for agentic choice across conditions in which the instrumental value of choice varied. The task was coupled with a computational model that characterized how age-related differences in reward-learning mechanisms contribute to differential sensitivity to the intrinsic and instrumental value of choice from middle childhood to early adulthood. We hypothesized that participants across our age range would demonstrate some degree of agentic-choice bias, seeking opportunities to choose even when doing so had no instrumental value, but that developmental changes in value-guided learning and decision-making would lead to an increasing influence of the instrumental value of choice on decisions from childhood to adulthood.

Statement of Relevance
Across development, people must frequently decide between making choices for themselves or letting others decide for them. Two distinct motivations influence decisions about whether to seek or forgo opportunities for choice: Choices can have instrumental value, in that they can promote the acquisition of reward, and they may also have intrinsic value, in that they are experienced as being rewarding in and of themselves beyond any external benefits they may bring. Here, we demonstrate that these distinct motivations differentially influence decisions about whether to seek or forgo opportunities for choice across development.
Although the intrinsic value of choice remained consistent from middle childhood to early adulthood, the influence of instrumental value on choice decisions increased with age. Our findings suggest that people's decisions about seeking versus forgoing opportunities for choice become increasingly well calibrated to the environment's reward structure across development.

Open Practices Statement
For both experiments, all task code and stimuli, task data, and analysis code are publicly available on the Open Science Framework at https://osf.io/69rs8/. Experiment 1 was not preregistered. For Experiment 2, the hypotheses, methods, and analysis plan were preregistered, and the preregistration can be accessed on the Open Science Framework at https://osf.io/t94a7.

Method
Participants. Ninety-two participants, evenly distributed between the ages of 10 and 25 years, were recruited from New York University and the surrounding community for an in-person behavioral study. Participants were recruited through advertisements on social media, flyers around New York University, and science fairs and events throughout New York City. Adult participants and parents of minors provided informed consent; participants under 18 years of age agreed to participate. Participants were paid $20 for participation, plus a $5 performance bonus. Though we treated age continuously in all statistical analyses, we divided participants into age groups for data-visualization purposes; in total, 17 children (8 females; ages 10-12 years, M = 11.3 years, SD = 0.9 years), 29 adolescents (14 females; ages 13-17 years, M = 15.4 years, SD = 1.5 years), and 46 adults (25 females; ages 18-25 years, M = 22.1 years, SD = 2.4 years) participated in the study. Participants' self-identified race and ethnicity were as follows: 40.2% White, 28.2% Asian, 18.5% more than one race, 12.0% Black, and 1.1% Pacific Islander/Native Hawaiian. Fifteen percent of participants identified as Hispanic. Research procedures were approved by New York University's Institutional Review Board (ID: 2016-1194).

Experimental design.
Agency task. We designed a novel, child-friendly task to assess participants' valuation of agentic choice in contexts in which the value of that choice varied (Fig. 1a). Participants' goal in the task was to win as many tokens as possible by playing different slot machines that probabilistically paid out 10-token rewards, which they were told would be converted to a monetary bonus at the end of the experiment.
Fig. 1. Participants had to choose between accepting a variable offer amount (0-6 tokens) and forgoing agency (i.e., allowing a coin flip to randomly determine their machine selection), or rejecting that offer and choosing agency (i.e., selecting one of the machines for themselves). After participants made their agency decision, the task proceeded to the machine-selection stage. If the participants had chosen to forgo agency, they viewed an animated coin flip and were instructed to select the machine that matched the color on which the coin landed. If the offer was rejected, participants selected between the machines for themselves. Once a machine was selected, participants viewed the outcome (either 10 or 0 tokens). If participants chose to forgo agency, they additionally received the tokens offered in the agency-decision stage. Each of the three arcade rooms (b) contained a pair of slot machines that paid 10-token rewards with different probabilities on each trial.
Throughout the task, participants encountered three pairs of slot machines, each of which was housed in a different arcade room. On every trial, participants completed a machine-selection stage in which they entered one of the rooms and selected between two machines. Participants could learn through trial and error which machines were more likely to pay out tokens.
Critically, the machine-selection stage was always preceded by an agency-decision stage, during which participants saw a door previewing which room they would enter. Participants had to decide whether to choose between the two machines themselves (i.e., choose agency) or whether to let the computer randomly pick one of the two machines for them (i.e., forgo agency). Importantly, on every trial, participants were offered a variable number of tokens (between zero and six) that they would receive if they chose to forgo agency. If they chose to forgo agency, then in the machine-selection stage an animated coin flip determined which machine they had to play. All decisions were self-paced; outcome screens during the machine-selection stage were displayed for 1.5 s.
We manipulated the value of choice by varying the reward probabilities of the three pairs of machines in the different rooms (Fig. 1b). In the 50/50 condition, both machines had a 50% probability of paying out tokens. In this uncontrollable environment, choice had no instrumental value: Participants' own selections would not yield more reward than the computer's random choices. In the 70/30 condition, the two machines paid out tokens on 70% and 30% of trials, respectively; in the 90/10 condition, they paid out on 90% and 10% of trials. In these two controllable conditions, choice had instrumental value because participants could learn to select machines that would reliably yield more reward than random selections.
Participants completed 315 trials total. Trials were divided into 15 blocks of 21 trials, though these blocks were not signaled to participants. The 21 trials within each task block comprised each pair of slot machines (i.e., 50/50, 70/30, and 90/10 pairs) coupled with each of the seven possible token offers (0-6 tokens) one time. Within each block, the order of the trials was randomized for each participant.
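The block structure described above can be illustrated with a short sketch. This is not the authors' task code (which is available at the OSF link); the function and variable names are hypothetical, and the only facts taken from the text are the three machine pairs, the seven offers, the 15 blocks, and the within-block randomization.

```python
import random

# Design constants taken from the task description.
PAIRS = ["50/50", "70/30", "90/10"]   # one machine pair per arcade room
OFFERS = range(0, 7)                  # token offers 0-6
N_BLOCKS = 15


def build_trial_sequence(seed=None):
    """Return a list of trials: 15 blocks, each crossing 3 pairs x 7 offers.

    The order is shuffled independently within each block. The dict keys
    are illustrative names, not identifiers from the original task code.
    """
    rng = random.Random(seed)
    trials = []
    for block in range(N_BLOCKS):
        block_trials = [(pair, offer) for pair in PAIRS for offer in OFFERS]
        rng.shuffle(block_trials)
        for pair, offer in block_trials:
            trials.append({"block": block, "pair": pair, "offer": offer})
    return trials


trials = build_trial_sequence(seed=0)
print(len(trials))  # 15 blocks x 21 trials
```

Because every block contains each pair-offer combination exactly once, each of the 21 combinations appears 15 times over the session, which keeps the offer distribution balanced across conditions.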
Prior to beginning the task, all participants completed an extensive, child-friendly tutorial with both written and auditory instructions (see the Supplemental Material available online).
Post-task assessments. Immediately following the agency task, participants completed two additional tasks assessing their knowledge of the machines' reward values. Participants also completed a series of questionnaires assessing various individual-difference measures.
We include more details of these measures and related findings in the Supplemental Material.
Analysis approach. We analyzed participants' behavior in two ways. First, we ran regression analyses to determine how features of the task (including the reward probabilities of the machines) and the offer amounts presented influenced the choices participants made at both decision stages. Second, we fitted participants' data with reinforcement-learning models that enabled us to examine how machine-reward probabilities were learned over the course of the experiment and how learning influenced choice. We describe both these approaches in more detail in the Results section. We focused on participants' decisions, not their reaction times, but we include additional reaction-time analyses in the Supplemental Material.

Results
Learning machine-reward probabilities. We first confirmed that participants learned to select the more rewarding machines in the 70/30 and 90/10 conditions. We fitted a mixed-effects logistic regression to participants' machine selections (only on free-choice trials; coded as 1 if they selected the higher-value machine and 0 if they selected the lower-value machine; see the Supplemental Material for analysis methods). Condition (70/30 or 90/10), within-condition trial number, continuous age, and their interactions were predictors. Because we were interested in whether participants learned to select the better machines, we did not include trials in the 50/50 condition (in which there was no better machine) or trials in which the computer made machine selections in this analysis. Across conditions, participants made optimal choices at above-chance levels (intercept = 2.4, SE = 0.17, z = 13.8, p < .001; Fig. 2). Participants made more optimal machine selections in the 90/10 condition relative to the 70/30 condition, β = −0.42, SE = .08, χ²(1) = 20.86, p < .001. Participants also made increasingly optimal choices across trials, β = 0.71, SE = .08, χ²(1) = 60.59, p < .001. There was not a significant main effect of age on optimal choices, β = 0.21, SE = 0.17, χ²(1) = 1.46, p = .227, nor was there a significant Age × Trial interaction effect, β = 0.10, SE = 0.08, χ²(1) = 1.46, p = .226. No other effects or interactions reached significance (ps > .22; see the Supplemental Material for full results). These findings indicate that across age, participants learned to select the better machines.
Sensitivity to the intrinsic and instrumental value of choice. After establishing that participants learned to select the more rewarding slot machines, we examined whether they used the machine-reward probabilities to guide their agency decisions. We formalized the comparison between choosing and forgoing agency by computing their expected values (EVs). On each trial, we defined EV_choose as the maximum expected value of the two machines (i.e., 5, 7, or 9 tokens), assuming that participants would select the higher-value machine. We defined EV_forgo as the average expected value of the two machines (because the computer has a 50% chance of selecting each machine), plus the offer amount. We defined the value of choice (VoC) as the difference between EV_choose and EV_forgo. Positive values indicate that choosing agency is optimal, whereas negative values indicate that forgoing agency is better. To examine how the value of choice influenced participants' agency decisions, we fitted a mixed-effects logistic regression with value of choice, within-condition trial, continuous age, and their interactions as predictors. Here, we included all trials. Because our value of choice variable incorporates information about the machine-reward probabilities, we did not include a separate condition variable in the model. Participants demonstrated sensitivity to the value of choice: They were more likely to choose agency when doing so had higher expected value, β = 1.42, SE = 0.07, χ²(1) = 144.39, p < .001 (Fig. 3a). Moreover, we observed an Age × VoC interaction as well, β = 0.15, SE = 0.07, χ²(1) = 4.17, p = .041: Older participants demonstrated greater sensitivity to the value of choice (Fig. 3b). A VoC × Trial interaction effect, β = 0.33, SE = 0.04, χ²(1) = 50.21, p < .001, revealed that sensitivity to the value of choice also increased across the task. Further, the extent to which sensitivity to the value of choice increased across trials varied across age, with older participants demonstrating the greatest increases in the effect of the value of choice on agency decisions across the experiment (VoC × Age × Trial effect: β = 0.12, SE = 0.04, χ²(1) = 9.59, p = .002; Fig. 3c).
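As a worked illustration of the value-of-choice definition above, the following sketch computes EV_choose, EV_forgo, and VoC for any condition and offer. The function name and the dictionary layout are hypothetical; only the payout size (10 tokens), the reward probabilities, and the definitions of the three quantities come from the text. Probabilities are stored as integer percentages purely to keep the arithmetic exact.

```python
REWARD = 10  # tokens paid on a win

# Reward probabilities (in percent) for the two machines in each room.
CONDITIONS = {"50/50": (50, 50), "70/30": (70, 30), "90/10": (90, 10)}


def value_of_choice(condition, offer):
    """Return (EV_choose, EV_forgo, VoC) in tokens for one trial.

    EV_choose assumes the participant picks the better machine;
    EV_forgo averages the two machines and adds the token offer.
    """
    p1, p2 = CONDITIONS[condition]
    ev_choose = max(p1, p2) * REWARD / 100
    ev_forgo = (p1 + p2) / 2 * REWARD / 100 + offer
    return ev_choose, ev_forgo, ev_choose - ev_forgo


for condition in CONDITIONS:
    for offer in (0, 2, 4, 6):
        print(condition, offer, value_of_choice(condition, offer))
```

Under these definitions, VoC equals zero when the offer exactly offsets the advantage of choosing: an offer of 0 tokens in the 50/50 room, 2 tokens in the 70/30 room, and 4 tokens in the 90/10 room. Across the full design VoC ranges from −6 to 4 tokens.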
There was not a significant main effect of age on agency decisions, β = −0.02, SE = 0.14, χ²(1) = 0.01, p = .906; participants across age demonstrated a strong bias toward choosing agency, even when doing so was not beneficial. When we restricted our analysis to trials in which value of choice was 0 (meaning the expected values of choosing versus forgoing agency were equivalent), we similarly did not observe a significant effect of age on agency decisions, β = 0.01, SE = 0.12, χ²(1) = 0.00, p = .961. Across age, when the value of choice was 0, participants chose agency on 73.2% of trials (SE = 2%; see Fig. 3a). The value of choice measure assumes that participants had perfect knowledge of the machines' reward probabilities from the beginning of the task. Consequently, age differences in how participants learned the values of the machines across trials may influence decisions about choosing versus forgoing agency.

Reinforcement-learning modeling results
To examine how participants learned to make choices throughout the task, we fitted variants of a reinforcement-learning model to our choice data. All model variants assumed that the participant learned the values of the six slot machines through experience. After selecting (or seeing the computer select) a machine and observing the outcome (r), the estimated value of the machine is updated such that
$$V_{\text{machine}}(t+1) = V_{\text{machine}}(t) + \alpha \, [\, r - V_{\text{machine}}(t) \,], \tag{1}$$
where α is a participant-specific learning rate. The model then uses these estimated machine values to compute the value of choosing and forgoing agentic choice at the agency-decision stage on each trial. Following the same logic we used to compute the value of choice for our regression analysis, if the participant chooses agency, then at the machine-selection stage they will be able to select the machine that they believe is better, so the value of choosing agency is the maximum estimated value of the two upcoming machines. Here, we add to this value a participant-specific "agency bonus" that captures the intrinsic reward that each participant places on agentic choice. Positive and negative values of the agency-bonus parameter reflect inflated and deflated valuation of agentic choice, respectively. Thus, on each trial, the value of agentic choice is computed as
$$V_{\text{agency}} = \max(V_{\text{machine1}}, V_{\text{machine2}}) + \text{agency bonus}, \tag{2}$$
where V_machine1 reflects the estimated value of the machine on the left and V_machine2 reflects the estimated value of the machine on the right. The model maintains six value estimates, corresponding to the six machines experienced throughout the task, and uses those corresponding to the two presented machines on each trial.
If the participant chooses to forgo agentic choice, the computer will select randomly between the two machines. Thus, the value of forgoing agentic choice is the average of the estimated values of the two upcoming machines, plus the token offer amount the participant will receive on that trial:
$$V_{\text{forgo}} = \tfrac{1}{2}\,(V_{\text{machine1}} + V_{\text{machine2}}) + \text{offer}. \tag{3}$$
Although the agency bonus and token offer amount take similar forms, the two differ: The agency bonus is a free parameter fitted to each participant's choices that remains stable across choices, whereas the token offer is a hard-coded feature of the experimental design that varies across trials.
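The learning rule and the two decision values described above can be sketched in a few lines. This is an illustrative reading of the model description, not the authors' fitting code. The function names are hypothetical, and the sketch assumes that rewards and offers are expressed as fractions of the 10-token payout (so a win is 1.0 and a 3-token offer is 0.3), a scaling suggested by the reported average agency bonus of .32 corresponding to roughly 3.2 tokens.

```python
def update_value(value, reward, alpha):
    """Delta-rule update: move the estimate toward the observed outcome."""
    return value + alpha * (reward - value)


def value_choose(v_machine1, v_machine2, agency_bonus):
    """Value of choosing agency: best estimated machine plus the agency bonus."""
    return max(v_machine1, v_machine2) + agency_bonus


def value_forgo(v_machine1, v_machine2, offer):
    """Value of forgoing agency: average of the two machines plus the offer."""
    return (v_machine1 + v_machine2) / 2 + offer


# Hypothetical example: both machines start at 0.5 (an assumed initial value),
# the left machine then pays out once, and the next trial offers 3 tokens.
values = {"left": 0.5, "right": 0.5}
values["left"] = update_value(values["left"], reward=1.0, alpha=0.2)

v_agency = value_choose(values["left"], values["right"], agency_bonus=0.32)
v_forgo = value_forgo(values["left"], values["right"], offer=0.3)
print(values["left"], v_agency, v_forgo)
```

In the model variants with several learning rates, `alpha` would simply be selected according to who made the machine selection and whether the outcome was a win or a loss before calling `update_value`.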
At both decision stages, estimated values are converted into choice probabilities via softmax functions, in which decision noise or stochasticity is determined by participant-specific inverse temperature parameters. Higher inverse temperatures reflect decisions that are more deterministically governed by estimated values, whereas lower inverse temperatures reflect decisions that are less sensitive to value estimates. We fitted variants of this model with different numbers of learning rate, inverse temperature, and agency-bonus parameters. We fitted models with a single learning rate for all trials, two learning rates that allowed for differential value updating following self- versus computer-made machine selections, two learning rates that allowed for differential value updating following wins and losses, and four learning rates that allowed for differential value updating across both self- versus computer-made selections and wins and losses. We also fitted models with a single inverse temperature parameter that governed the extent to which both first-stage agency decisions and second-stage machine selections were value driven, as well as two inverse temperatures that allowed the noisiness of decisions to vary across stages. Finally, we fitted models with and without the agency-bonus parameter. In total, we fitted 16 model variants, comprising all combinations of the four learning-rate structures (one learning rate, either of the two two-learning-rate schemes, or four learning rates), one and two inverse temperatures, and zero and one agency bonuses. Models were fitted individually to each participant's data, using common priors. We include more details about our model-fitting procedure as well as model recoverability and validation in the Supplemental Material.
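To make the choice rule and the model space concrete, the sketch below shows a two-option softmax for the agency decision and enumerates the model variants described above. The names are hypothetical and the code is only an illustration under the stated description: with two options, the softmax reduces to a logistic function of the value difference scaled by the inverse temperature.

```python
import math
from itertools import product


def p_choose_agency(v_choose, v_forgo, beta):
    """Two-option softmax: probability of choosing agency.

    Equivalent to exp(beta*v_choose) / (exp(beta*v_choose) + exp(beta*v_forgo)).
    """
    return 1.0 / (1.0 + math.exp(-beta * (v_choose - v_forgo)))


# Learning-rate schemes described in the text, with their parameter counts.
LEARNING_RATE_SCHEMES = {
    "single": 1,
    "self_vs_computer": 2,
    "win_vs_loss": 2,
    "self_vs_computer_by_win_vs_loss": 4,
}
INVERSE_TEMPERATURES = (1, 2)    # shared across stages, or one per stage
AGENCY_BONUS = (False, True)     # model without or with the agency bonus

MODEL_VARIANTS = [
    {
        "learning_rates": scheme,
        "n_inverse_temperatures": n_beta,
        "agency_bonus": bonus,
        "n_free_parameters": n_alpha + n_beta + int(bonus),
    }
    for (scheme, n_alpha), n_beta, bonus in product(
        LEARNING_RATE_SCHEMES.items(), INVERSE_TEMPERATURES, AGENCY_BONUS
    )
]

print(len(MODEL_VARIANTS))
print(max(m["n_free_parameters"] for m in MODEL_VARIANTS))
```

Counting the two distinct two-learning-rate schemes separately yields 4 × 2 × 2 = 16 variants, and the most complex variant has 4 + 2 + 1 = 7 free parameters.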
Both at the whole-group level and within each age group, the best-fitting reinforcement-learning model included seven free parameters (Fig. 4): four learning rate parameters (α_choice+, α_choice−, α_comp+, α_comp−) that govern the extent to which participants update their beliefs about the values of the machines following their own versus the computer's choices after wins and losses; two inverse temperature parameters (β_agency and β_machine) that determine the extent to which participants' value estimates influenced their agency-decision and machine-selection choices; and an agency bonus that was added to the value of choosing agency to account for individual biases toward choosing or forgoing agency. The second-best model in terms of fit, which had only one inverse temperature parameter, captured participants' choices similarly well (Akaike information criterion, or AIC, difference from best-fitting model = 1.9). However, because our best-fitting model nested this simpler model (and all other simpler models) within it, we chose to focus our analyses on estimated parameter values from the better-fitting, more complex model only.
We confirmed via simulations that parameter estimates from this model were recoverable: In simulations, correlations between true, generating parameter values and estimated parameter values ranged from .79 to .96 (see the Supplemental Material for details). Parameter estimates from this winning model can help clarify whether age differences in learning the machine values, or in using value estimates to guide choice, or both, contributed to age-related change in sensitivity to value of choice. We first analyzed how learning-rate parameters, which reflect how participants updated their beliefs about the value of the machines across trials, differed across agency decisions (i.e., after choosing or forgoing agency), outcome valence (i.e., wins or losses), continuous age, and their interactions via a mixed-effects linear regression. We observed an Agency Decision × Outcome Valence interaction, β = −0.04, SE = 0.01, F(1, 270) = 10.0, p = .002 (Fig. 5). Post hoc paired t tests indicated that participants demonstrated a confirmation bias: They weighted recent wins more heavily than recent losses following their own machine selections, t(91) = 3.2, p = .002, mean α_choice+ = .24 (SE = 0.03), mean α_choice− = .11 (SE = .02), but not following selections made by the computer, t(91) = −0.87, p = .386, mean α_comp+ = .15 (SE = .03), mean α_comp− = .18 (SE = .03). Age did not relate to learning rates, β = 0.01, SE = 0.01, F(1, 90) = 0.52, p = .473, nor were there significant interactions between age and either agency decisions or outcome valence on learning-rate magnitudes (ps > .24). Next, we analyzed age differences in the parameters that influenced participants' agency decisions. To test whether biases toward choosing agency varied with age, we ran a linear regression to examine the relation between age and agency bonuses. We found that agency bonuses did not vary significantly with age, b = 0.01, SE = 0.01, p = .269. In accordance with our behavioral findings, participants' average agency bonus was .32 (SE = .04), indicating that the intrinsic reward of agentic choice was approximately 3.2 tokens.
To test for age-related change in the use of learned value to guide agency decisions, we examined the extent to which participants' agency decisions were sensitive to their own subjective valuation of choice. Within the model, the β_agency parameter characterizes the extent to which participants' agency decisions were guided by their own estimates of the values of choosing and forgoing agentic choice, estimates that reflected both their own, idiosyncratic value-learning processes and their own agency biases. Higher values of β_agency reflect a greater use of one's own subjective valuation of choice to guide agency decisions. We found that β_agency marginally increased with age, b = 0.24, SE = 0.12, p = .056, mean β_agency = 9.33, SE = 0.59. This analysis corroborates our model-free regression results and suggests that participants' use of the subjective value of choice to guide agency decisions increased across development.
When examined in isolation, we did not observe an effect of age on β_machine values (b = 0.09, SE = 0.11, p = .410, mean β_machine = 7.44, SE = 0.52), indicating that stochasticity in participants' machine selections did not significantly vary across age. In an additional exploratory analysis (conducted after we submitted our Experiment 2 preregistration), we tested whether age effects on β_agency and β_machine differed by including them as dependent variables in the same linear mixed-effects model, with decision stage and age as interacting predictors. Here, we did not observe a significant Decision Stage × Age interaction effect on estimates of β values, b = 0.34, SE = 0.29, F(1, 90) = 1.41, p = .238. Thus, even though β_machine did not vary with age when examined in isolation (while β_agency showed a marginal relation), we cannot reject the null hypothesis that both inverse temperatures followed similar trajectories of age-related change.

Experiment 1 Discussion
Findings from Experiment 1 revealed an age-varying influence of instrumental value and an age-invariant influence of intrinsic value on agentic choice. That is, from middle childhood to early adulthood, participants were increasingly likely to decide to make their own choices when doing so would lead to more reward gain. Across age, however, participants consistently overvalued the opportunity to make choices. Together, these results suggest that distinct cognitive and motivational mechanisms may influence when and how people seek opportunities to control their environments. Nevertheless, across both analytic approaches, the interaction between age and instrumental value on agentic choice was modest. Thus, to ensure this finding was replicable, we conducted an additional online study in which a larger sample of participants completed the same reinforcement-learning task.

Method
After analyzing data from Experiment 1, we conducted an online preregistered replication study (N = 150, ages 8-25 years). In prior work, we have shown that with appropriate precautions, the developmental decision-making data we collect online look largely similar to the data we collect in in-person laboratory experiments (Nussenbaum, Scheuplein, et al., 2020). Before beginning data collection, we specified a target sample size of 150 participants on the basis of the size of the effect of age on β_agency in the Experiment 1 data (adjusted R² = .052). On the basis of our original analysis, we determined that including 150 participants would give us more than 80% power to detect an effect of the same size. After completing data collection for Experiment 2, we discovered that our original analysis code had a minor bug; when we fixed it (new adjusted R² = .029), we determined that with a sample size of 150, we had 57% power to detect an effect of age on β_agency. We recruited participants as young as 10 years old for our in-person experiment, but we a priori decided to extend our age range down to 8 years for the online replication in order to characterize developmental changes in agentic choice across a wider age range. This decision likely also increased our power to detect age-related changes in our measures of interest.
Though we made several minor modifications to the task in order to administer it remotely and asynchronously (described in detail in the Supplemental Material), all relevant manipulations and task statistics (e.g., number of trials, reward probabilities, token offer amounts) remained identical to those used in the Experiment 1 task. Participants were recruited from across the United States via ads on social media, in-person science outreach events, flyers, and word of mouth.
Adult participants and parents of minors provided informed consent; participants under 18 years of age provided assent to participate. Participants were paid $10 plus a $5 performance bonus. Though we treated age continuously in all statistical analyses, we separated participants into age groups for recruitment and data-visualization purposes. In all, we collected data from 164 participants; after applying preregistered exclusion criteria to filter out inattentive participants (see the Supplemental Material), our final sample comprised 150 participants: children (n = 50, ages 8-12 years, M = 10.5 years, SD = 1.4 years, 25 females, 25 males), adolescents (n = 50, ages 13-17 years, M = 15.4 years, SD = 1.5 years, 26 females, 24 males), and adults (n = 50, ages 18-25 years, M = 22.0 years, SD = 2.1 years, 22 females, 23 males, 5 other). Participants' self-identified race and ethnicity were as follows: 53.3% White, 26.7% Asian, 11.3% more than one race, 8.0% Black, and 1% Native American. In addition, 12.7% of participants identified as Hispanic. Research procedures were approved by New York University's Institutional Review Board (ID: 2021-5210).

Results
We followed the same analytic approach as in Experiment 1 to examine participants' machine selections and agency decisions.In brief, we replicated our prior findings.Because the primary goal of this replication study was to examine evidence for age-related change in the influence of instrumental value on agentic choice and age invariance in the influence of intrinsic value, we focus on key tests of these hypotheses, and we include a detailed description of all Experiment 2 findings in the Supplemental Material.
As in Experiment 1, to examine the influence of instrumental value on agency decisions, we examined both the effect of the value of choice on first-stage agency choices and estimates of β_agency derived from our reinforcement-learning model. Critically, we replicated our original finding of an Age × VoC interaction effect on agentic choice, β = 0.22, SE = .06, χ²(1) = 12.28, p < .001: Older participants demonstrated greater sensitivity to the value of choice (Fig. S9) when making their first-stage agency decisions. As in Experiment 1, we also found that sensitivity to the value of choice increased across the task (VoC × Trial interaction effect: β = 0.20, SE = 0.03, χ²(1) = 51.94, p < .001). Increases in sensitivity to the value of choice were greatest at older ages (VoC × Trial × Age effect: β = 0.06, SE = 0.03, χ²(1) = 5.26, p = .022). Here, too, we found that β_agency, which captures the extent to which participants' agency decisions were guided by their own learned estimates of the value of agentic choice, increased with age, b = 0.25, SE = 0.09, p = .008. As in Experiment 1, we did not observe a significant effect of age on β_machine when examined in isolation, b = 0.14, SE = 0.08, p = .081, but in an exploratory, nonpreregistered analysis examining both β_machine and β_agency estimates together, we also did not observe a significant interaction between age and decision stage on inverse temperatures, b = 0.27, SE = 0.21, F(1, 148) = 1.55, p = .216.
To examine the influence of intrinsic value on agentic choice, we examined the influence of age on agency decisions, as well as model-derived estimates of participants' agency bonuses. We did not observe evidence for a significant influence of age on agency decisions, β = 0.01, SE = 0.17, χ²(1) = 0.00, p = .965; participants across age demonstrated a strong bias toward choosing agency, even when doing so was not beneficial. Across age, when value of choice was 0, participants chose agency on 75.8% of trials (SE = 2%; see Fig. S9). Corroborating these findings, agency bonuses similarly did not vary significantly with age, b = −0.01, SE = 0.01, p = .242.
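To make the two quantities analyzed above concrete, the following is a minimal sketch of how a value of choice and an agency decision could be computed. It is an illustration under stated assumptions, not the fitted model: the value of choice is taken to be the expected tokens from picking the better machine minus the expected tokens from a coin flip plus the offer, and the logistic choice rule, the function names, and all numeric inputs are hypothetical.

```python
import math

REWARD = 10  # tokens paid by a winning machine (from the task description)


def value_of_choice(p_a, p_b, offer):
    """Expected tokens gained by choosing agency rather than taking the offer.

    Assumption: an informed participant who chooses agency picks the better
    machine, whereas forgoing agency yields a coin flip between the two
    machines plus the offered tokens.
    """
    ev_agency = REWARD * max(p_a, p_b)
    ev_forgo = REWARD * (p_a + p_b) / 2 + offer
    return ev_agency - ev_forgo


def p_choose_agency(voc, beta_agency, agency_bonus):
    """Hypothetical logistic choice rule for the agency decision.

    beta_agency scales sensitivity to the value of choice (instrumental
    value); agency_bonus shifts the decision toward agency regardless of
    reward (intrinsic value), expressed in tokens.
    """
    return 1.0 / (1.0 + math.exp(-beta_agency * (voc + agency_bonus)))


# Hypothetical machine probabilities and offers, chosen only for illustration.
voc_useful = value_of_choice(0.9, 0.1, offer=2)   # choice is worth 2 tokens
voc_useless = value_of_choice(0.5, 0.5, offer=3)  # choice costs 3 tokens

# With a 3-token bonus, an agent is indifferent even when agency costs 3 tokens.
print(voc_useful, voc_useless, p_choose_agency(voc_useless, 1.0, 3.0))
```

Under this sketch, a larger β_agency makes the probability of choosing agency track the value of choice more steeply, while a positive agency bonus raises that probability at every value of choice, which is one way to read the dissociation between age-varying sensitivity and age-invariant bias.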
Finally, as in Experiment 1, we found that learning from the outcomes of the slot machines varied depending on whether the computer or participant made the machine selection and as a function of outcome valence. Specifically, we again observed an Agency Decision × Outcome Valence interaction, β = −0.06, SE = 0.01, F(1, 444) = 28.34, p < .001. As in Experiment 1, post hoc paired t tests indicated that participants demonstrated a confirmation bias following their own machine selections, t(149) = 7.0, p < .001, mean α_choice+ = .30 (SE = .02), mean α_choice− = .12 (SE = .02), but not following those made by the computer, t(149) = −1.1, p = .269, mean α_comp+ = .18 (SE = .02), mean α_comp− = .22 (SE = .03). Here, we also observed decreasing learning rates with increasing age, β = −0.03, SE = 0.01, F(1, 148) = 5.38, p = .022. Taken together, findings from our preregistered online replication study mirrored those from our original in-person experiment, providing additional evidence for distinct developmental trajectories of the influence of intrinsic and instrumental value on agentic choice.

General Discussion
Here, we investigated how intrinsic and instrumental value shape agentic-choice preferences across development. Across two experiments, we found that from childhood to adulthood, participants demonstrated a consistent agentic-choice bias: They preferred to make choices even when forgoing agency would lead to greater reward. Both computational model-based analyses and a simpler regression revealed that this bias, which we interpret as reflecting participants' intrinsic valuation of choice, did not vary significantly in magnitude from middle childhood into early adulthood. Moreover, we found that participants' agency decisions were also sensitive to the instrumental value of choice, such that participants were increasingly likely to choose agency on trials in which doing so would lead to more reward. An interaction between age and the value of choice on agency decisions revealed that this sensitivity increased with age. Critically, parameter estimates from our fitted reinforcement-learning model revealed that age-related increases in the calibration of agency decisions to different contexts were not solely due to age-related changes in learning the rewards that different actions were likely to yield; even when accounting for individual and developmental differences in participants' learning, the use of beliefs about the instrumental value of choice to guide agency decisions increased with age.
Multiple cognitive mechanisms may have contributed to age-related change in sensitivity to the instrumental value of choice. Older participants' greater use of the value of choice to guide their agency decisions may have been driven by developmental improvements in the ability to accurately compute the expected value of choosing versus forgoing agency. Despite demonstrating effective learning of the machine values, younger participants may have had more difficulty than older participants in integrating learned machine values with explicit token offer amounts to compute the overall expected values of choosing versus forgoing agency. Prior work has demonstrated that expected-value estimation improves into adulthood (Rosenbaum & Hartley, 2019), as mathematical and probabilistic reasoning abilities develop (Donati et al., 2014; Geary, 2006; Schlottmann & Anderson, 1994). In addition, in this task, to effectively use the value of choice to guide their agency choices, participants had to think one step into the future, determining their first-stage agency decisions on the basis of the rewards they expected to earn from their second-stage machine selections. The ability to effectively plan multiple steps into the future improves across childhood and adolescence (Albert & Steinberg, 2011; Decker et al., 2016; Ma et al., 2022; Nussenbaum, Scheuplein, et al., 2020; Potter et al., 2017) and may also contribute to the use of instrumental value to guide agency choices.
Age-related change in sensitivity to the instrumental value of choice may also reflect general age-related decreases in decision noise, as has been observed in prior studies (Nussenbaum & Hartley, 2019). Indeed, we observed mixed evidence for the specificity of the age effect of the value of choice on agency decisions: Older participants in our task may have made more value-driven decisions overall, not just at the first stage in which they made their agency choice. Age-related decreases in decision noise could reflect improvements in value computation or shifts from a more exploratory to a more exploitative choice strategy (Giron et al., 2023; Gopnik, 2020). Here, our primary aim was to examine whether sensitivity to the instrumental and intrinsic value of choice varied with age. However, future work could tease apart the various potential influences on age-related change in sensitivity to the instrumental value of choice by more directly manipulating the complexity of the value computation involved in agency versus other types of decisions. For example, a variant of this experiment could present participants with explicit information about the value of choice on some trials and examine how this reduction in computational complexity influences agency decisions from childhood to adulthood. Future work could also include trials in which participants have to make equivalently complex two-stage decisions, but where the first stage does not involve choosing or forgoing agency. Differences in value-guided behavior across agency versus nonagency decisions could further elucidate whether the developmental trajectories of value computations that involve explicit consideration of oneself as an agentic being differ from those that do not; it may be that this explicit consideration is particularly demanding earlier in life, as children learn to weigh their beliefs about their own efficacy in bringing about desired outcomes with the potential costs of exerting control (Shenhav et al., 2021).
Across age, we observed a consistent preference for agentic choice, such that on average, participants sacrificed more than three tokens to select between the machines themselves. This bias toward agentic choice has been observed in many prior studies (Ackerlund Brandt et al., 2015; Bobadilla-Suarez et al., 2017; Bown et al., 2003; Cockburn et al., 2014; Leotti & Delgado, 2011; Munuera et al., 2022; Wang & Delgado, 2019; Wang et al., 2021), and it may be generally adaptive, particularly early in development: Choice provides the opportunity to learn whether actions are causally efficacious, promoting knowledge of environmental structure and estimates of one's agency in the world. However, people do not always value opportunities to choose. Unlike the relatively simple and low-stakes choices participants faced in our task, other decisions can be difficult or anxiety-provoking (Iyengar & Lepper, 2000; Patall, 2012; Shenhav & Buckner, 2014; Sidarus et al., 2019), potentially reducing the hedonic properties of choice. In addition, some choices may be relatively insignificant and may thus not warrant cognitive effort (Boureau et al., 2015). Making decisions can also be unpleasant when choosing between aversive, mundane, or numerous options (Botti et al., 2009; Iyengar & Lepper, 2000; Leotti & Delgado, 2014; Shenhav et al., 2018). The learning context of our task may have also introduced additional motivations for agentic choice: Participants may have wanted to resolve epistemic uncertainty about specific machine options, leading them to choose agency so that they could explore strategically, even when they knew it was likely to lead to less reward on any given trial (Meder et al., 2021; Molinaro et al., 2023; Nussenbaum et al., 2023; Somerville et al., 2017). A more complete and general account of age-related change in agentic-choice preferences will require replicating and extending our findings to varied contexts in which different properties of choice (such as its difficulty, valence, and utility in resolving uncertainty) systematically vary.
Here, we also found that making choices influenced how participants learned about the value of different actions. Our analysis of model-derived learning rates revealed that across age, when participants selected between the machines themselves, they updated their beliefs to a greater extent following wins versus losses. However, they did not demonstrate this learning-rate asymmetry when the computer selected between the machines. This type of confirmation bias (Palminteri & Lebreton, 2022) may cause participants to persistently overestimate the rewards they earn by making their own choices. It is possible that over long timescales such learning distortions contribute to agentic choice acquiring intrinsic value. Whereas the present study focused on examining the relative contributions of instrumental and intrinsic value to agentic choice preferences across age, future work could further decompose and assess the factors that give rise to the intrinsic value of choice in the first place. It may be that the age invariance of the intrinsic value of choice that we observed emerges from combinations of different factors (e.g., learning biases, reward sensitivity) that change with experience over developmental time.
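The link between a learning-rate asymmetry and overestimated reward can be illustrated with a short sketch. It assumes a standard delta-rule update with separate learning rates for wins and losses; this functional form, the function names, and the 50% win probability are assumptions made for illustration, and only the mean learning rates are taken from the Experiment 2 results reported above.

```python
def update(q, reward, alpha_pos, alpha_neg):
    """One delta-rule update with outcome-dependent learning rates.

    Assumption: wins (reward > 0) are weighted by alpha_pos and losses
    (reward == 0) by alpha_neg.
    """
    alpha = alpha_pos if reward > 0 else alpha_neg
    return q + alpha * (reward - q)


def expected_fixed_point(p_win, alpha_pos, alpha_neg, reward=10.0):
    """Value estimate at which the expected update is zero.

    Setting p * a_pos * (R - Q) = (1 - p) * a_neg * Q and solving for Q gives
    Q = R * p * a_pos / (p * a_pos + (1 - p) * a_neg).
    """
    up = alpha_pos * p_win
    down = alpha_neg * (1.0 - p_win)
    return reward * up / (up + down)


# Hypothetical machine that pays 10 tokens half the time (true value: 5).
own_choice = expected_fixed_point(0.5, alpha_pos=0.30, alpha_neg=0.12)
computer = expected_fixed_point(0.5, alpha_pos=0.18, alpha_neg=0.22)
print(round(own_choice, 2), round(computer, 2))  # about 7.14 versus 4.5
```

In this sketch, the asymmetric learning rates observed after participants' own selections pull the long-run value estimate well above the machine's true expected value of 5 tokens, whereas the roughly symmetric rates observed after computer selections leave it close to the true value.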
Future work will also be required to test the generalizability of our findings. Participants in this study were a community sample of children, adolescents, and young adults from the New York City (NYC) area (Experiment 1) and from the United States (Experiment 2). Prior analyses of participants in a different behavioral experiment in our lab (Nussenbaum, Prentis, & Hartley, 2020) who were drawn from the same database as participants in Experiment 1 revealed that in-person participants tended to come from homes with two to three times the average annual income of the surrounding community (average annual household income of lab sample: $153,137; NYC average annual household income: $55,191) and with higher levels of parental education (average years of parental education of lab sample: 16.7, i.e., college degree; percentage of NYC adults over age 25 with a bachelor's degree: 36.2%). Similarly, online study participants in Experiment 2 came from homes with higher annual household incomes than the U.S. population from which they were sampled (median annual household income of study participants: $100,000–$200,000; USA median annual household income: $74,580 in 2022). Prior research has revealed that the development of basic learning and choice mechanisms is influenced by the presence of both stress (Hanson et al., 2017; Harms et al., 2018) and enrichment opportunities (Amso et al., 2019; Sheridan et al., 2017) in early-life environments, which may vary systematically with socioeconomic status. It is possible that developmental trajectories of sensitivity to the instrumental value of choice are similarly influenced by early-life exposure to stress and enrichment. Further, the development of agentic choice preferences themselves may be highly dependent on early experiences during which people learn the efficacy of their own actions: An extensive cross-species literature has revealed that exposure to uncontrollable stressors may lead to "learned helplessness" (Maier & Seligman, 2016), so that organisms stop seeking opportunities for agentic choice, even in controllable environments. Finally, there may also be cultural differences in agentic-choice preferences: Individuals who grow up in different sociocultural environments may form different beliefs about self-efficacy (Oettingen, 1995) and the value of making their own choices.
Here, by developing a novel task, we sought to address how preferences for agentic choice (and, critically, the flexibility of those preferences to adapt to different contexts) change across development. From early in life, people learn to act as agentic beings in the world, tailoring their behavior to exert causal effects on their environments. Critically, each instance of acting in the world is preceded by a decision about whether to act freely. Here, we found evidence for distinct developmental trajectories of sensitivity to the intrinsic and instrumental value of agentic choice, a decoupling that may be adaptive. The early-emerging, age-invariant, positive intrinsic value of choice instills a bias toward action that may promote greater opportunity for individuals to learn about their sphere of influence in the world. Later, this simple bias toward choice may be increasingly augmented by a more complex algorithm that informs when to forgo such opportunities. Together, bias and flexibility in people's decisions to seek or forgo opportunities for choice may underlie the development of adaptive agentic action.

Fig. 1. Example trial of agency task. Each trial (a) began with the agency-decision stage, in which participants viewed the upcoming arcade room and slot-machine pair. Participants had to choose between accepting a variable offer amount (0–6 tokens) and forgoing agency (i.e., allowing a coin flip to randomly determine their machine selection), or rejecting that offer and choosing agency (i.e., selecting one of the machines for themselves). After participants made their agency decision, the task proceeded to the machine-selection stage. If the participants had chosen to forgo agency, they viewed an animated coin flip and were instructed to select the machine that matched the color on which the coin landed. If the offer was rejected, participants selected between the machines for themselves. Once a machine was selected, participants viewed the outcome (either 10 or 0 tokens). If participants chose to forgo agency, they additionally received the tokens offered in the agency-decision stage. Each of the three arcade rooms (b) contained a pair of slot machines that paid 10-token rewards with different probabilities on each trial.

Fig. 3. Agency decisions across the experiment. Participants across age demonstrated a bias toward choosing agency even when its expected value was negative. In addition, participants were increasingly likely to choose agency as the value of choice increased, with older participants demonstrating a stronger influence of the value of choice on their agency decisions. Moreover, the effect of the value of choice on agency decisions increased across trials. In (a), points show the average proportion of trials in which participants across age chose agency at each value of choice level; error bars show standard errors of the mean. Trials were separated into two bins and participants into three age groups for visualization purposes. Both variables were treated continuously in statistical analyses. An optimal participant should always choose agency when the value of choice is positive and forgo agency when the value of choice is negative. In (b) and (c), points show the fixed effects plus participant-specific random slopes of the value of choice (b) and VoC × Trial (c) from a mixed-effects logistic regression model examining the influence of the value of choice, trial, and their interaction on agency decisions. The line shows the best-fitting linear regression through the points, with the shaded region showing 95% confidence intervals.

Fig. 4. Model comparison results. Akaike information criterion (AIC) values indicated that across age groups, the fourα_twoβ model with four learning rates, two inverse temperatures, and an agency bonus best fit the data. Bars in (a) show average AIC values for each model within each age group; bars in (b) show the difference between the overall mean AIC of the best-fitting model (fourα_twoβ) and the mean AIC values of the other models with an agency bonus in the comparison set.
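For readers unfamiliar with the criterion, the sketch below shows the standard AIC formula, AIC = 2k − 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; lower values indicate a better trade-off between fit and complexity. The log-likelihood values and the five-parameter comparison model are hypothetical and are not taken from the reported model fits.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical log-likelihoods for a single participant.
seven_param_model = aic(log_likelihood=-100.0, n_params=7)  # e.g., fourα_twoβ
simpler_model = aic(log_likelihood=-104.0, n_params=5)      # hypothetical alternative

# The richer model is preferred only if its fit improves enough to offset
# the penalty of 2 points per additional parameter.
print(seven_param_model, simpler_model)
```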

Fig. 5. Model parameters. The best-fitting reinforcement-learning model (a) included seven free parameters that captured individual differences in how participants learned the values of the slot machines and used those values to guide their first- and second-stage choices. Points show individual participants' parameter estimates; the lines represent the best-fitting linear regression through the points, and the shaded regions show 95% confidence intervals. Participants demonstrated a confirmation bias (b) in which they updated their beliefs about the values of the machines more following positive versus negative outcomes; however, this was observed only when they selected the machines on their own. Error bars show standard errors across participant means.

Our analyses of participants' agency decisions indicate that sensitivity to the value of choice increased both across development and across trials. However, our value of

Fig. 2. Optimal choices at the machine-selection stage. Across trials (here plotted across blocks of 21 trials for visualization) and conditions, participants learned to select the more rewarding slot machines. Learning performance did not significantly vary across age. Smaller points indicate participant-level averages, large points indicate age-group averages, lines indicate the best-fitting linear regression (fitted to age-group averages), and shaded regions represent 95% confidence intervals. The dashed lines indicate chance-level performance. Participants were separated into three age groups for visualization; age was treated continuously in statistical analyses.