Abstract
The distinction between buying and “just browsing” illustrates how people can evaluate potential rewards with or without the intent to choose between them. A common network has been implicated across these two decision contexts, including regions of ventromedial prefrontal cortex and the posterior midline. However, recent work has begun to dissociate sub-components of this reward circuit, distinguishing a medial orbitofrontal (mOFC) Network from a rostral anterior cingulate (rACC) Network. These findings suggest that the rACC Network may play a relatively automatic role in appraising choice options whereas the mOFC Network may instead be more involved in choice comparison. We test this hypothesis by varying an individual’s goals when approaching an option set. Participants undergoing fMRI were instructed to appraise how much they liked a set of products (Like) or to choose the product they most prefer (Choose). Set appraisal was driven by the average value of the items in a set, and correlated with activity in the rACC Network. Critically, this network tracked set liking when it was task-relevant (Like trials) and task-irrelevant (Choose trials). The mOFC Network was sensitive to evaluation condition, more active during Choose than Like trials. These regions dissociate, with mOFC selective for evaluation type but not appraisal, whereas the reverse was true for rACC. rACC additionally tracked how certain the participant was in both types of evaluations. These findings are consistent with the possibility that different circuits are involved in appraising the overall value of a set of options versus choosing which option is best.
Significance statement
People are capable of evaluating items to choose amongst them or to simply appraise (“browse”) the set. Despite both tasks requiring an evaluation of one’s options, choice and appraisal are associated with different phenomenological experiences. It has therefore been proposed that these processes draw on different but adjacent neural circuits, with an appraisal-related network triggering more automatic reactions to one’s options and a choice-related network engaging in explicit comparison between them. We directly test this hypothesis, showing that these forms of evaluation engage dissociable components of a broader reward circuit. These findings suggest that decisions about how good one’s options are (and possibly whether to approach them) are driven by different mechanisms than decisions about which option is best.
When evaluating a variety of options – from candy bars to cars to houses – people can take on one of two roles: (a) actively choosing between the items or (b) “just browsing” (i.e., appraising the available options). These modes of evaluation seem to be distinguishable phenomenologically, yet little is known about the degree to which they draw on shared or distinct mechanisms. Different lines of research have examined the neural mechanisms associated with appraising the value of an isolated item (Knutson et al., 2007; Plassmann et al., 2007; Lebreton et al., 2009) and others have examined the process of choosing one item from a set (Levy and Glimcher, 2012; Bartra et al., 2013; Clithero and Rangel, 2013). However, it remains unclear how people appraise a set of items when they don’t have the explicit goal of selecting between them. Put simply, to what extent are the mechanisms involved in browsing also involved in deciding what to buy?
Overlapping circuitry has been implicated in representing single-item appraisal and multi-item choice (Knutson et al., 2007; Peters and Büchel, 2010; Levy and Glimcher, 2012; Bartra et al., 2013; Clithero and Rangel, 2013). This includes regions of ventral striatum, ventromedial prefrontal cortex (vmPFC), and the posterior midline. However, recent studies suggest that finer distinctions can be made within this circuit. Functional and structural distinctions have been documented between more dorsal and more ventral regions of vmPFC, and similarly for the posterior midline (Price and Drevets, 2010; Smith et al., 2010; Roy et al., 2012; Wallis, 2012; Clithero and Rangel, 2013; Córcoles-Parada et al., 2017). The same sets of regions have also been dissociated into separate sub-networks of the Default Mode Network using resting-state functional connectivity (Vincent et al., 2006; Andrews-Hanna et al., 2010; Shenhav and Buckner, 2014; Christoff et al., 2016).
Recent work has provided indirect evidence suggesting that these separable components of the reward network may be differentially involved in appraisal versus choice (Shenhav and Buckner, 2014; Figure 2A). In these studies, an mOFC Network – consisting of medial orbitofrontal cortex (mOFC), retrosplenial cortex (RSC), and left middle frontal gyrus (lMFG) – was sensitive to a combination of choice difficulty and the value of one’s choice options. This pattern was consistent with prior evidence of mOFC’s involvement in value-based choice comparison (e.g., Fellows, 2006; Noonan et al., 2012; Strait et al., 2014). A second network – consisting of rostral (pregenual) anterior cingulate cortex (rACC), posterior cingulate cortex (PCC), and ventral striatum (VS) – tracked the positive affect that was evoked by the choice options, but not the difficulty of choosing. Activity in this rACC Network was consistent with a more automatic or reflexive appraisal of these items rather than a process of comparing between them (cf. Lebreton et al., 2009; Grabenhorst and Rolls, 2011).
Based on such findings, we make two key predictions concerning the different roles these networks may play when people are “browsing” a set of items versus selecting from among them. First, we predict that the mOFC Network will be more active when participants are making a choice relative to when they are appraising a set. Second, we predict that the rACC Network will consistently signal the overall value of a set of items, regardless of whether the overall set value is task-relevant (appraisal) or task-irrelevant (choice).
We test these predictions in an fMRI study using a “Like vs. Choose” task. Participants viewed sets of products and either estimated how much they liked the whole set or selected the product they preferred most from it. Critically, both decisions required participants to consider the value of each item in the set, but differed in terms of whether they required a composite of those values or a comparison between them. As predicted, we find that the rACC Network signaled overall set liking irrespective of the task at hand, and that the mOFC Network was more active when participants were choosing rather than appraising. Our task design also allowed us to compare neural signals related to decision certainty for each of these types of evaluation. We show that both types of certainty are encoded in the rACC Network, independently of set liking. Collectively, these findings point to separable mechanisms for browsing versus choosing, suggesting that the circuits that draw us to the store window may be different from those that guide our in-store purchases.
Materials and Methods
Participants
Thirty-one individuals (54.8% female; mean age = 25.0, SD = 4.4) participated in this study. Of these, one was excluded for excessive head motion and three were excluded due to an error in recording behavioral responses on the Like/Choose Task, leaving 27 individuals (55.6% female; mean age = 24.2, SD = 4.0) in the final analysis.
Experimental Design
Participants performed the study in three phases occurring sequentially within the same experimental session. In Phase 1, participants evaluated how much they would hypothetically like to have each of a series of products on a scale of 0 (‘not at all’) to 10 (‘a great deal’). The product sets were partially tailored in relation to the participant’s gender (total products evaluated: males = 310, females = 328).
In Phase 2, participants performed the Like/Choose Task (LCT) while in the scanner (Figure 1). On each trial of this task, participants were shown four of the previously-rated products and asked to either evaluate the set as a whole (Like trials; 1-4 Likert scale ranging from lowest to highest set attractiveness) or select the product they most prefer (Choose trials). In order to ensure that participants were able to view all of the products displayed before the explicit evaluation period, the products appeared for 3s at the start of each trial without information about the type of evaluation required. A LIKE or CHOOSE cue then appeared on the screen to indicate the task on the current trial. In order to separate evaluation and response selection periods, after the cue appeared participants were given an unlimited amount of time to make their decision and instructed to press a key once they had made their decision. This keypress ended the Evaluation period and a fixation cross was shown on the screen for a variable inter-trial interval (ITI; 2-7s), followed by the Selection period.
For Like trials, the Selection period consisted of the numbers 1-4 appearing at the bottom of the screen and the four products appearing at the top of the screen. Both numbers and products were shown in a random horizontal arrangement. Participants used the keypad to move a cursor left (right index finger) or right (right ring finger) before submitting their response (right middle finger). Participants were required to indicate a response within 5s during the Selection period. This deadline was intended to reinforce the idea that the response should have already been determined during the Evaluation period. Subsequent RT analyses confirmed that participants were conforming to these expectations (see Behavioral Results). The Selection period for Choose trials took place in a nearly identical manner, except that the liking rating numbers were replaced by # symbols, and participants moved the cursor to indicate the item they wished to choose (see Figure 1).
Each of the 120 LCT trials featured a product set generated based on the participants’ own preferences. Briefly, products were rank-ordered on the basis of Phase 1 ratings, and the resulting distribution split into tertiles (Low, Mid, High). Similar-value product sets were generated by selecting 60 non-overlapping sequences of four consecutively rank-ordered products (20 sets from each tertile). Mixed-value sets were generated by randomly sampling four products from across the entire value distribution (without replacement). Sets were constructed such that each product would appear exactly twice (once in a similar-value set, once in a mixed-value set). Each product set was only seen once while in the scanner, either in the Like or the Choose condition.
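The set-generation procedure above can be illustrated with a minimal Python sketch. It is not the authors' stimulus code: the function name `build_product_sets`, the way windows are chosen within a tertile (random sampling of aligned blocks of four), the tie-breaking rule, and the toy ratings are all assumptions made for illustration. Mixed-value sets are formed here by reshuffling the same 240 products and dealing them out in fours, which is one simple way to satisfy the constraint that each product appears once in a similar-value set and once in a mixed-value set.

```python
import random


def build_product_sets(ratings, sets_per_tertile=20, set_size=4, seed=0):
    """Return (similar_sets, mixed_sets) from a dict of product -> rating.

    Hypothetical helper: the source does not specify how windows were picked.
    """
    rng = random.Random(seed)
    # Rank-order products by rating; ties are broken by product id (assumption).
    ranked = sorted(ratings, key=lambda p: (ratings[p], p))
    n = len(ranked)
    cuts = [0, n // 3, 2 * n // 3, n]  # Low, Mid, High tertile boundaries

    similar_sets = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        tertile = ranked[lo:hi]
        # Non-overlapping windows of consecutively ranked products.
        windows = [tertile[i:i + set_size]
                   for i in range(0, len(tertile) - set_size + 1, set_size)]
        # Raises ValueError if a tertile cannot supply enough windows.
        similar_sets.extend(rng.sample(windows, sets_per_tertile))

    # Mixed-value sets: reshuffle the products already used and deal them
    # out in groups of four, so every product appears exactly twice overall.
    used = [p for s in similar_sets for p in s]
    rng.shuffle(used)
    mixed_sets = [used[i:i + set_size] for i in range(0, len(used), set_size)]
    return similar_sets, mixed_sets


# Toy ratings on the 0-10 scale (fabricated for illustration, not study data).
toy_ratings = {f"p{i:03d}": (i * 7) % 11 for i in range(310)}
similar, mixed = build_product_sets(toy_ratings)
print(len(similar), len(mixed))
```

With 310 rated products this yields 60 similar-value and 60 mixed-value sets covering 240 distinct products, matching the 120 trials described above.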
After exiting the scanner, participants completed a counterbalanced version of the LCT they had performed in the scanner (i.e. providing Like ratings for sets that had been presented in the Choose condition, and vice versa). They then also rated their anxiety and confidence associated with each choice; these ratings were taken in service of a separate set of hypotheses, and are not reported further here.
Neuroimaging Parameters
Scans were acquired on a Siemens Trio 3T scanner with a 12-channel phased-array head coil, using the following gradient-echo planar imaging (EPI) sequence parameters: repetition time (TR) = 2500 ms; echo time (TE) = 30 ms; flip angle (FA) = 90°; 2.5 mm voxels; 0.50 mm gap between slices; field of view (FOV) = 210 × 210; interleaved acquisition; 37 slices. To reduce signal dropout in regions of interest, we used a rotated slice prescription (30° relative to AC/PC) and a modified z-shim prepulse sequence. The slice prescription encompassed all ventral cortical structures but limited regions of dorsal posterior parietal cortex. Structural data were collected with T1-weighted multi-echo magnetization prepared rapid acquisition gradient echo image (MEMPRAGE) sequences using the following parameters: TR = 2200 ms; TE = 1.54 ms; FA = 7°; 1.2 mm isotropic voxels; FOV = 192 × 192. Head motion was restricted with a pillow and padded head clamps. Stimuli were generated using Matlab’s Psychophysics Toolbox and were viewed through a mirror mounted on the head coil. Participants used their right hand to respond with an MR-safe response keypad.
Statistical Analysis: Behavioral Data
Behavioral data were analyzed with mixed-effects regressions, accounting for individual subject variance as random effects. Evaluation times were positively skewed and so were log-transformed before being analyzed. We analyzed decision certainty by using indices previously validated for estimating certainty (or overall strength of evidence) for the respective form of evaluation (Krajbich and Rangel, 2011; Lebreton et al., 2015; Solway and Botvinick, 2015). For Choose trials, certainty was estimated as the absolute difference between the value of the chosen item and the average of the remaining items (based on Phase 1 ratings). For Like trials, it was estimated based on the extremity of the participant’s Like rating on a given trial, with a binary variable classifying responses of 1 or 4 (least or greatest liking) as high certainty and 2 or 3 (intermediate liking) as low certainty. We further analyzed how appraisal and evaluation time varied with the overall value of the set, calculated as the average value of the items in the set. Given that the number of items was held constant across trials, this study was not designed to distinguish between effects related to this average value estimate and those related to total set value.
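The three trial-level indices defined above can be written out as a short Python sketch. The function names are hypothetical, and the item values in the example are invented Phase 1 ratings rather than data from the study.

```python
from statistics import mean


def set_value(item_values):
    """Overall set value: the average Phase 1 rating of the items in the set."""
    return mean(item_values)


def choose_certainty(item_values, chosen_index):
    """Choose-trial certainty: |chosen value - mean of the remaining values|."""
    others = [v for i, v in enumerate(item_values) if i != chosen_index]
    return abs(item_values[chosen_index] - mean(others))


def like_certainty(like_rating):
    """Like-trial certainty: 1 for an extreme rating (1 or 4), 0 for 2 or 3."""
    if like_rating not in (1, 2, 3, 4):
        raise ValueError("Like ratings are on a 1-4 scale")
    return 1 if like_rating in (1, 4) else 0


# Invented example: four items rated 8, 5, 4 and 3; the first item is chosen.
values = [8, 5, 4, 3]
print(set_value(values))            # average set value
print(choose_certainty(values, 0))  # chosen value minus mean of the other three
print(like_certainty(4))            # extreme rating counts as high certainty
```

Because every set contains four items, `set_value` is proportional to the summed value of the set, which is why the average and total cannot be separated in this design.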
Statistical Analysis: Neuroimaging Data
Preprocessing
fMRI data were analyzed using SPM8 (Wellcome Department of Imaging Neuroscience, Institute of Neurology, London, UK). Preprocessing consisted of realigning volumes within participant, resampling to 2mm isotropic voxels, nonlinear transformation to align with a canonical T2 template, and spatial smoothing with a 6mm full-width at half-max (FWHM) Gaussian kernel.
Trial-wise ROI analyses
Preprocessed data were submitted to linear mixed-effects analyses, using a two-step procedure. First, we generated BOLD signal change estimates for each trial using a first-level general linear model (GLM) in SPM. This GLM separately modeled stick functions with onsets at the Evaluation and Selection period of each trial. Trials were concatenated across the two task blocks, and additional regressors were included to model within-block means and linear trends. Finally, the GLM was estimated using a reweighted least squares approach (RobustWLS Toolbox; Diedrichsen and Shadmehr, 2005) in order to minimize the influence of outlier time-points (e.g., due to head motion). After estimating this first-level GLM, we extracted beta estimates for each trial from our primary regions of interest (ROIs; see below), transformed these beta estimates with the hyperbolic arcsine function (to achieve normality), and then analyzed trial-to-trial variability in these BOLD estimates with linear mixed-effects regressions (using Matlab’s fitlme function). Fixed effect degrees of freedom were estimated using Satterthwaite approximation. Projected values (Ŷ) and standard errors shown in Figure 2 were generated using Matlab’s predict function, based on the relevant mixed-effects regressions.
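To make the second step more concrete, the following Python sketch shows the hyperbolic arcsine transform applied to trial-wise beta estimates, followed by a simple within-participant regression slope. The slope is only a simplified stand-in: the analyses reported here used linear mixed-effects regressions in Matlab, which this sketch does not reproduce. The function names and the numbers are hypothetical.

```python
import math


def asinh_transform(betas):
    """Apply the hyperbolic arcsine to each trial-wise beta estimate.

    asinh is roughly linear near zero and roughly logarithmic for large
    magnitudes, so it compresses extreme betas while preserving their sign.
    """
    return [math.asinh(b) for b in betas]


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x for one participant.

    Simplified illustration only; it is not a mixed-effects model.
    """
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Invented betas for one participant, including one extreme value.
raw_betas = [-0.4, 0.1, 0.6, 1.2, 40.0]
set_liking = [1, 2, 2, 3, 4]
transformed = asinh_transform(raw_betas)
print(ols_slope(set_liking, transformed))
```

In a full analysis the transformed betas from all participants would enter a single model with participant-level random effects, rather than being summarized one participant at a time.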
Whole-brain analyses
We supplemented the ROI analyses above with whole-brain GLMs. For these analyses, first-level GLMs again separately modeled events occurring at Evaluation and Selection periods of each trial. The Evaluation period was modeled as a single event, modulated by parametric regressors of interest, including (log) evaluation time, task condition (indicator variable for Choose vs. Like), set liking, and decision certainty. Parametric regressors were not orthogonalized with respect to one another, allowing them to compete for variance independently. Missed trials (failures to choose a response within 5s in the Selection period) occurred rarely (0.4% of trials) and were modeled as a separate condition. As above, trials were concatenated and appropriate regressors were included to account for block-wise effects, and the GLMs were estimated with RobustWLS. We performed second-level random-effects analyses on the beta estimates generated at the first level, and whole-brain group statistical maps were generated using one-sample t-tests over these contrasts. These maps were generated with a voxelwise p<0.005 and extent-thresholded to achieve a whole-brain family-wise error cluster-corrected p<0.05. These maps were projected onto the Caret-inflated cortical surface (Van Essen, 2005).
Regions of interest
In order to examine activity in the rACC and mOFC Networks (Figure 2A), we generated ROIs based on whole-brain statistical maps from two fMRI experiments reported in Shenhav & Buckner (2014). The rACC Network mask was defined based on regions in which activity had been correlated with positive affect experienced when viewing one’s options (rACC, ventral striatum, and PCC). The mOFC Network was defined based on regions that had shown greater activity for choices between two high-value options than for choices between a high-value option and a low-value option (mOFC, RSC, left MFG). This contrast compared choice sets that were matched for the best outcome but differed in the difficulty of comparison. For each of these two contrasts, we generated a mask consisting of a conjunction of the network regions that were consistently active across both previous fMRI studies (based on a voxelwise threshold of p<0.001 and cluster extent threshold of 50 voxels within each study). We excluded from each mask any voxels that fell within both the rACC and mOFC Networks, as well as any voxels that were part of a third network that tracked choice anxiety across these earlier studies (consisting primarily of dorsal ACC and anterior insula). Orthogonal analyses of this anxiety-related network are reported elsewhere (Shenhav et al., submitted).
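The mask logic described above (conjunction across the two earlier studies, then removal of shared and anxiety-network voxels) can be sketched with plain Python sets. This is an illustration under stated assumptions: voxels are represented as (x, y, z) tuples that have already survived the voxelwise and cluster-extent thresholds, the function names are hypothetical, and the coordinates are invented.

```python
def conjunction_mask(study1_voxels, study2_voxels):
    """Voxels that passed threshold in both earlier studies."""
    return set(study1_voxels) & set(study2_voxels)


def finalize_masks(racc_mask, mofc_mask, anxiety_mask):
    """Drop voxels shared by the two networks and any anxiety-network voxels."""
    shared = racc_mask & mofc_mask
    racc_final = racc_mask - shared - anxiety_mask
    mofc_final = mofc_mask - shared - anxiety_mask
    return racc_final, mofc_final


# Invented voxel coordinates, for illustration only.
racc = conjunction_mask({(0, 40, 4), (2, 42, 4), (0, 36, -8)},
                        {(0, 40, 4), (2, 42, 4), (0, 36, -8), (4, 44, 6)})
mofc = conjunction_mask({(0, 36, -8), (0, 30, -14), (2, 32, -14)},
                        {(0, 36, -8), (0, 30, -14), (2, 32, -14)})
anxiety = {(2, 42, 4)}

racc_final, mofc_final = finalize_masks(racc, mofc, anxiety)
print(sorted(racc_final))
print(sorted(mofc_final))
```

After this step the two final masks share no voxels, so an effect in one network ROI cannot arise from voxels that also belong to the other.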
We followed up these functionally-defined network analyses with analyses that targeted anatomically-defined ROIs for rACC (Area 24) and mOFC (Area 14) based on a probabilistic anatomic atlas by Córcoles-Parada and colleagues (2017) (Figure 4A). These ROIs were generated by thresholding the atlas such that each ROI only contained voxels with 50% or greater probability being classified as part of the given anatomical region.
Results
Behavioral Results
Our first goal was to understand how the appraisal process incorporated individual item values, so that we could better interpret its relationship to choice. In particular, preference for the set could be driven by the value of the best item in the set, the value of the average item in the set, or some combination of the two. When regressing set liking on both of these variables, we found that it was most strongly associated with the average option value (β = 0.59, t(26.0) = 10.1, p < 0.001). The value of the best item in the set exerted a non-significant positive influence on set liking (t(25.6) = 1.7, p = 0.10).
We next examined the factors that contributed to faster decisions in the Like or Choose Evaluation period (collapsing across trials completed inside and outside the scanner). These response times offer a potential proxy for the strength of decision evidence provided on a given trial, and therefore the certainty with which that decision was made. In the Like condition, evaluation RTs demonstrated a strong inverse U-shaped relationship with average set value, such that evaluations were fastest for the highest and lowest valued sets (linear effect: β = -0.09, t(24.6) = -3.4, p = 0.003, quadratic effect: β = -0.11, t(25.3) = -3.9, p < 0.001). This distinct pattern is consistent with previous findings of increased certainty when responding at the extreme of a scale (Lebreton et al., 2015). Accordingly, Like RTs were significantly faster when participants indicated either the least or most liking for a set (β = -0.21, t(24.6) = -3.6, p = 0.001). In the Choose condition, evaluation times were best predicted by the absolute difference between the chosen value and the average of the remaining values (β = -0.25, t(29.5) = -9.4, p < 0.001), consistent with previous findings (Krajbich and Rangel, 2011; Solway and Botvinick, 2015). Unlike the quadratic effect observed in the Like condition, Choose evaluations only demonstrated a linear influence of set value (faster with more valuable choice sets; β = -0.09, t(25.8) = -4.6, p < 0.001), over and above the primary effect of value difference on these evaluations (Hunt et al., 2012; Shenhav and Buckner, 2014). Each of our two tasks thus offered independent indices of decision certainty via RT: for Choose trials, this was value difference, and for Like trials this was rating extremity.
Importantly, RTs in the Selection period (at the end of each trial) were not significantly affected by set value or extreme set values (|ts| < 1.85, ps > 0.05) and were only weakly affected by value difference (Choose trials: β = -0.04, t(93.8) = -2.1, p = 0.038). This suggests that participants were performing the task as intended, primarily making their decisions during the Evaluation period and before moving into Selection. Relatedly, during set appraisal (Like trials), Liking responses were not significantly associated with Choose-related variables, such as the relative value of the chosen item or the time taken for Choose evaluations (|ts| < 1.95, ps > 0.05). In addition to suggesting that the appraisal process was indeed taking place during the Evaluation period, this further suggests that appraisal was less likely to be significantly “contaminated” by choice or a prospective selection process between the options. When comparing the two tasks, we also found that evaluations were slower overall on Choose relative to Like trials (β = 0.24, t(26.0) = 3.6, p = 0.001); thus we include evaluation time as a covariate in all GLMs where these conditions are compared.
fMRI Results
We focused our neuroimaging analyses on two networks: an rACC Network (including rACC, PCC, and ventral striatum) previously associated with positive feelings towards a choice set and an mOFC Network (including mOFC, RSC, and left MFG) previously associated with difficult choices (see Figure 2A). Each network ROI was defined independently based on activation patterns in Shenhav & Buckner (2014).
Task-independent encoding of set liking
We had two key predictions regarding the rACC Network. First, we predicted that it would track how much participants like the choice set. Consistent with this, we found that activity in the rACC Network indeed increased parametrically with ratings of set liking during the Like task (β = 0.09, t(25.6) = 3.3, p = 0.003). Second, we predicted that this network would track set liking irrespective of the task being performed at the time (i.e., whether the participant was appraising the set or comparing the products with one another). Since participants gave both Like and Choose responses for each choice set (across Phases 2 and 3), we were able to test for correlates of set liking on Choose trials as well. As in the Like trials, activity in the rACC Network increased parametrically with ratings of set liking when participants were evaluating that set for the Choose task (β = 0.08, t(26.5) = 2.5, p = 0.020) (Fig. 2B, left).
Overall, the rACC Network tracked set liking across all trials (β = 0.09, t(26.2) = 3.3, p = 0.003), with no additional interaction between liking ratings and task condition (Choose vs. Like; t(25.9) = -0.5, p=0.61). The model demonstrating this effect of set liking controlled for evaluation time, which did not significantly correlate with activity in the rACC Network (t(27.3) = -0.8, p=0.43). Set liking also continued to be a significant predictor of rACC Network activity (β = 0.10, t(26.2) = 4.1, p<0.001) in a model that included set liking along with the set value and the value of the chosen item; on the other hand, set value and chosen value were not significant predictors of rACC Network activity in this model (|ts| < 1.85, ps > 0.05).
Differentiation of choice versus appraisal
We predicted that mOFC would be more active when engaged in choice comparison than when appraising a set (“browsing”), even though both tasks require participants to evaluate each of the items. Consistent with this, we found that mOFC Network activity was substantially greater for Choose relative to Like trials (β = 0.36, t(27.2) = 7.1, p<0.001; Fig. 2B, right).
Activity in the rACC Network also showed a reliable effect of task condition (β = 0.11, t(166.4) = 3.1, p = 0.003). However, we observed a network by task interaction demonstrating that the difference in activity for Choose versus Like was significantly greater in the mOFC Network than the rACC Network (F(1,65.7) = 25.0, p<0.001). For completeness, Figure 3A shows that the spatial distinctions between the mOFC and rACC Networks are reproducible within our own data, using a whole-brain analysis of set liking and task condition.
Overlapping task-dependent representations of decision certainty
Our behavioral results provided evidence for dissociable sources of decision certainty between the Choose and Like conditions. More certain decisions tend to be faster than less certain decisions (Festinger, 1943; Pleskac and Busemeyer, 2010; Kiani et al., 2014), and we found that Choose evaluations were fastest when relative chosen value was high (cf. Krajbich and Rangel, 2011; Hunt et al., 2012; Rangel and Clithero, 2015) and Like evaluations were fastest when the Like rating was at one of the extremes of the Likert scale (cf. Lebreton et al., 2015). Our task therefore offered a unique opportunity to directly compare these two forms of decision certainty within the same study.
During the Choose task, rACC Network activity increased with our estimate of choice certainty (the difference between the value of the chosen item and the average value of the remaining items; β = 0.10, t(38.1) = 3.5, p = 0.001). During the Like task, activity in this network increased with our estimate of appraisal certainty (whether the appraisal rating fell at one of the two extremes of set liking; β = 0.15, t(39.7) = 2.6, p = 0.012). Each of these regression models controlled for evaluation time and set liking. Thus rACC Network activity tracked both types of decision certainty. This is consistent with separate lines of research demonstrating that regions of the rACC Network track appraisal certainty when rating the pleasantness of a single item (Lebreton et al., 2015; De Martino et al., 2017) and track choice certainty when selecting among multiple potential rewards (e.g., FitzGerald et al., 2009; Lim et al., 2011; Hunt et al., 2012; De Martino et al., 2013; Shenhav et al., 2016; see also White et al., 2014).
Notably, in contrast to the task-independent signals of set liking observed in the rACC Network, we found that decision certainty signals in this network were task-dependent (Figure 3B). rACC Network activity was not positively correlated with value difference in the Like condition (β = -0.04, t(221.9) = -1.6, p = 0.10), nor with extremity of set liking in the Choose condition (β = -0.002, t(24.8) = -0.03, p = 0.97). Thus, neural correlates of Choose/Like certainty were differentiable based on the nature of the evaluation, likely reflecting post-decision signals. These task-specific signals of certainty (both behavioral and neural) also provide additional evidence that participants were not simply engaging in both types of evaluations on each trial.
Dissociable roles for rACC and mOFC
The ROI results above suggest that the rACC and mOFC Networks are, to different degrees, both sensitive to task condition (Like vs. Choose). These analyses also showed that activity in the mOFC Network tracked set liking across conditions (β = 0.07, t(30.2) = 2.6, p = 0.014), just as in the rACC Network. However, because these networks were defined functionally (based on patterns of activity within previous studies), they lack anatomical specificity. We therefore performed a final set of analyses within ROIs defined based on anatomical boundaries specific to rACC (Area 24) and mOFC (Area 14) (Córcoles-Parada et al., 2017). Similar to the findings above, these analyses showed that both regions were separately associated with each of the three variables of interest (rACC: t_condition(58.7) = 4.2, p<0.001; t_liking(32.1) = 3.2, p<0.005; t_certainty(38.9) = 4.0, p<0.001; mOFC: t_condition(81.8) = 5.2, p<0.001; t_liking(26.5) = 2.0, p<0.06; t_certainty(155.6) = 2.3, p<0.05). However, when including both regions within the same regression we find that these functional associations dissociate, with mOFC sensitive to task condition but not liking or certainty, and rACC demonstrating the opposite profile (Figure 4; Table 1).
Discussion
People can appraise potential rewards with or without the intent to choose between them. To better understand the mechanisms underlying these different but overlapping processes, we directly contrasted appraisal and choice and uncovered two key findings. First, medial OFC and associated regions (RSC and MFG) were more active when participants compared options to make a choice, rather than when they appraised the overall value of the choice set. Second, rostral ACC and associated regions (PCC and VS) tracked how much participants liked those options, irrespective of whether they were tasked with reporting set liking on a given trial. The rACC Network also tracked the individual’s certainty in the evaluation they were making on that trial. Together these results provide valuable insight into the mechanisms underlying different forms (or components) of reward evaluation.
A substantial body of work has reported value signals across regions of the rACC and mOFC Networks, both when the task requires participants to make a choice (Grabenhorst and Rolls, 2011; Levy and Glimcher, 2012; Bartra et al., 2013; Clithero and Rangel, 2013) and when it does not (Lebreton et al., 2009; Tusche et al., 2010; Levy et al., 2011; Grueschow et al., 2015). A common interpretation of these findings is that the value signals being uncovered in the two cases reflect a common valuation circuit engaged in an implicit decision process, irrespective of the task being performed. However, our findings speak to an alternative interpretation (Grabenhorst and Rolls, 2011; Shenhav and Buckner, 2014), which proposes that these signals reflect two different valuation processes: one involving the triggering of stored affective (cf. Pavlovian) associations between stimuli and potential outcomes, the other involving a direct comparison between those outcomes, potentially via mutual inhibition (Hunt et al., 2012; Padoa-Schioppa, 2013; Strait et al., 2014).
In line with the latter proposal, our results suggest that rACC and connected regions may signal reflexive affective associations while mOFC and connected regions may be more directly involved in active choice comparison. We did find evidence that both networks were sensitive to set liking and task condition, consistent with the fact that these networks included adjacent sets of voxels. However, when specifically examining rACC and mOFC as simultaneous predictors, we find that they demonstrate dissociable roles in signaling set liking versus choice comparison. These findings leave open the possibility that both networks carry signals related to task condition and set liking, and/or that they reflect different stages of evaluation. Future research using measures with higher temporal resolution could shed light on this issue by examining the relative timing of these neural signals during appraisal and choice.
While only one of our conditions required participants to compare their options, both conditions required participants to select a response, whether it was selecting one of the four items or selecting their liking rating. Regions of the rACC and mOFC Networks have been shown to track variables related to choice selection, for instance the relative value of the chosen versus the unchosen item(s) (Rushworth et al., 2011; Rangel and Clithero, 2015). It has been proposed that these value difference signals may reflect the decision output itself (Hunt et al., 2012; Hunt and Hayden, 2017), and/or that they reflect a metacognitive signal related to the confidence or ease with which the decision was made (Shenhav et al., 2016). Separate research has shown that some of these same regions track similar metacognitive signals when rating one’s liking of an individual item (Lebreton et al., 2015) – these studies show that rACC tracks one’s confidence in that rating, indexed by how close the rating was to the endpoint of the scale (see also Guggenmos et al., 2016; De Martino et al., 2017). We replicated both of these findings in the current study, showing that the rACC Network tracked value difference signals during Choose trials and the extremity of ratings during Like trials. Importantly, because participants did not have information about the actions required to submit their choice when they were evaluating their options, neither these findings nor the findings above can be attributed to valuation/selection of specific motor effectors.
Our certainty-related results are notable both for the dissociation and integration they reveal. For example, one might ask whether participants automatically engage in both appraisal and choice during the evaluation period, irrespective of task instruction, and then “gate” the relevant response during the selection period. If that were true, we would expect to see both Like- and Choose-related decision certainty signaled on each trial. Instead, we found that certainty signals in the rACC Network only reflected the relevant task and not the irrelevant task (i.e., liking certainty only on Like trials, choice certainty only on Choose trials), suggesting that participants were only generating a response for a single task on each trial. This dissociation is particularly striking when juxtaposed with our finding that these regions tracked set liking in a task-independent manner, on both Like and Choose trials.
At the same time, the fact that liking and certainty signals converge in rACC is also noteworthy (cf. Lebreton et al., 2015; De Martino et al., 2017). There are at least two intriguing explanations for these convergent signals. One explanation follows from certain models of choice comparison, which predict that a region involved in such comparison should encode the overall value of a choice set (the key predictor of set liking in our study) and the relative value of the chosen item (the key predictor of certainty for our Choose trials) (Hunt et al., 2012; Hunt and Behrens, 2014). However, these accounts have only been applied to choice (not appraisal), and have typically been applied to mOFC (not rACC) because mOFC is believed to be more involved in choice comparison. Consistent with this general account, we found that mOFC activity was greater for Choose than Like trials, but the choice-relevant values were instead primarily tracked by rACC. An alternate explanation for these convergent liking and certainty signals in rACC is that both of them reflect forms of affective appraisal rather than an explicit component of a decision process. On this account, the rACC Network reflects one’s affective state, which is increasingly positive when viewing good options (irrespective of the task) and when more confident in one’s decision.
Our findings have potential implications for research into reward-related impulsivity and its relationship to other forms of valuation. They suggest that the systems that signal the availability of reward (and potentially propel approach behavior towards those rewards) are at least partially dissociable from the circuits involved in selecting among those rewards. This offers a potential mechanism for divergent phenomenology when “browsing” or facing a store window versus when “buying” or making choices inside the store. In addition, the patterns of findings we observed in the rACC Network have intriguing parallels in research on reward reactivity in impulse control disorders (e.g., to food and drug cues; Volkow and Baler, 2015; Boswell and Kober, 2016), and are consistent with the possibility that these rewards can drive reward-seeking behavior independently of goal-directed comparison (van der Meer et al., 2012; Vandaele and Janak, 2017). Additional research on this topic and its expression in real-world contexts could thus benefit from extending to approach-related behavior linking appraisal and choice (e.g., the act of entering the store).
Finally, by establishing a direct comparison between the processes involved in appraisal and choice, the current study can be combined with other findings to offer valuable insight into the processes that drive us to increase or decrease the size of our option sets. Indeed, this work suggests a tension between networks that support opposing preferences. While activity in the rACC Network and associated positive feelings may scale with the rewards on offer, separate networks appear to be involved in managing the decision process and in signaling the subjective costs (e.g., anxiety) of overcoming conflict between salient options (Shenhav and Buckner, 2014; Shenhav et al., submitted). As a result, whether one is appraising a set or choosing from it can affect how demanding or aversive the evaluation will be (Shenhav et al., submitted). Improving our understanding of the dissociations between these circuits may therefore hold promise for reducing the costliness of transitioning from being a browser to being a chooser.
Acknowledgments
The authors are grateful to Elizabeth Beam, Marina Burke, Erin Guty, Emily Levin and Erik Nook for assistance in data collection and Randy Buckner for helpful discussions.