ABSTRACT
Neurons in the lateral intraparietal area (LIP) exhibit both sensory and oculomotor preparatory responses. During perceptual decision making, the preparatory responses have been shown to track the state of the evolving evidence leading to the decision. The sensory responses are known to reflect categorical properties of visual stimuli, but it is not known if these responses also track evolving evidence. We compared sensory and oculomotor-preparatory responses in the same neurons during a direction discrimination task when either the discriminandum (random dot motion) or an eye movement choice-target was in the neuron’s response field. Both configurations elicited task related activity, but only the motor preparatory responses reflected evidence accumulation. The results are consistent with the proposal that evolving decision processes are supported by persistent neural activity in the service of actions or intentions, as opposed to high order representations of stimulus properties.
INTRODUCTION
The life of animals is a constant process of deciding what to do next based on, among other things, the perception of the world around them. In primates, perceptual decision making has evolved into an efficient mechanism of translating the perceived state of the world into possible motor actions (Cisek and Kalaska, 2005; Klaes et al., 2011; Kubanek and Snyder, 2015). The motor system receives continuous access to evolving perceptual decisions and maintains a graded level of preparedness based on the quality of the incoming evidence (Gold and Shadlen, 2000; Selen et al., 2012). This sensorimotor transformation is particularly evident in the parietal and prefrontal association cortices, where neurons encoding the motor actions associated with the choices on offer also represent evolving decisions (Kim and Shadlen, 1999; Roitman and Shadlen, 2002; Bollimunta and Ditterich, 2011; Ding and Gold, 2012; de Lafuente et al., 2015). Thus, perceptual decision making can be framed as a choice between available motor actions (Cisek, 2007; Shadlen et al., 2008; Cisek and Kalaska, 2010).
Yet, perceptual decisions do not feel like they are about potential actions but about propositions or stimulus properties. Indeed, one can make a decision without knowledge of the action that will be required to act on it. In such situations, one might expect neural circuits involved in motor planning to be irrelevant to the decision process (Gold and Shadlen, 2003). However, it has been shown that even then, neurons in the parietal association areas carry a representation of the properties of the stimulus that will be relevant for future actions (Freedman and Assad, 2006; Bennur and Gold, 2011; Goodwin et al., 2012). It is possible that such ‘abstract’ representations of decision-relevant information, independent of the possible motor actions, coexist with representations of decisions as intended actions (Freedman and Assad, 2011). Whether such simultaneous representations exist in the same association area has not been investigated before. Consequently, it is also not known if such abstract representations play a role in the decision-making process.
We used the random-dot motion (RDM) direction discrimination task (Newsome et al., 1989) to investigate these questions. In this task, the animals discern the net direction of a stochastic motion stimulus and report their decision by making a saccade to one of two choice targets that is along the direction of the perceived motion. This task is particularly well suited for our purposes. First, optimal performance on this task demands integration of motion evidence over time. This prolonged deliberation time allows characterization of whether a neural population is participating in the process of evidence accumulation or not. Second, there exists a theoretical framework—bounded accumulation of noisy evidence to a decision threshold (aka drift-diffusion, Smith and Ratcliff, 2004; Palmer et al., 2005)— that accounts quantitatively for the speed and accuracy of decisions in this task. Third, it has been shown that responses of neurons in several areas of the brain involved in planning saccadic eye movements represent the evolving decision in this task (Shadlen and Newsome, 1996; Horwitz and Newsome, 1999; Kim and Shadlen, 1999; Ding and Gold, 2010, 2012).
We focused on the parietal sensorimotor association area LIP. Many neurons in LIP respond both to the presence of a sensory stimulus in their response fields (RFs) and to a planned saccade into them (Barash et al., 1991b). We recorded the responses of the same set of neurons during the RDM discrimination task in two configurations: when the RFs contained the RDM stimulus and when they contained one of the choice targets. We show that the neurons represent the moment-by-moment accumulation of sensory evidence only in the latter configuration, that is, when they are involved in the planning of the motor action required to report the choice.
RESULTS
We recorded from 49 well isolated single neurons in area LIP from two monkeys (28 neurons from monkey N and 21 neurons from monkey B) as they decided the net direction of a noisy random-dot motion (RDM) stimulus. On each trial, two choice targets indicated the two directions to be discriminated (e.g., up vs. down). The monkeys reported their decision by making a saccade to the choice target along the perceived direction of motion. They were free to indicate their decision whenever ready, thus providing a measure of reaction time (RT). The monkeys performed the task with the RDM and the targets arranged in two configurations (Figure 1). In the ‘Target-in-RF’ configuration, one of the choice targets was placed in the response field (RF) of the neuron under study. In the ‘RDM-in-RF’ configuration, the RDM was placed in the RF. In this way, we obtained data from the same LIP neuron when it belonged either to the pool representing the RDM stimulus or to one of the two pools representing the choice targets.
We first establish that the animals integrate motion information over hundreds of milliseconds to make their choices in both task configurations. This prolonged deliberation time offers a window in which to interrogate how the neural responses relate to the process of decision formation. We show that the firing rates of neurons represent the state of the accumulated evidence only when the neurons belong to a pool representing the targets.
Behavior in the two task configurations
The behavior of both monkeys exhibited an orderly dependence on the strength of the RDM in both task configurations. They took longer to report their decision when the motion strength was weaker (Figure 2, A-D), and their decisions were less accurate (Figure 2, E-H). The systematic relationship between reaction time (RT) and accuracy is well described by the accumulation of noisy evidence to a threshold, which determines both the time it takes to make a decision and which alternative the monkey chooses (Gold and Shadlen, 2002; Smith and Ratcliff, 2004). We support this assertion by fitting the RTs to a bounded evidence accumulation model and then using the fitted parameters to predict the choices (Shadlen and Kiani, 2013; Kang et al., 2017). Specifically, the curves in the top row of Figure 2 are fits to a parsimonious symmetrically bounded drift diffusion model, which uses four parameters to account for the effect of motion strength on the mean RT for correct choices (Equation 1; see Methods). Two of the parameters (the bound height, ±B, and the sensitivity coefficient, κ, which multiplies motion strength to establish the drift rate) determine predictions for the proportion of choices as a function of motion strength. The dashed curves in the lower panels of Figure 2 depict these predictions. They are in excellent agreement with the data (e.g., the deviance of the predictions exceeds the deviance of a logistic fit to the data by 25.1±7.1; range 10 to 41.1). It is not the case that any monotonic ordering of RTs by motion strength would do as well. For example, random perturbations of the mean RTs that preserve their orderly dependence on motion strength yielded significantly worse choice predictions (Supp. Fig. S1). The predictions in all configurations were more similar to the accuracy function obtained by logistic regression (red lines in Supp. Fig. S1) than were the predictions after perturbation (p<10^-10 in all four configurations).
The fidelity of the predictions offers strong support for the assertion that the choices result from the same process of bounded evidence accumulation that explains the decision times. Importantly, this conclusion holds for both stimulus configurations.
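The logic of predicting choices from parameters fit to reaction times can be sketched as follows. This is a minimal illustration, not the fitting procedure used here: it assumes the standard closed-form expressions for a symmetric bounded diffusion with unit variance, uses three parameters (κ, B and a single non-decision time) rather than the four of Equation 1, fits by a coarse grid search, and runs on synthetic mean RTs generated from hypothetical parameter values. The coherence set is likewise an assumption.

```python
import math

def decision_time(coh, kappa, bound):
    """Mean decision time (s) of a symmetric bounded diffusion, unit variance."""
    drift = kappa * coh
    if abs(drift) < 1e-9:
        return bound ** 2  # limit of (B / drift) * tanh(B * drift) as drift -> 0
    return (bound / drift) * math.tanh(bound * drift)

def p_positive(coh, kappa, bound):
    """Probability of terminating at the +B bound (the correct choice for coh > 0)."""
    return 1.0 / (1.0 + math.exp(-2.0 * kappa * coh * bound))

def fit_mean_rt(cohs, rts):
    """Coarse grid search over kappa and B; non-decision time solved in closed form."""
    best = None
    for i in range(31):
        kappa = 5.0 + 0.5 * i
        for j in range(15):
            bound = 0.30 + 0.05 * j
            dts = [decision_time(c, kappa, bound) for c in cohs]
            t_nd = sum(r - d for r, d in zip(rts, dts)) / len(rts)
            sse = sum((r - d - t_nd) ** 2 for r, d in zip(rts, dts))
            if best is None or sse < best[0]:
                best = (sse, kappa, bound, t_nd)
    return best[1:]

# Hypothetical coherences (as fractions) and generating parameters (kappa, B, t_nd).
COHS = [0.0, 0.032, 0.064, 0.128, 0.256, 0.512]
TRUE = (12.0, 0.6, 0.3)
mean_rts = [decision_time(c, TRUE[0], TRUE[1]) + TRUE[2] for c in COHS]

# Fit the RTs only, then predict accuracy without ever looking at the choices.
kappa_hat, bound_hat, t_nd_hat = fit_mean_rt(COHS, mean_rts)
predicted_accuracy = [p_positive(c, kappa_hat, bound_hat) for c in COHS]
```

The point of the exercise is that `predicted_accuracy` is computed from `kappa_hat` and `bound_hat` alone, so agreement with observed choices is a prediction and not a fit.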
From this exercise we conclude that the decision times (i.e., RT minus the non-decision time) estimated from diffusion model fits can be used to identify an epoch in which noisy evidence was integrated to make the decision. To obtain more refined estimates of the integration times for the different task configurations, we fit a more elaborate bounded diffusion model (Supp. Fig. S2, see Methods for details and Table 1 for fit parameters). The small differences in reaction times between the two configurations for Monkey N were accounted for by the non-decision time parameter. For Monkey B, a combination of increased sensitivity and decreased bound height contributed to the faster RTs in the RDM-in-RF configuration. Importantly, the fits established that both monkeys integrated evidence over hundreds of ms in each configuration (mean±SD of decision times at 0% motion coherence: 0.41±0.2 s in both configurations for Monkey N, 0.24±0.1 s and 0.20±0.1 s for Monkey B in the Target-in-RF and RDM-in-RF configurations, respectively). Furthermore, the decision times did not differ significantly between the task configurations in either monkey at any of the coherences (p>0.3 at all stimulus strengths, t-test).
LIP neuronal responses in the two task configurations
Neurons in area LIP can exhibit sensory-, memory- and saccade-related responses (Gnadt and Andersen, 1988; Barash et al., 1991a). For example, in a task where a monkey has to remember a visually cued location and make a delayed saccade to it, LIP neurons can show (1) a short latency response to the visual cue if it appears in the RF, (2) a persistently elevated response during the delay period and (3) a burst of activity preceding a saccade to the remembered location. Not all neurons exhibit all three types of responses. Since our goal was to compare the decision related activity in the same neurons when they belonged to the pool representing the sensory information and when they belonged to the pool involved in planning the motor action, we recorded from neurons that responded to visual stimuli in their RF and also showed persistent activity in association with saccadic motor planning.
During the direction discrimination epoch, the pattern of activity of the recorded neurons varied according to which pool they belonged to. When the neurons belonged to a pool with one of the targets in the RF, the responses largely recapitulated observations from earlier reports (e.g. Roitman and Shadlen, 2002; Churchland et al., 2008). Figure 3 shows the average population response of all neurons in the Target-in-RF configuration, aligned to either the onset of RDM (Figure 3A) or to the saccade (Figure 3B). The response is elevated before the onset of the RDM reflecting the presence of a choice target in the RF of the neurons. Following motion onset, there is a stereotyped dip in activity before the responses begin to separate by motion strength. The evolution, beginning ~180 ms after stimulus onset, is best appreciated in the de-trended responses (Figure 3A, inset). These features and those next described were evident in both of the monkeys, shown individually in Supp. Fig. S3 and Supp. Fig. S4.
The same neurons also exhibited differential responses to the two directions of motion being discriminated when they belonged to the pool representing the RDM. To combine responses across the population in this task configuration, we identified the preferred direction of motion for each neuron as the one that elicited the greater response at the highest motion strength (51.2% coherence). Figure 3C-D shows the responses of the population averaged after sorting by each neuron’s preferred direction. After an initial rise in activity due to the appearance of the RDM in the RF, the responses exhibited a coherence dependent separation starting ~180 ms after stimulus onset. The coherence dependent rise is evident in the de-trended responses (Figure 3C, inset) albeit with a smaller dynamic range compared to responses in the Target-in-RF configuration. Note that in this configuration, directions are sorted based on a post-hoc criterion — preferred direction of each neuron at the highest motion strength. The coherence dependent ordering of responses could have been accentuated by this selection bias.
The responses of LIP neurons in the RDM-in-RF configuration bear similarity to direction selectivity reported in naïve monkeys (Fanini and Assad, 2009) and category selectivity reported in monkeys trained to categorize sets of motion directions (Freedman and Assad, 2006). For comparison with these studies, we quantified the time course of the evolution of direction selectivity at the highest motion strength (Figure 3E) using an ROC metric (see Methods). The responses to the two motion directions were significantly different starting 190 ms after the onset of dot stimulus (p<0.05 on Wilcoxon rank sum test). This is much later than the ~50 ms latency of direction selectivity observed in naïve monkeys (Fanini and Assad, 2009) and ~100 ms latency for direction category selectivity (Swaminathan and Freedman, 2012). As discussed below, this may be an indication that the directional responses we observed in the RDM-in-RF configuration arise through a different mechanism than the direction- and category-selective responses previously reported in LIP.
The latency in the RDM-in-RF configuration lagged the direction selectivity seen in the same neurons in the Target-in-RF configuration (180 ms, p<10^-3, bootstrap analysis). However, the similarity of the latencies suggests that the RDM-in-RF population might also reflect the formation of the decision, as the Target-in-RF population has been shown to do (Roitman and Shadlen, 2002; Churchland et al., 2008). To test this, we asked if the rise or decline of neural activity depended on both the direction and strength of the RDM. We quantified this by estimating the slope of the responses (buildup rate) in a 200 ms epoch beginning at the time of response separation, identified in the preceding analysis. We characterized the relationship between motion strength and buildup rates separately for the preferred and non-preferred directions of motion (Figure 3F). The buildup rates of neurons in the Target-in-RF configuration showed a linear dependence on motion strength both when the motion direction was towards the RF (1.5±0.2 spikes per s^2 per 1% coherence, p<10^-9) and when the motion was away from the RF (-1.2±0.2, p<10^-5). A similar trend was observed in the RDM-in-RF configuration. However, this relationship was significant only for the non-preferred direction of motion (0.7±0.2 spikes per s^2 per 1% coherence, p<0.002). For the preferred direction, the buildup rates increased with coherence but not significantly so (0.6±0.4 spikes per s^2 per 1% coherence, p=0.13). In both configurations, these trends were preserved even when the highest motion strength trials were excluded. Thus, neuronal pools in LIP representing the saccade targets and the RDM both differentiate the discriminanda during an epoch coinciding with decision formation. The buildup of neural activity depended on the strength of the stimulus in both populations, but this dependence was weaker when the RDM was in the RF.
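The two-stage buildup-rate analysis can be illustrated with a short sketch: a least-squares slope of firing rate against time within the 200 ms epoch, followed by a second regression of those slopes on motion strength. The firing rates below are synthetic and noise-free, the time bins and coherence values are assumptions, and the helper names are hypothetical; the sketch shows only the structure of the calculation, not the estimation details given in Methods.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

# Assumed epoch: 200 ms starting at response separation (0.18 s), in 20 ms bins.
EPOCH_START = 0.18
times = [EPOCH_START + 0.02 * k for k in range(11)]

def synthetic_rate(t, coh_pct, slope_per_pct=1.5):
    """Hypothetical mean firing rate (spikes/s) ramping linearly with coherence."""
    return 30.0 + slope_per_pct * coh_pct * (t - EPOCH_START)

# Stage 1: buildup rate (spikes per s^2) for each motion strength (% coherence).
coherences = [0.0, 3.2, 6.4, 12.8, 25.6, 51.2]
buildup_rates = [
    ols_slope(times, [synthetic_rate(t, c) for t in times]) for c in coherences
]

# Stage 2: dependence of buildup rate on coherence (spikes per s^2 per 1% coherence).
dependence = ols_slope(coherences, buildup_rates)
```

With real data the first stage would be applied to trial-averaged rates for each direction separately, and the second-stage slope would be accompanied by a standard error and significance test.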
We also compared the responses at the end of the decision process for the two task configurations (Figure 3B & D). When the monkey chose the target in the neuron’s RF, the responses appear to coalesce to a common firing rate just before the saccade, irrespective of motion strength (Figure 3B, solid curves), as shown previously (Roitman and Shadlen, 2002; Churchland et al., 2008). This pattern is thought to reflect a threshold level detected by another circuit to terminate the decision (Hanes and Schall, 1996; Mazurek et al., 2003; Hanks et al., 2014). When the same neurons contained the RDM in their RF, the responses to the different coherences remained separated until the saccade, irrespective of whether the animal chose the target consistent with the preferred direction or not (Figure 3D). This was also the case when the RF contained the unchosen target (Figure 3B, dashed curves). Thus, only the responses of the pool representing the target chosen by the animal contain a possible neural signature of decision termination. In the ensuing sections, we support this qualitative observation with other lines of evidence that show that this pool alone signals decision termination and the time taken to reach it.
Correlation between neural responses and behavior
We examined whether the neural responses in the two stimulus configurations were predictive of the monkey’s decisions. Specifically, we asked if the trial to trial variation in the responses correlates with the trial to trial variation in the monkey’s choice behavior. To test this for each neuron, we counted the spikes in a 200 ms long epoch ending 100 ms before saccade initiation on each trial and incorporated this count in a logistic regression model of choice (GLM; see Methods). To facilitate comparison across the two stimulus configurations, we standardized the responses across trials of each configuration. We included the strength and direction of the presented stimulus as confounders, thus asking whether the variation in neural response tells us more about the upcoming choice than can be ascertained from the stimulus itself. This was indeed the case for 61.2% of cells in the Target-in-RF configuration and for 35.4% of cells in the RDM-in-RF configuration (30 of 49 and 17 of 48 cells respectively; Eq. 4, H0: β2 = 0; p<0.05; Figure 4A). The leverage of the neural activity on choice was significantly stronger in the Target-in-RF configuration (p=0.005, signed rank test).
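A sketch of this kind of analysis is given below. The exact form of Eq. 4 is not reproduced here, so the model is an assumption: logit P(choice) = β0 + β1·(signed coherence) + β2·(standardized spike count), with β2 capturing the leverage of the neural response beyond the stimulus. The data are simulated from hypothetical coefficients, and the fit uses plain gradient ascent on the log-likelihood instead of a statistics package.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_logistic(rows, choices, lr=0.5, n_iter=1500):
    """Maximum-likelihood logistic regression by batch gradient ascent."""
    n_params = len(rows[0])
    beta = [0.0] * n_params
    for _ in range(n_iter):
        grad = [0.0] * n_params
        for row, choice in zip(rows, choices):
            err = choice - sigmoid(sum(b * x for b, x in zip(beta, row)))
            for k, x in enumerate(row):
                grad[k] += err * x
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta

# Simulated session; all values below are hypothetical.
rng = random.Random(1)
SIGNED_COHS = [-0.512, -0.256, -0.128, 0.0, 0.128, 0.256, 0.512]
TRUE_BETA = [0.0, 6.0, 1.0]  # intercept, stimulus, neural response

rows, choices = [], []
for _ in range(600):
    coh = rng.choice(SIGNED_COHS)   # strength and direction of the stimulus
    z = rng.gauss(0.0, 1.0)         # standardized spike count in the epoch
    p = sigmoid(TRUE_BETA[0] + TRUE_BETA[1] * coh + TRUE_BETA[2] * z)
    rows.append([1.0, coh, z])
    choices.append(1 if rng.random() < p else 0)

beta0_hat, beta1_hat, beta2_hat = fit_logistic(rows, choices)
```

A test of H0: β2 = 0 would additionally require a standard error for `beta2_hat` (for example from the inverse Fisher information), which is omitted here for brevity.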
In a complementary analysis, we assessed whether the neural responses on ambiguous trials (0% motion coherence) differed according to the eventual choice of the animal. We computed choice probability (Britten et al., 1996), a nonparametric statistic that quantifies the overlap between the distributions of responses of the neuron accompanying the two choices (see Methods). A choice probability of 0.5 indicates that the two distributions are completely overlapping and therefore uninformative about the ensuing choice. At the single neuron level, the choice probability of 32.4% and 25.8% of the neurons was significantly different from 0.5 in the Target-in-RF and RDM-in-RF configurations, respectively (12 of 37 and 8 of 31 cells with at least 10 trials at 0% coherence, respectively; p<0.05 on permutation test). In both stimulus configurations, the mean choice probability of the neuronal population was significantly greater than 0.5 (Figure 4B, population mean ± SEM of 0.66±0.03 and 0.59±0.04 for Target-in-RF and RDM-in-RF, respectively; p<10^-5 and p<0.02 on t-test). For comparison between the two configurations, we calculated ‘grand’ choice probability from standardized responses of all neurons on the 0% coherence trials (see Methods, Britten et al., 1996). This choice probability was significantly stronger in the Target-in-RF configuration (0.65 vs. 0.56, p<10^-3, permutation test). From the analyses of choice probability and firing rate leverage on choice (Figure 4A-B) we conclude that LIP neurons are informative about the choice in both configurations, but that their responses covary more strongly with choice when a choice target, and not the RDM, is in the RF.
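Choice probability is the area under the ROC curve for the two response distributions, which equals the probability that a randomly drawn response from one choice exceeds a randomly drawn response from the other (ties counted as one half). The sketch below computes it directly from that definition and adds a simple label-shuffling permutation test. The spike counts are made up for illustration, and the function names are hypothetical.

```python
import random

def choice_probability(resp_a, resp_b):
    """P(response on choice-A trials > response on choice-B trials), ties = 0.5."""
    wins = 0.0
    for a in resp_a:
        for b in resp_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(resp_a) * len(resp_b))

def permutation_p_value(resp_a, resp_b, n_perm=2000, seed=0):
    """Two-sided p-value for CP != 0.5 obtained by shuffling choice labels."""
    rng = random.Random(seed)
    observed = abs(choice_probability(resp_a, resp_b) - 0.5)
    pooled = list(resp_a) + list(resp_b)
    n_a = len(resp_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled = abs(choice_probability(pooled[:n_a], pooled[n_a:]) - 0.5)
        if shuffled >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Made-up spike counts on 0% coherence trials, split by the monkey's choice.
counts_choice_a = [12, 15, 9, 14]
counts_choice_b = [8, 11, 9, 7]
cp = choice_probability(counts_choice_a, counts_choice_b)   # 14.5 / 16
p_value = permutation_p_value(counts_choice_a, counts_choice_b)
```

Identical distributions give a value of exactly 0.5 under this definition, which is the sense in which 0.5 is uninformative about the ensuing choice.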
Finally, since the neurons exhibit time dependent changes in their activity in both stimulus configurations, we asked whether the variation of the buildup rates was predictive of the variation in the RTs on a trial-by-trial basis. We used the trials in which the monkey chose the target in the RF or the target consistent with the direction of motion preferred by the neuron (RDM-in-RF). For a majority of neurons recorded in the Target-in-RF configuration (36 of 49), the reaction times were inversely correlated with the slope of the neural responses (population mean: -0.08, p<0.01). In the RDM-in-RF configuration, the correlation was not significantly different from 0 (mean: 0.03, p>0.33) (Figure 4C) and significantly weaker than the correlations seen in the Target-in-RF configuration (p<0.01, Kolmogorov-Smirnov test). This comparison suggests that only the pool of neurons that contains the chosen target in its RF carries information about the time the animal will take to report its decision.
Signatures of noisy evidence accumulation in the response variance
We also wished to ascertain whether the responses on single trials conform to the expectations of noisy evidence accumulation. If they do so, the variance of the firing rates across trials should increase linearly as a function of time (i.e., the number of samples accumulated). Also, the autocorrelation between firing rates at different times within a trial should conform to the pattern associated with the cumulative sum of random numbers. This correlation should decay as a function of separation in time from the first sample and increase for equidistant samples as a function of time from the onset of accumulation (see Methods). We used the method developed by de Lafuente et al. (2015) (based on Churchland et al. (2011)) to estimate these quantities.
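These expectations follow from elementary properties of a running sum of independent, identically distributed samples: after n samples the variance is n times the variance of a single sample, and the correlation between the sums after i and j samples (i ≤ j) is sqrt(i/j). The sketch below checks both by simulation. It illustrates only the idealized accumulator, not the estimation method of de Lafuente et al. (2015), which must additionally separate this variance from spiking noise; the trial and step counts are arbitrary.

```python
import math
import random

def theoretical_corr(i, j):
    """Correlation between running sums after i and j i.i.d. samples."""
    lo, hi = min(i, j), max(i, j)
    return math.sqrt(lo / hi)

def simulate_running_sums(n_trials, n_steps, seed=0):
    """Each trial is the cumulative sum of unit-variance Gaussian samples."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        total, path = 0.0, []
        for _ in range(n_steps):
            total += rng.gauss(0.0, 1.0)
            path.append(total)
        trials.append(path)
    return trials

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def correlation(xs, ys):
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

paths = simulate_running_sums(n_trials=4000, n_steps=6)
columns = [[path[k] for path in paths] for k in range(6)]

variances = [variance(col) for col in columns]      # grows roughly as 1, 2, ..., 6
corr_1_6 = correlation(columns[0], columns[5])      # compare with sqrt(1/6)
corr_5_6 = correlation(columns[4], columns[5])      # compare with sqrt(5/6)
```

The formula captures both qualitative features described above: `theoretical_corr(1, 6)` is smaller than `theoretical_corr(1, 2)` (decay with separation from the first sample), and `theoretical_corr(5, 6)` is larger than `theoretical_corr(1, 2)` (increase for equally spaced samples later in the accumulation).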
The variance and autocorrelation patterns varied markedly based on whether the neurons contained the target or the RDM in their RF. In the Target-in-RF configuration, the variance increased linearly with time during the same epoch that the mean firing rates seemed to reflect the integration of evidence (Figure 5A, shaded region). In the RDM-in-RF configuration, the rise in variance was significantly weaker (p<10^-10, bootstrap analysis). Also, the observed autocorrelation matrix for the responses in the Target-in-RF configuration (Figure 6B-D) resembled the theoretical prediction (R^2 = 0.88). In contrast, the pattern of autocorrelations (Figure 6E-G) for the responses in the RDM-in-RF configuration diverged markedly from the predicted pattern (R^2 = 0.2). A bootstrap analysis confirmed that the difference in R^2 values between the two configurations was statistically reliable (p<10^-10; see Methods).
The variance of the neural response also affords a more refined examination of the mechanism of decision termination. The firing rate averages in Figure 3B suggest the possibility that decisions terminate when the firing rate of the neurons with the chosen target in their RF reaches a threshold. A more stringent test of a threshold is that even for the same motion strength, the variance of the neural response should approach a minimum just before the saccade. Indeed, we observed a precipitous decline in the variance in the ~100 ms preceding the saccade (Figure 5B, solid blue line). The variance in the time bin preceding the saccade was significantly lower than the variance in its prior time bin (p<0.01, bootstrap analysis). This feature is less prominent for the pool of neurons with the unchosen target (Figure 5B, dashed blue line, p = 0.07) and for the neurons with the RDM (Figure 5B, green lines, p>0.1 for both direction choices) in their RF.
Together, the analyses of time dependent variance and autocorrelation reveal that neurons in the Target-in-RF configuration exhibit firing rate patterns consistent with a process that represents the running sum of noisy samples of evidence to a criterion level. The analyses complement the observations made earlier on the mean firing rates by demonstrating conformance with the second order statistics of diffusion to a bound. These features were largely absent when the same neurons were studied in the RDM-in-RF configuration. This neural population does not appear to represent the accumulation of the noisy evidence that supports the monkey’s decisions. They reflect the direction of motion during the time course of decision formation but not the state of the accumulated evidence that can be used to terminate the decision process. We next consider a possible account of their pattern of activity.
A model of interaction between populations
How could the responses of neurons with the RDM in their RF correlate with the decision outcome without representing the process of evidence accumulation? One possibility is that the weaker decision-related signals observed in the population with the RDM in their RF are inherited from the populations that have the choice targets in their RF and are involved in the accumulation process. It has been shown that responses of LIP neurons to visual stimuli are suppressed by concurrently presented visual stimuli when they are well outside the RF (Balan et al., 2008; Churchland et al., 2008), even by as much as 50° visual angle (Falkner et al., 2010; Louie et al., 2011). An asymmetrical influence of the two Target-in-RF populations could lead to the appearance of direction selectivity and a correlation with the animal’s choices in the RDM-in-RF population. Moreover, the noise added through this additional step could explain the divergence of the variance and autocorrelation of the RDM-in-RF population from the theoretical predictions of a diffusion process. Additionally, such an extra step could account for the timing of direction selectivity in the RDM-in-RF population, which lags slightly behind that of the Target-in-RF population.
To evaluate the plausibility of this idea, we simulated the responses of three neural populations—one representing the motion stimulus and two representing the choice targets—during the motion viewing epoch (Figure 7A). In the model, the RDM-in-RF population receives direct excitation from the visual representation of the dynamic random dots, but in a manner that is not selective for direction. This response is thus similar to the classic sensory response to the appearance of a visual stimulus in the RF. The two Target-in-RF populations start off at a steady firing rate, simulating the steady state sensory response to the target already present in the RF. The responses then follow drift-diffusion dynamics starting at 180 ms, simulating evidence accumulation. The drift rate was set to be directly or inversely proportional to motion coherence for the populations representing the correct and incorrect targets, respectively (Figure 7B-C).
The three populations interact through divisive suppression (Sceniak et al., 2001; Carandini and Heeger, 2011; Louie et al., 2011) at each time point, parameterized by the ω terms in Equation 8 (Methods). We set these parameters to approximate the observed neural responses to the 25.6% motion strength RDM (illustrated in Figure 7F-G). We assumed that the early dip in the response of the Target-in-RF neurons (arrow, Figure 7F) was caused by suppression from the neurons activated by the appearance of the RDM (ω_DT1 = ω_DT2). The suppression between the two Target-in-RF pools (ω_T1T2 = ω_T2T1) was estimated from the onset and steady state responses after the appearance of the target in the RF. The weights for suppression of the RDM-in-RF pool by the Target-in-RF pools (ω_T1D and ω_T2D) were adjusted around ω_DT to achieve the separation in firing rate traces shown in Figure 7F-G (see Methods). Such asymmetric influence of the two Target-in-RF populations might arise from differences in their spatial relationship (neuronal connectivity) with the RDM-in-RF population. These adjustments were sufficient to mimic the observed mean responses of the neural population in our simulations (Figure 7D-E). Note that the direction selectivity of the RDM-in-RF population is derived solely from the suppressive inputs from the Target-in-RF populations.
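The way asymmetric divisive suppression can confer direction selectivity on a population whose own drive is not direction selective can be seen in a toy calculation. Equation 8 is not reproduced here, so the sketch assumes a generic normalization form, r_D = d_D / (1 + ω_T1D·d_T1 + ω_T2D·d_T2), with deterministic ramps standing in for the mean of the drift-diffusion responses. All numerical values, including the weights, are hypothetical and are not the fitted parameters.

```python
def target_drives(signed_coh, t, baseline=40.0, gain=60.0):
    """Mean drive (spikes/s) of the two Target-in-RF pools, t seconds after
    accumulation begins. Positive coherence denotes motion toward target 1."""
    ramp = gain * signed_coh * t
    return baseline + ramp, baseline - ramp

def rdm_response(signed_coh, t, w_t1d, w_t2d, visual_drive=50.0):
    """RDM-in-RF response: a direction-independent visual drive divided by
    suppression from the two target pools (assumed normalization form)."""
    d_t1, d_t2 = target_drives(signed_coh, t)
    return visual_drive / (1.0 + w_t1d * d_t1 + w_t2d * d_t2)

t = 0.2  # seconds after the onset of accumulation

# Asymmetric suppression: target pool 2 suppresses the RDM pool more strongly.
asym_toward_t1 = rdm_response(+0.256, t, w_t1d=0.004, w_t2d=0.010)
asym_toward_t2 = rdm_response(-0.256, t, w_t1d=0.004, w_t2d=0.010)

# Symmetric suppression: the two ramps cancel in the denominator.
sym_toward_t1 = rdm_response(+0.256, t, w_t1d=0.007, w_t2d=0.007)
sym_toward_t2 = rdm_response(-0.256, t, w_t1d=0.007, w_t2d=0.007)
```

Under these assumptions the RDM population responds more for motion toward target 1 only when the weights are asymmetric; with equal weights the opposing ramps cancel and the response is identical for the two directions, which is why the asymmetry is needed to produce apparent direction selectivity.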
This simple model reproduced the main features of our results. After the implementation of suppression, the Target-in-RF population retained the time course of the variance and the pattern of autocorrelation expected of a diffusion process. Notably, the RDM-in-RF population displayed features that resemble those seen in the neural data, namely (i) the attenuated increase in variance as a function of time and (ii) the divergence in the pattern of autocorrelation from theoretical prediction (Figure 8). Thus, the model provides a plausible account of how mean neural responses in the RDM-in-RF population can reflect the animal’s choices while the variance and autocorrelation of these responses fail to show signatures of evidence accumulation.
DISCUSSION
We analyzed the decision related activity of LIP neurons under two configurations that allowed us to compare and contrast the sensory and motor-planning responses. To do this we studied the same LIP neurons when either the RDM stimulus or one of the choice targets was in the neural response field. Based on this comparison, we conclude that the process of evidence accumulation leading to choice is revealed primarily in motor preparatory responses. The sensory responses exhibit direction preference and a weak relationship with the animal’s behavior, but our results and simulations suggest that this relationship is likely inherited from the motor preparatory responses. We first discuss our results in the context of previous studies of area LIP and then consider how they bear on the broader question of routing of information in the cortex.
Properties of neural responses in area LIP
There has been a long debate about the relative importance of sensory salience-related signals and saccade preparatory signals in area LIP (Bushnell et al., 1981; Barash et al., 1991a; Colby and Goldberg, 1999; Andersen and Buneo, 2002). A large fraction of neurons show inherent selectivity for visual features such as direction and shape even in monkeys that have never been trained to use such information to perform a laboratory task (Sereno and Maunsell, 1998; Fanini and Assad, 2009). In addition, training induces stimulus selectivity that can be distinct from intrinsic selectivity (Toth and Assad, 2002; Sarma et al., 2015). LIP neurons also carry a rich representation of saccade plans. They display spatially selective persistent activity when the animal awaits making a saccade to a previously instructed, but no longer visible target (Gnadt and Andersen, 1988; Barash et al., 1991a). This persistent activity is dissociable from the sensory response evoked by the target (Mazzoni et al., 1996) and can encode other factors that bear on the saccade plan such as the probability that a saccade will be instructed (Janssen and Shadlen, 2005) and the expected reward (Platt and Glimcher, 1999; Sugrue et al., 2004). The richness of the representation of the saccade plans is particularly evident in perceptual decision-making tasks, where the neuronal activity continually tracks the current state of the evidence for choosing the target in the neuron’s RF (Mazurek et al., 2003; Bollimunta et al., 2012). This activity reflects not just the state of the accumulated evidence, but also other factors that can influence decisions such as cost of time (Churchland et al., 2008; Hanks et al., 2014), prior knowledge (Hanks et al., 2011; Rao et al., 2012), and the values of the choices on offer (Rorie et al., 2010).
By recording from the same LIP neurons when they belonged to the population representing either the RDM or a choice target, we could directly compare the sensory- and saccade-related responses. While both populations exhibit decision-related activity, there are many important differences. Both populations modulated their activity during decision formation in accordance with the strength and direction of the RDM. However, this modulation was far more intense when a choice target was in the RF. The RDM elicited a strong response when it was in the RF, but the dependence on direction and stimulus strength was subtle. This is unlikely to be explained by saturation of the response (e.g., ceiling effects), because the same neurons attained higher firing rates before saccade onset when the target was in the RF (cf. Figure 3B and Figure 3C). There was also a clear difference in the variance and autocorrelation patterns for the two populations. Only when the neurons contained a choice target in their RF were these patterns consistent with the predictions of noisy evidence accumulation. Finally, a neural correlate of decision termination was only apparent when a target was in the RF.
Although we have used the term “sensory” to describe the direction selective responses of neurons with the RDM in their RF, the gradual build-up of the firing rates of these neurons (Figure 3B) differed from the steady rates reported in naïve monkeys (Fanini and Assad, 2009). We suspect that the responses are not sensory in the way one would characterize the responses of neurons in visual areas MT/MST or even the visual responses of LIP neurons to transient stimuli (e.g., targets). The direction selective responses observed in the RDM-in-RF configuration were remarkably slow, emerging 190 ms after stimulus onset (at the highest coherence). This is far later than the ~50 ms latency of direction selectivity reported by Fanini and Assad (2009) and the ~100 ms latency for direction-category selectivity reported by Swaminathan and Freedman (2012). It is slightly longer than the 180 ms latency of decision-related signals observed in the neuronal pool representing the targets.
Together, these considerations suggest that the neuronal pool representing the RDM inherits its direction and choice related signals from the neuronal pools representing the targets. We demonstrated that a model of lateral interactions serving the general purpose of gain control (Carandini and Heeger, 2011) is sufficient to produce these effects. Such lateral interactions are well established in upstream visual areas (Schein and Desimone, 1990; Shushruth et al., 2009; Hunter and Born, 2011) and are believed to be fundamental to many cortical circuits (Carandini and Heeger, 2011). In LIP, lateral interactions are thought to mediate the suppressive effect of visual stimuli presented outside a neuron’s RF (Balan et al., 2008; Churchland et al., 2008; Zhang et al., 2017). Such suppression can arise from stimuli presented at large distances (>50°) away from the RF, even in the opposite hemifield (Falkner et al., 2010; Louie et al., 2011). Our modeling exercise demonstrates that this property of LIP circuits can account for the apparent direction and choice correlation of the responses of neurons with the RDM in their RF. To test this model, we would need to record simultaneously from neurons that represent the RDM and at least one choice target.
A limitation of the present study is that we do not have access to two classes of neurons on the same trials. We are therefore unable to test the model proposed to account for the weaker decision related activity of neurons that contain the RDM in their RF. For example, we would predict that the weaker leverage of the RDM-in-RF neurons would be explained away (i.e., mediated) by inclusion of Target-in-RF responses in the same GLM. Our strategy to record from the same neurons under two stimulus configurations has the obvious dividend of matching the two groups. However, it might have led to undersampling neurons with RFs nearer the fovea. We cannot exclude the possibility that the sensory responses of such neurons would be more strongly coupled to decision formation.
Routing of information in cortex
We have shown that in LIP, the neurons that contain the choice targets in their RF represent the accumulation of evidence bearing on the possibility that this target will be chosen. The neurons that contain the RDM in their RF do not show such signatures of evidence accumulation. We do not know how the momentary evidence represented by populations of direction selective neurons in the visual cortex makes its way to the target-representing neurons in LIP. There are projections from areas MT and MST to area LIP, but it is difficult to reconcile this direct pathway with the long latency of the decision related activity in LIP. The delay of the decision related responses relative to the latency of the visual responses in LIP (of ~50 ms), suggests a role for some form of memory buffer and/or a multisynaptic chain through which decision relevant information must pass before reaching the saccade planning neurons in LIP. This is just one of many reasons to suspect that these apparently simple perceptual decisions may share similarities with more complex decisions that derive evidence from memory and other evaluations (Shadlen and Shohamy, 2016).
We must emphasize that area LIP is not the only region that receives decision-pertinent signals in this task. Other areas that are involved in the planning of eye movements like FEF/Area 46, the caudate nucleus and the superior colliculus also have access to such input (Horwitz and Newsome, 1999; Kim and Shadlen, 1999; Ding and Gold, 2010, 2012; Mante et al., 2013). However, the decision related activity in these areas arises with comparable latencies, so they do not furnish an explanation for the long latency in LIP, at least not readily. We favor the idea that the latency is necessitated by limitations in connectivity between the many possible sources of evidence bearing on the salience of an item (e.g., a choice target) and the neurons that represent such items as potential affordances to the motor system (e.g., neurons in LIP). This connectivity constraint might necessitate active routing (Olshausen et al., 1993; Kastner and Pinsk, 2004; Summerfield and Tsetsos, 2012), although this process is poorly understood.
Finally, our results expose the limitations of using choice correlations to implicate a neuronal population as part of the circuit that drives the corresponding behavior. The neuronal pool in LIP representing the RDM has a mean CP of 0.59, which is larger than the reported CP of 0.54 for neurons in area MT for the same task (Cohen and Newsome, 2009). But neuronal populations that do not influence behavior can still have significant choice correlations if they themselves are correlated with other populations influencing behavior (Shadlen et al., 1996; Pitkow et al., 2015). In the RDM task, the sequential sampling framework (e.g., drift-diffusion) provides a detailed mechanistic account of evidence accumulation both at the level of behavior and at the level of its instantiation in the neural responses. This enabled us to rigorously test whether a given neural population represents the computations relevant to decision-making. Only the neuronal population involved in planning of the motor action needed to report the choice reflected such computations.
If the neurons with the RDM in the RF do not represent the evolving evidence, a natural question is what do these neurons signify? One obvious possibility is that they simply represent an object that might attract the gaze as transient lights are wont to do. Another possibility is that they represent the focus of spatial attention (Colby and Goldberg, 1999). However, this focus should be initially on the RDM and then either remain stationary through the decision or gradually give way to the chosen target. This is inconsistent with the dynamics observed in our data, which look like a muted version of the decision related signals exhibited by neurons with a choice target in the RF. The same objection applies to the proposal that these neurons represent the salience of the RDM (Bisley and Goldberg, 2010). A more speculative idea is that the neurons that contain the RDM in their RF represent the object about which the decision is made. Reprising the question posed at the beginning of this paper, the decision about motion direction is a decision about the motion of the RDM. While it is formed as if in answer to the question, “How will I report the answer?,” it is still about a visual stimulus. Perhaps LIP neurons that represent the RDM confer this critical information bearing on the spatial origins of the evidence—that is, the location of the thing we are deciding about.
MATERIALS AND METHODS
Neural recordings
We recorded activity of 49 well isolated single units from area LIPv (Lewis and Van Essen, 2000) of two adult female rhesus monkeys (Macaca mulatta) trained on the random-dot motion direction discrimination task. MRI was used to localize LIPv and to target recording electrodes. Within this putative LIPv, we recorded from neurons that had spatially selective persistent activity as assessed using a memory-guided saccade task (Gnadt and Andersen, 1988). In this task, a target is flashed in the periphery while the monkey fixates on a central spot. The monkey has to remember the location of the target and execute a saccade to that location when instructed. The response field (RF) of each neuron was identified as the region of visual space that elicited the highest activity during the interval between the target flash and the eventual saccade.
All training, surgery, and experimental procedures were conducted in accordance with the National Institutes of Health Guide for Care and Use of Laboratory Animals and were approved by the University of Washington Institutional Animal Care and Use Committee.
Task
The choice-reaction time direction discrimination task is similar to that used in previous studies (Roitman and Shadlen, 2002). The animal initiates a trial by fixating on a point (fixation point; FP) presented on an otherwise black screen. Two choice-targets then appear on the screen. After a variable delay (drawn from an exponential distribution of mean 750 ms), the random-dot motion (RDM) stimulus is displayed in an imaginary aperture (i.e., invisible borders) of 5°–9° diameter at a third location. The first three frames of the stimulus consist of white dots randomly plotted at a density of 16.7 dots · deg⁻² · s⁻¹. From the fourth frame, each dot from three frames before is replotted, either displaced in one direction along the axis connecting the two targets or at a random location. The probability with which a dot is displaced in the direction of one of the targets determines the stimulus strength (coherence); on each trial, this was randomly chosen from the set C = [0, 0.032, 0.064, 0.128, 0.256, 0.512]. The motion strengths and the two directions were randomly interleaved. Importantly, the monkey was allowed to view the stimulus for as long as it wanted and indicated the perceived direction of motion with a saccade to the target that lay in that direction to obtain a liquid reward. Rewards were given randomly (p = 0.5) for the 0% coherence motion condition.
During recording from each isolated neuron, the choice-targets and the RDM were presented in two configurations (Figure 1). In the ‘Target-in-RF’ configuration, one of the choice-targets overlay the neuronal RF. In the ‘RDM-in-RF’ configuration, the RDM stimulus was presented in the RF. The two configurations were alternated in blocks (typically 50–150 trials per block). For 33 of the neurons, the targets and the dot stimuli were placed 120° apart on an imaginary circle (as shown in Figure 1). For the remaining 16 neurons (in one monkey), the targets and the dot stimulus were aligned linearly in both configurations. Since the directions of motion varied across sessions, we adopted the following conventions. In the Target-in-RF configuration, the direction of motion towards the target in the RF for each neuron was considered the ‘positive’ direction. In the RDM-in-RF configuration, the positive direction was assigned post hoc from the data as the direction of motion that elicited the higher mean response at the highest coherence.
Analyses of behavioral data
The accuracy and reaction times (RT) of the monkeys were fit by a bounded evidence accumulation model (Shadlen et al., 2006). In the parsimonious application of this model employed here, the instantaneous evidence about motion at each time step is assumed to arise from a normal distribution with variance Δt and mean κC, where C is the signed motion coherence and κ is a scaling parameter. This instantaneous evidence is accumulated over time and the decision process terminates when the accumulated evidence reaches one of the bounds ±B, leading to the choice of one of the targets. The mean RT is the expectation of the time taken for the accumulated evidence to reach the bound plus a constant, the non-decision time tnd, comprising sensory and motor delays. To account for asymmetric reaction times in some configurations, we used two different non-decision times (tnd1 and tnd2) for the two target choices. In this framework, the mean RT for the correct choices (i.e., choices consistent with drift rate, ignoring biases) is described by the following equation:
$$\bar{t} = \frac{B}{\kappa C}\tanh(B \kappa C) + t_{nd} \qquad (1)$$
Further, the choice distributions are described by
$$P_{+} = \frac{1}{1 + e^{-2 \kappa C B}} \qquad (2)$$
where P+ is the probability of choosing the target consistent with the ‘positive’ direction of motion.
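As a minimal sketch, the mean RT and choice functions of a symmetric bounded diffusion with unit variance can be written as two closed-form functions. The expressions used are the standard results for that process (mean decision time (B/κC)·tanh(BκC), choice probability 1/(1 + exp(−2κCB))); the function names and the parameter values are hypothetical and chosen only for illustration, not taken from the fits reported here.

```python
import math

def mean_rt(coh, kappa, bound, t_nd):
    """Mean RT (s) for correct choices: mean decision time plus non-decision time."""
    drift = kappa * abs(coh)
    if drift == 0.0:
        # Limit of (B / drift) * tanh(B * drift) as drift -> 0
        return bound ** 2 + t_nd
    return (bound / drift) * math.tanh(bound * drift) + t_nd

def p_positive(coh, kappa, bound):
    """Probability of choosing the 'positive' target for a signed coherence."""
    return 1.0 / (1.0 + math.exp(-2.0 * kappa * coh * bound))

# Hypothetical parameter values, for illustration only
KAPPA, BOUND, T_ND = 15.0, 0.8, 0.35
for c in [0.0, 0.032, 0.064, 0.128, 0.256, 0.512]:
    print(c, round(mean_rt(c, KAPPA, BOUND, T_ND), 3), round(p_positive(c, KAPPA, BOUND), 3))
```

Because κ and B appear in both functions, parameters estimated from RTs alone fully determine the predicted choice function, which is what makes the RT-only fit a test of the model.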
To demonstrate that a single model accounts for both the choices and the RTs, we fit only the observed RTs as per Eq. 1 (Gold and Shadlen, 2002; Kang et al., 2017), and predicted the choice frequencies by substituting the parameters κ and B in Eq. 2 (Figure 2). We evaluated the fidelity of these predictions by comparing the predictions to a logistic regression fit of the choice data using the deviance metric (difference between the log likelihood of the fits). To demonstrate that these predictions were not a trivial result of monotonic ordering of RTs by motion strength, we compared them to predictions from 10,000 pseudorandomly generated RT vs. coherence functions which preserved the order of RTs (Supp. Fig. S1). To generate these functions, we retained the observed RTs for the minimum and maximum coherences and used ordered random values within this range for the other coherences. We then fit these functions to Eq. 1, predicted the choices with Eq. 2, and computed the log-likelihood ratio of these predictions and the original predictions. We then determined if the predictions worsened significantly as a function of the mean difference between the original and the randomly generated RTs.
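The construction of the order-preserving surrogate RT functions can be sketched as follows. This is an illustration under stated assumptions: the text says only that "ordered random values" within the observed range were used, so the uniform sampling here, along with the function name, is an assumption.

```python
import random

def random_ordered_rts(observed_rts, rng):
    """Surrogate RT-vs-coherence function that keeps the endpoints and the ordering.

    observed_rts: mean RTs ordered from the lowest to the highest coherence.
    The first and last values are retained; the values in between are
    replaced by sorted random draws from the range they span (uniform
    sampling is an assumption, not stated in the text).
    """
    first, last = observed_rts[0], observed_rts[-1]
    lo, hi = min(first, last), max(first, last)
    n_inner = len(observed_rts) - 2
    inner = sorted((rng.uniform(lo, hi) for _ in range(n_inner)),
                   reverse=first > last)
    return [first] + inner + [last]

rng = random.Random(1)
surrogates = [random_ordered_rts([0.90, 0.85, 0.75, 0.62, 0.50, 0.42], rng)
              for _ in range(5)]
```

Each surrogate would then be fit and used to predict choices in the same way as the observed RTs.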
To obtain a more precise estimate of decision times, we fit an elaborated version of the bounded evidence accumulation model (Supp. Fig. S2) simultaneously to both choice proportions and reaction times (including both correct and error trials). In this model, the decision bounds (B) collapse with time (t) as a function of three parameters: B0, the initial bound height; B1, the rate of collapse; and Bdel, the delay to onset of collapse. The non-decision time is modeled as a normal distribution with mean tnd and standard deviation tnd_sd. This model was fit by maximizing the log likelihood of the observed responses (choice and RT) on each trial. The distributions of decision times for the various coherences were obtained from these model fits. A t-test was used to determine if these distributions at each stimulus coherence differed significantly between the two task configurations.
Analyses of neural data
Population responses were computed as the average of all trials from all neurons after smoothing each trial with a 75 ms wide boxcar filter (Figure 3). The smoothing was only for visualization and all analyses were conducted on the raw spike data (1 ms resolution). To visualize the coherence dependent buildup of activity (Insets of Figure 3A-B), we detrended neuronal responses by subtracting the average responses across all coherences (separately for each task configuration).
We used responses at the two highest coherences to determine the time at which motion-direction selectivity arises in a given neural population (Figure 3C). We averaged the responses in 40 ms bins on each trial at these coherences and derived receiver operating characteristics (ROC) from these response distributions at each time bin. The area under the ROC denotes the probability of the neuron responding more to the positive direction of motion. For each time bin, we applied a Wilcoxon rank sum test and estimated the response latency as the first of three successive bins that met statistical significance (p < 0.05). We used a bootstrap procedure to compare the latencies from the two task configurations. For each configuration, we resampled trials with replacement, matching the number of trials in the original data sets. We repeated this procedure 1000 times and obtained a pair of distributions of latencies from the resampled data (one per configuration). The medians of these distributions recapitulated the latencies estimated from the data (180 and 190 ms for the Target-in-RF and RDM-in-RF configurations, respectively). We report the p-value of a rank sum test (2-tailed) using the bootstrap derived distributions to evaluate the null hypothesis that the latencies are the same for the two configurations. We obtained the same result by sampling neurons (instead of trials), with replacement.
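The two building blocks of the latency estimate, the area under the ROC and the "first of three successive significant bins" rule, can be sketched as below. The function names are hypothetical, and the per-bin p-values are taken as given (they would come from a rank sum test applied to each 40 ms bin).

```python
def auc(pos, neg):
    """Area under the ROC: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def first_sustained_bin(p_values, alpha=0.05, run=3):
    """Index of the first of `run` successive bins with p < alpha, or None."""
    streak = 0
    for i, p in enumerate(p_values):
        streak = streak + 1 if p < alpha else 0
        if streak == run:
            return i - run + 1
    return None
```

With 40 ms bins, the latency would be the time associated with the returned bin index; requiring three successive bins guards against an isolated bin crossing the threshold by chance.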
We quantified the effect of motion strength on the rate of increase of neural response (‘buildup rate’) during the decision-making epoch as the slope of the response in the time window 180 to 380 ms after stimulus onset (Figure 3D). To exclude pre-saccadic activity, we discarded from each trial the spikes occurring up to 100 ms before saccade onset. Using the least squares method, we computed the slope for each neuron at each coherence from the mean detrended response in 10 ms time bins within this time window. We then tested whether these buildup rates scaled with coherence across the population in each stimulus configuration by fitting a linear model regressing these buildup rates against signed coherence. We confirmed that the trends shown in Figure 3D were preserved when the analysis was done using weighted regression.
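A minimal sketch of the buildup-rate computation is an ordinary least-squares slope of the binned response against time. The function name and the synthetic response below are hypothetical; only the 10 ms bins and the 180 to 380 ms window come from the text.

```python
def ls_slope(times, rates):
    """Ordinary least-squares slope of rates against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_r = sum(rates) / n
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, rates))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx

# Centers of the twenty 10 ms bins spanning 180-380 ms after stimulus onset (s)
bin_centers = [0.185 + 0.010 * k for k in range(20)]

# Synthetic detrended response rising at 30 spikes/s per second (illustration only)
example_rates = [2.0 + 30.0 * t for t in bin_centers]
buildup_rate = ls_slope(bin_centers, example_rates)
```

The same function could serve for the second stage, regressing the per-neuron buildup rates against signed coherence.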
Leverage of neural activity on behavior: (Figure 4)
We measured the leverage of neural activity on the animal’s choice in two ways. First, we fit the monkey’s choices with logistic regression
$$P_{+} = \left[1 + e^{-(\beta_0 + \beta_1 C + \beta_2 R)}\right]^{-1}$$
where P+ is the probability of choosing the ‘positive’ direction target, C is signed coherence, and R is the z-scored mean neural response in the time window 100 to 300 ms before saccade. If the variations in firing rate of the neurons have leverage over choice even when the effect of motion coherence is accounted for, then β2 ≠ 0. We compared β2 across configurations with a signed rank test on their absolute values. We also quantified the additional leverage of the neural responses on choice beyond that of the motion strength by measuring the difference in the deviance of the full model and the model without the R term (Λ). Comparisons of Λ provided similar results to the comparisons of the β2 term that are presented in the results.
Second, we quantified the trial-by-trial correlations between neuronal response and the animal’s choice in the 0% coherence trials by computing ‘choice probability’ (CP) (Britten et al., 1996). For each neuron, we computed the mean responses on the 0% coherence trials in a time window 100 to 300 ms preceding the saccade. The trials were separated into two groups based on the animal’s choice. We used the distributions of responses from the two groups to calculate the area under the ROC, termed the choice probability (CP). We evaluated the null hypothesis that |CP-0.5|=0 using a permutation test. We permuted the union of responses from both groups and assigned them randomly to the two choices (matching the number of trials in each group) and computed the CP. By repeating this procedure 2000 times, we established the distribution of |CP-0.5| under H0 and report the p value as the area to the right of the observed CP minus 0.5.
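The choice probability and its permutation test can be sketched as follows. This is an illustration of the procedure described above, not the authors' code; the function names and the seeded generator are assumptions.

```python
import random

def auc(a, b):
    """Area under the ROC: P(a > b) + 0.5 * P(a == b)."""
    wins = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return wins / (len(a) * len(b))

def choice_probability_test(resp_pos, resp_neg, n_perm=2000, seed=0):
    """Return (CP, p), with p from a permutation test of |CP - 0.5| = 0.

    resp_pos / resp_neg: responses on 0% coherence trials grouped by choice.
    """
    rng = random.Random(seed)
    cp = auc(resp_pos, resp_neg)
    observed = abs(cp - 0.5)
    pooled = list(resp_pos) + list(resp_neg)
    n_pos = len(resp_pos)
    exceed = 0
    for _ in range(n_perm):
        # Reassign responses to the two choices, preserving group sizes
        rng.shuffle(pooled)
        null = abs(auc(pooled[:n_pos], pooled[n_pos:]) - 0.5)
        exceed += null >= observed
    return cp, exceed / n_perm
```

The p-value is the fraction of the permuted |CP − 0.5| values that are at least as large as the observed one.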
To evaluate whether the CPs from the two configurations were different, we first converted responses to z-scores (by neuron and configuration) and then combined the z-scores across neurons. We then computed two CPs, as above, for the two configurations. To evaluate the null hypothesis that the two CPs are equal, we performed another permutation test, this time preserving the association with choice but permuting the association with configuration. We obtained the distribution of the difference in CP (|ΔCP|) under H0 from 2000 repetitions of the permutation procedure and report the p value as the area of this distribution that is greater than the observed |ΔCP| from the data.
We also quantified the correlation between the buildup rates and RT. We used trials in which the monkey chose the ‘positive’ direction target, including all such trials at 0% motion strength and only correct trials at positive motion strengths. For each trial, we computed the slope of the response between 180-420 ms after RDM onset (using 40 ms time bins) from the detrended responses. To remove the effect of coherence on RT, we standardized (i.e., z-scored) both the RTs and the buildup rates within each coherence and computed the correlation between them.
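Standardizing within coherence before correlating can be sketched as below. The function names and the toy data are hypothetical; the sketch assumes at least two trials with non-identical values at each coherence.

```python
import math
from collections import defaultdict

def zscore_within(values, groups):
    """Z-score each value relative to the mean and SD of its own group."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    stats = {}
    for g, vs in by_group.items():
        m = sum(vs) / len(vs)
        sd = math.sqrt(sum((v - m) ** 2 for v in vs) / (len(vs) - 1))
        stats[g] = (m, sd)
    return [(v - stats[g][0]) / stats[g][1] for v, g in zip(values, groups)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data: steeper buildup goes with shorter RT within each coherence
coh = [0.0, 0.0, 0.0, 0.512, 0.512, 0.512]
rts = [0.9, 0.8, 0.7, 0.5, 0.4, 0.3]
slopes = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
r = pearson_r(zscore_within(rts, coh), zscore_within(slopes, coh))
```

Standardizing within coherence removes the shared dependence of RT and buildup rate on motion strength, so any remaining correlation reflects trial-to-trial covariation.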
Variance and correlation analysis
To evaluate if the neuronal firing rates on individual trials during the decision-making epoch reflect a process of accumulation of noisy evidence, we analyzed the pattern of variance and autocorrelation of the responses (Churchland et al., 2011; de Lafuente et al., 2015). We were interested in the variance attributable to such an accumulation process. For the ith time bin, this variance ($s^2_{\langle N_i \rangle}$) is the portion of the total measured variance ($s^2_{N_i}$) remaining after accounting for the point process variance (PPV), that is, the variance expected even if the underlying rates were constant. Assuming the PPV is proportional to the mean count,
$$s^2_{\langle N_i \rangle} = s^2_{N_i} - \phi \bar{N}_i \qquad (5)$$
where $\bar{N}_i$ is the mean count in the ith time bin and φ is a constant that must be estimated. Since our goal was to compare how well the firing rates conform to a diffusion process, we allowed φ to be a free parameter and fit it to obtain the best conformity to the autocorrelation pattern for a running sum of independent, identically distributed random numbers. For an unbounded diffusion process, the correlation between the ith and jth time steps is
$$\rho_{i,j} = \sqrt{\frac{\min(i,j)}{\max(i,j)}}$$
We characterized the variance and autocorrelation from six 60 ms time bins between 180-540 ms after stimulus onset, ignoring any time bins that extended to within 100 ms of the saccade. To pool data across neurons, we used the residuals for each trial as follows. The mean response of a trial in each time bin was subtracted from the mean of the responses from all the trials for that neuron for the same signed coherence in that time bin. We computed the covariance matrix from the residuals for the six time bins. We used an initial guess for φ to calculate the variance (Eq. 5) and substituted this along the diagonal of the covariance matrix.
The correlation was derived from this covariance matrix by dividing each term ij by $\sqrt{s^2_{\langle N_i \rangle} s^2_{\langle N_j \rangle}}$. We used the Nelder-Mead simplex method (MATLAB function ‘fminsearch’) to find the φ that minimized the sum of squares of the difference between the z-transformed calculated correlation and the z-transformed theoretically predicted correlation. Note that the values of φ were not constrained to be the same in the Target-in-RF (φ = 0.42) and RDM-in-RF (φ = 0.39) configurations even though theoretically they should be.
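The fit of φ can be illustrated with a short sketch. It assumes that the variance attributable to the accumulation process is the total variance minus φ times the mean count, and that the predicted correlation between bins i and j of a running sum is sqrt(min(i, j)/max(i, j)). A coarse grid search stands in for the Nelder-Mead simplex purely to keep the example dependency-free; the function names are hypothetical.

```python
import math

def diffusion_corr(i, j):
    """Predicted correlation between bins i and j (1-indexed) of a running sum."""
    return math.sqrt(min(i, j) / max(i, j))

def fit_error(cov, mean_counts, phi):
    """Sum of squared differences between Fisher-z observed and predicted correlations."""
    n = len(cov)
    # Diagonal replaced by total variance minus phi * mean count
    var_ce = [cov[i][i] - phi * mean_counts[i] for i in range(n)]
    if min(var_ce) <= 0:
        return math.inf
    ss = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = cov[i][j] / math.sqrt(var_ce[i] * var_ce[j])
            if abs(r) >= 1:
                return math.inf  # not a valid correlation for this phi
            ss += (math.atanh(r) - math.atanh(diffusion_corr(i + 1, j + 1))) ** 2
    return ss

def fit_phi(cov, mean_counts, grid):
    """Grid-search stand-in for the simplex minimization over phi."""
    return min(grid, key=lambda phi: fit_error(cov, mean_counts, phi))
```

For six time bins the double loop visits 15 bin pairs, the same pairs that enter the sum-of-squares statistic used to compare configurations.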
We computed the variance for the six time bins using the φ values from the fit and estimated the standard errors from a bootstrap (Figure 5). We then evaluated the effect of time on the variance using least squares regression. We performed these analyses over a range of plausible values of φ and confirmed that only the absolute values of the variances differed, whereas the shape of the variance function over time was unaffected. We similarly computed the variance and its standard error for time bins aligned to the onset of the saccade. We used a t-test to compare the variance in the two time bins immediately preceding the saccade.
To quantify how well the measured correlation values conform to theoretical predictions, we formed a sum of squares (SS) statistic from the 15 pairs of observed and theoretical correlations (after Fisher-z transformation, Figure 6D-E). We used a bootstrap procedure to estimate the distribution of this statistic by sampling with replacement from the data and following the steps above (100 iterations). We used a Kolmogorov-Smirnov test to determine the significance of the difference between the distributions of the SS statistic for the RDM-in-RF and the Target-in-RF configurations.
Model
We simulated the spike rates of three neural populations during the RDM epoch: one population with the RDM in their RF and two with targets in their RF (Figure 7A). The RDM-in-RF population was modeled as having an exponential rise in firing rate starting at 50 ms after RDM onset and peaking at 130 ms (Figure 7C). It then maintained the peak response until 500 ms. The two Target-in-RF populations were modeled as maintaining a steady response (R0) up to 180 ms after RDM onset and then following drift diffusion dynamics until 500 ms (Figure 7B). The response R at time t in the dynamic epoch combined a deterministic drift component, κ, and a diffusion component, N(t), which is a running sum of random numbers sampled at every time step (Δt) from a normal distribution with zero mean and variance V²Δt (V = 40). The drift component was positive for one target population and negative for the other. We then implemented divisive suppression between the three populations, where R′ and R denote the unsuppressed and suppressed responses, respectively, of the population indicated by the subscript, and ωij is the weight of the influence of the ith population on the jth.
The parameters were chosen so that the simulated mean responses of the suppressed population would approximate the observed mean population firing rates at the 25.6% coherence condition (Figure 7F-G). We first estimated the suppression of the two target populations on each other (ωT1T2 and ωT2T1) from the peak and steady state responses of the neurons to the appearance of a target in their RF. We then estimated the weight of the suppressive influence of the RDM-in-RF population on the Target-in-RF populations (ωDTx) using the firing rates at the trough of the response dip following the onset of the RDM (arrow in Figure 7G). The influences of the two Target-in-RF populations on the RDM-in-RF population (ωTxD) were adjusted around ωDTx to mimic the observed separation in mean responses of the RDM-in-RF population to the two directions of motion. Such asymmetry of the influence of the two Target-in-RF populations might arise from the different spatial relationship they might have with the RDM-in-RF population. Such asymmetries are likely for the other pairs of ωs too, but we set them to be equal here to simplify the model. We used the weights of suppression to estimate the underlying unsuppressed mean responses of each of the populations (Figure 7B-C).
We simulated 10,000 trials implementing divisive suppression between the three populations. For simplicity, we did not implement any temporal dynamics to the suppression and computed it based on the responses in the preceding time window. The weight of suppression varied every 10 ms, and was drawn from a normal distribution whose mean was the estimated weight ωij and the variance 0.3ωij.
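The diffusion component N(t) of the Target-in-RF populations can be sketched as a running sum of independent normal increments with variance V²Δt. This sketch covers only that component: the drift term and the divisive suppression are omitted. The 1 ms time step and the function names are assumptions, since the text does not state them.

```python
import math
import random

V = 40.0    # diffusion scale given in the text
DT = 0.001  # time step in seconds (assumed; not stated in the text)

def diffusion_component(n_steps, rng):
    """Running sum N(t) of independent N(0, V^2 * DT) increments."""
    total, path = 0.0, []
    for _ in range(n_steps):
        total += rng.gauss(0.0, V * math.sqrt(DT))
        path.append(total)
    return path

def variance_at_end(n_trials, n_steps, seed=0):
    """Across-trial sample variance of N(t) at the last time step."""
    rng = random.Random(seed)
    ends = [diffusion_component(n_steps, rng)[-1] for _ in range(n_trials)]
    mean = sum(ends) / n_trials
    return sum((e - mean) ** 2 for e in ends) / (n_trials - 1)
```

For a running sum of this kind the across-trial variance grows linearly with elapsed time, so over the 320 ms between 180 and 500 ms it should approach V² × 0.32.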
Competing Interests
None to declare
Footnotes
§ Deceased