Abstract
It is commonly held that endogenous (voluntary) attention acts via top-down mechanisms and exogenous (involuntary) attention via bottom-up mechanisms, and that both affect visual areas similarly. Using an fMRI ROI-based analysis for occipital areas, we measured average fMRI activity for valid (target at cued location) and invalid (target at un-cued location) trials, for pre- or post-cueing in the endogenous and exogenous conditions independently, with the same observers and task. The results show: (1) a stronger effect for both types of attention in regions contralateral than ipsilateral to the attended hemifield; (2) higher fMRI activity at the valid- than invalid-cued locations; (3) increasing modulation of fMRI activity along the visual hierarchy for endogenous, but constant modulation for exogenous, pre-cueing; (4) constant modulation along the visual hierarchy for endogenous, but no modulation for exogenous, post-cueing. Endogenous and exogenous attention distinctly modulate activity in visual areas due to their differential engagement of top-down and bottom-up processes.
INTRODUCTION
Spatial, covert visual attention is the selective processing of visual information in space, without change in gaze. Attention can be allocated voluntarily (endogenous attention) or involuntarily (exogenous attention). These two types of attention often have similar perceptual consequences (reviews by Carrasco 2011; Carrasco and Barbot 2015), but notable exceptions exist (e.g. Carrasco et al. 2006; Yeshurun et al. 2008; Barbot and Carrasco 2017). Endogenous and exogenous attention have each been investigated using neuroimaging, but debates remain concerning the underlying neural networks (see Chica et al. 2013; Beck and Kastner 2014), including the role of the temporo-parietal junction (TPJ; Geng and Vossel 2013; Dugué, Merriam, et al. 2017). Classically, researchers have described endogenous attention as a top-down process, and exogenous attention as a bottom-up process (Posner et al. 1980; Nakayama and Mackeben 1989; Corbetta and Shulman 2002; Beck and Kastner 2009; Carrasco 2011). This characterization stems from fMRI studies investigating these types of attention separately.
Attention alters basic visual processes, such as contrast sensitivity and spatial resolution, which are computed in early visual cortex (Carrasco and Yeshurun 2009; Carrasco 2011; Anton-Erxleben and Carrasco 2013; Carrasco and Barbot 2015). fMRI studies have shown that endogenous attention causes a baseline shift in early visual areas (e.g. Kastner et al. 1999; Somers et al. 1999; O’Connor et al. 2002; Buracas and Boynton 2007; Murray 2008; Herrmann et al. 2010; Pestilli et al. 2011; review by Beck and Kastner 2014) and increases the dynamic range of fMRI responses (Li et al. 2008; Lu et al. 2011). Additionally, single unit monkey neurophysiology has shown effects of covert attention in occipital areas (e.g. McAdams and Maunsell 1999; Reynolds et al. 2000; Martínez-Trujillo and Treue 2002; Williford and Maunsell 2006; Mitchell et al. 2009; Reynolds and Heeger 2009; Ruff and Cohen 2014; Luo and Maunsell 2015). Comparatively, little is known about the effect of exogenous attention on visual areas both from single-unit (Busse et al. 2008; Wang et al. 2015) and fMRI (Liu et al. 2005; Müller and Kleinschmidt 2007; Müller and Ebeling 2008; Heinen et al. 2011; Mulckhuyse et al. 2011) studies.
Typically, attention is manipulated by presenting a pre-cue, prior to the target. Endogenous post-cues, presented after target offset, can also improve performance by affecting the information readout (Kinchla et al. 1995; Nobre et al. 2004; Ruff et al. 2007; Hulme et al. 2009). Exogenous post-cues affect performance in some tasks (Sergent et al. 2013), but not in others (Carrasco and Yeshurun 1998; Gobell and Carrasco 2005; Anton-Erxleben et al. 2007; Fuller et al. 2009). Neuroimaging studies of endogenous attention have shown that post-cues modulate fMRI activity in early visual areas (Nobre et al. 2004; Vogel et al. 2005; Hulme et al. 2009; Pestilli et al. 2011; Sergent et al. 2011), but the only study evaluating post-cues in exogenous attention showed no such modulation (Liu et al. 2005), and no single study has directly compared post-cues in endogenous and exogenous attention.
Studies of endogenous and exogenous attention have focused on parietal and frontal areas; it is often assumed that the effects of these types of attention are the same in striate and extra-striate areas (Corbetta and Shulman 2002; Peelen et al. 2004; Corbetta et al. 2008; Beck and Kastner 2014). Furthermore, these two types of attention have not been appropriately compared: the experimental tasks, stimuli, and task difficulty have differed and in these comparisons the behavioral effects have rarely been concurrently assessed. Thus, it remains unclear whether the mechanisms underlying these two types of attention rely on top-down and bottom-up projections, respectively.
Here, with the same participants, task and stimuli for all attention manipulations, we tested the following four hypotheses: (1) Pre-cueing should induce an attentional modulation of fMRI activity, higher in the valid than the invalid condition, in which attention needs to be reoriented to the opposite location to perform the task (Liu et al. 2005; Natale et al. 2009; Shulman et al. 2009; Dugué, Merriam, et al. 2017); (2) both endogenous and exogenous pre- and post-cueing effects should be stronger in contralateral than ipsilateral visual regions relative to the attended hemifield (Liu et al. 2005; Pestilli et al. 2011); (3) such modulation should increase along the visual hierarchy (e.g. higher in V4 than in V1) for endogenous attention (Kastner et al. 1999; Maunsell and Cook 2002; Pestilli et al. 2011; Chica et al. 2013; Beck and Kastner 2014), but this should not be the case for exogenous attention. The logic behind this proposal is that for endogenous attention, a top-down process, modulations from higher-order attentional regions will send feedback information to visual cortex with diminishing effects in earlier visual areas, given the increased distance from the source; (4) post-cueing should induce an attentional modulation of fMRI activity in early visual areas for endogenous attention. Indeed, we expect that voluntary, endogenous attention would facilitate reading out perceptual information and modulate its processing (Nobre et al. 2004; Vogel et al. 2005; Hulme et al. 2009; Pestilli et al. 2011; Sergent et al. 2011), whereas this would not be the case for exogenous attention (Carrasco and Yeshurun 1998; Liu et al. 2005; but see Sergent et al. 2013; Thibault et al. 2016).
To test these four hypotheses, we compared the effects of endogenous and exogenous attention in early visual areas while the same observers performed the same task. We measured fMRI activity while observers performed a 2-AFC orientation discrimination task, contingent upon contrast sensitivity (Nachmias 1967; Carrasco et al. 2000; Pestilli et al. 2009), with a fully-crossed design: two attention conditions—endogenous or exogenous attentional orienting—and two types of cueing—pre- or post-cue. Moreover, given the ubiquitous performance tradeoffs at attended (benefits) and unattended (costs) locations compared to a neutral condition (e.g. Pestilli and Carrasco 2005; Giordano et al. 2009; Montagna et al. 2009; Herrmann et al. 2010), we evaluated fMRI activity at both the cued and the un-cued locations.
The results, which confirmed all four hypotheses, suggest that endogenous and exogenous attention distinctly modulate activity in retinotopic early visual areas due to their differential engagement of top-down and bottom-up processes and their respective temporal dynamics. We discuss how these retinotopically specific neural correlates further our understanding of visual attention.
MATERIALS & METHODS
The methods employed in this study as well as their description are identical to those we reported in a recent study, in which we compared activity in TPJ during orienting and reorienting of endogenous and exogenous attention (Dugué, Merriam, et al. 2017). We used optimal spatial and temporal parameters to maximize the effects of these two types of attention, i.e. the benefits at the attended location and costs at the unattended location, on an orientation discrimination task. The same observers performed the same discrimination task under endogenous and exogenous attention to enable direct comparison between these conditions (reviews by Carrasco 2011; Carrasco and Barbot 2015).
Observers
Five observers (three female, 24-30 years-old) with normal or corrected-to-normal vision participated in the study. Observers provided written informed consent. The University Committee on Activities Involving Human Subjects at New York University approved the experimental protocol. Each observer participated in 10 scanning sessions: one session to obtain a set of three high-resolution anatomical volumes, two sessions for retinotopic mapping, three sessions for the exogenous attention condition and three sessions for the endogenous attention condition (order counterbalanced between observers). Prior to the first scanning session of each attention condition, observers performed several practice sessions outside the scanner.
Stimuli
Stimuli were generated using MATLAB (MathWorks) and the MGL toolbox (Gardner et al., 2018a) on a Macintosh computer. Stimuli were displayed on a flat-panel display (NEC, LC-XG250 MultiSync LCD 2110; resolution: 1024 × 768 pixels; refresh rate: 60 Hz) housed in a Faraday box with an electrically conductive glass front, positioned at the rear of the scanner bore. Observers viewed the display through an angled mirror attached to the head coil, at a viewing distance of 172 cm. The display was calibrated and gamma corrected using a linearized lookup table. A white fixation cross (0.3°) was present at the center of the screen throughout the experiment. The stimuli consisted of two gratings (4 cpd) windowed by raised cosines (3° of diameter; 7% contrast). The stimuli were presented in the bottom quadrants (5° horizontal eccentricity; –2.65° azimuth). Endogenous cues were white rectangles (0.7°), positioned adjacent to the fixation cross indicating one of the two lower quadrants (0.35° horizontal eccentricity from the edge of the fixation cross, and 0.35° azimuth). Exogenous cues were also white rectangles (0.7°), but were positioned adjacent to an upcoming grating stimulus, above the horizontal meridian (1° away from the edge of the grating stimulus; and the edge of the cue 4.44° horizontal eccentricity from the edge of the fixation cross) and vertically aligned with the stimulus.
Behavioral procedure
A single trial lasted 1700 ms for the exogenous attention condition and 1900 ms for the endogenous attention condition (Figure 1; note that for illustration purposes, the display is not to scale). In 40% of the trials (pre-cue condition), a cue was shown first, followed by the pair of gratings. In 40% of the trials (post-cue condition), the order of presentation of the cue and the gratings was reversed. In 10% of the trials, the gratings were not presented (‘cue-only’ trials). In the remaining 10% of the trials, neither a cue nor the gratings were presented (‘blank’ trials). For both pre-cue and post-cue trials, observers were instructed to report the orientation of a target grating, i.e., clockwise or counter-clockwise compared to vertical, by pressing one of two keys. For cue-only and blank trials, observers were asked to press a third key.
Cues (both exogenous and endogenous) were presented for 60 ms, indicating either the left or the right grating location. The inter-stimulus interval (ISI) between the pre-cues and the gratings was 50 ms for exogenous cues and 250 ms for endogenous cues, resulting in stimulus-onset asynchronies (SOA) of 110 ms and 300 ms, optimal delays to manipulate exogenous and endogenous attention respectively and maximize their behavioral consequences (Nakayama and Mackeben 1989; Mackeben and Nakayama 1993; Liu, Stevens, et al. 2007; Müller 2014). The behavioral effects of endogenous attention are sustained (e.g. Ling and Carrasco 2006) and can thus still be present in later brain activity, as shown in ERP studies (e.g. Seiss et al. 2009). Further, the brain responses elicited by exogenous and endogenous cues differ during the 300 ms following cue onset. The gratings were shown for 50 ms. For the post-cue trials we used the same timings but inverted the order of presentation, so that the cues followed the stimuli (e.g. Kinchla et al. 1995; Yaffa Yeshurun 1998; Liu et al. 2005; Pestilli et al. 2011). A response cue, presented for 800 ms at the end of the trial, indicated which of the two gratings was the target (50% of the trials at each location). Note that in all four trial conditions (exogenous/endogenous, pre-/post-cueing), the response cue appeared after both the cue and the stimuli had disappeared. The maximum delay between stimulus offset and response cue onset was brief (∼400 ms max in the endogenous condition). This time interval is shorter than that typically associated with a demand for working memory (>600 ms; Phillips 1974). Visual feedback was provided immediately following each trial. The fixation cross turned green or red to indicate a correct or incorrect response, respectively. If observers had not pressed any key after 530 ms, the fixation cross did not change color, indicating that they had missed the response window.
Exogenous, peripheral cues were not informative regarding the target location or orientation; the cue and the target location matched in 50% of the trials (valid trials), but not in the other 50% of the trials (invalid). Endogenous, central cues were informative of the target location but not its orientation; cues pointed towards the target in 75% of trials (valid trials), but not in the remaining 25% of trials (invalid).
Each of the 3 sessions of the exogenous condition and 3 sessions of the endogenous condition consisted of 14 runs of 40 trials each, as well as an additional stimulus localizer run (see MRI procedure). Prior to the first session of each attentional scanning condition, observers performed two practice sessions outside the scanner. The tilt of the grating was adjusted so that each observer would achieve ∼80% correct performance in the valid trials. During the scanning sessions, the tilt was adjusted between runs to maintain this overall performance level. Eye position was monitored using an infrared video camera system (Eyelink 1K, SR Research, Ottawa, Ontario, http://www.sr-research.com/EL_1000.html). Trials in which the observers blinked or broke fixation (1.5° away from fixation) at any point during the trial sequence (from fixation onset to response cue offset; 13% ± 4% of the trials on average across all observers) were identified and removed from the behavioral analysis (see Results), and regressed separately in the MRI analysis (see below).
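The trial budget implied by this design can be sketched as simple arithmetic. The following Python snippet is only an illustration: it assumes that the stated proportions (40/40/10/10% trial types; 50% or 75% cue validity) hold exactly over the three sessions and it ignores trials later excluded for blinks or fixation breaks. All names are hypothetical.

```python
from fractions import Fraction

# Design constants taken from the text.
RUNS_PER_SESSION = 14
TRIALS_PER_RUN = 40
SESSIONS_PER_CONDITION = 3

# Proportion of each trial type (from the behavioral procedure).
TRIAL_TYPE_PROPORTIONS = {
    "pre-cue": Fraction(40, 100),
    "post-cue": Fraction(40, 100),
    "cue-only": Fraction(10, 100),
    "blank": Fraction(10, 100),
}

# Proportion of valid cues per attention condition (from the text).
CUE_VALIDITY = {"exogenous": Fraction(1, 2), "endogenous": Fraction(3, 4)}


def nominal_trial_counts(attention):
    """Nominal trial counts per attention condition, assuming exact proportions."""
    total = RUNS_PER_SESSION * TRIALS_PER_RUN * SESSIONS_PER_CONDITION
    per_type = {name: int(total * p) for name, p in TRIAL_TYPE_PROPORTIONS.items()}
    counts = {
        "total": total,
        "cue-only": per_type["cue-only"],
        "blank": per_type["blank"],
    }
    validity = CUE_VALIDITY[attention]
    for cueing in ("pre-cue", "post-cue"):
        n_valid = int(per_type[cueing] * validity)
        counts[cueing + " valid"] = n_valid
        counts[cueing + " invalid"] = per_type[cueing] - n_valid
    return counts
```

Under these assumptions, the endogenous condition yields fewer invalid than valid trials per cueing type, which is worth keeping in mind when comparing the precision of valid and invalid response estimates.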
MRI Procedure
MRI scanning
Imaging was conducted on a 3T Siemens Allegra head-only scanner (Erlangen, Germany). Padding was used to minimize observers’ head movements. Anatomical images were acquired using a Siemens NM-011 head coil to transmit and receive. In a single scanning session for each observer, three high-resolution anatomic images were acquired using a T1-weighted magnetization-prepared rapid gradient echo (MPRAGE) sequence (FOV = 256 × 256 mm; 176 sagittal slices; 1 × 1 × 1 mm voxels). These three images were co-registered and averaged. We then used the public domain software FreeSurfer (http://surfer.nmr.mgh.harvard.edu), to segment the gray matter from these averaged anatomical volumes. All subsequent analyses were constrained only to voxels that intersected gray matter.
Functional images were acquired with a receive-only 8-channel surface coil array (Nova Medical, Wilmington, MA). A T2*-weighted echo-planar imaging sequence (TR = 1750 ms; TE = 30 ms; flip angle = 90°) measured blood oxygen level-dependent (BOLD) changes in image intensity (Ogawa et al. 1990). One volume contained 28 slices oriented 45° to the calcarine sulcus and covered the occipital and posterior parietal lobes (FOV = 192 × 192 mm; resolution = 2 × 2 × 2.5 mm; no gap). In each session, we acquired T1-weighted anatomical images in the same slices as the functional images (spin echo; TR = 600 ms; TE = 9.1 ms; flip angle = 90°; resolution = 1.5 × 1.5 × 3 mm). The in-plane images were used to align functional images from different sessions to the same high-resolution anatomical volume for each participant, using a robust image registration algorithm.
Pre-processing of the MRI data
Imaging data were analyzed using mrTools (Gardner et al., 2018b) and custom software written in MATLAB. The first eight volumes of each run were discarded to allow longitudinal magnetization to reach steady-state. The measurements of the B0 static magnetic field performed in each session were used to correct for spatial distortion. Pre-processing of the functional data included motion correction, linear trend removal, and temporal high-pass filtering (cutoff: 0.01 Hz) to remove low-frequency noise and slow drifts in the fMRI signal.
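As an illustration of the last two pre-processing steps, the sketch below removes a linear trend and then subtracts every Fourier component below the 0.01 Hz cutoff from a single time series. It is a minimal stand-in written in plain Python, not the mrTools implementation; the Fourier-projection filter and the function names are assumptions made for this example.

```python
import math

TR_S = 1.75        # repetition time in seconds (from the acquisition parameters)
CUTOFF_HZ = 0.01   # high-pass cutoff (from the text)


def remove_linear_trend(series):
    """Subtract the least-squares line (intercept and slope) from the series."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - y_mean - slope * (t - t_mean) for t, y in enumerate(series)]


def high_pass(series, tr_s=TR_S, cutoff_hz=CUTOFF_HZ):
    """Remove Fourier components with 0 < frequency < cutoff (the mean is kept)."""
    n = len(series)
    duration_s = n * tr_s
    filtered = list(series)
    k = 1  # number of cycles per scan for the current component
    while k / duration_s < cutoff_hz and k < n / 2:
        cos_k = [math.cos(2 * math.pi * k * t / n) for t in range(n)]
        sin_k = [math.sin(2 * math.pi * k * t / n) for t in range(n)]
        a = 2 / n * sum(y * c for y, c in zip(series, cos_k))
        b = 2 / n * sum(y * s for y, s in zip(series, sin_k))
        filtered = [f - a * c - b * s for f, c, s in zip(filtered, cos_k, sin_k)]
        k += 1
    return filtered


def preprocess(series):
    """Linear detrending followed by high-pass filtering."""
    return high_pass(remove_linear_trend(series))
```

For a 160-volume run at TR = 1.75 s (280 s), only the components at one and two cycles per scan fall below 0.01 Hz, so the filter leaves task-related fluctuations at higher frequencies untouched.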
Retinotopic mapping
Retinotopic mapping procedures followed well-established methods using conventional traveling-wave, phase-encoded methods. Phase maps of polar angle were measured using clockwise and counter-clockwise rotating checkerboard wedges, and eccentricity maps were measured using concentric and eccentric checkerboard rings (Engel et al. 1994; Sereno et al. 1995; Engel et al. 1997; Larsson and Heeger 2006). Figure 2 (left panel) shows the borders of visual areas drawn by hand on flattened surface representations of the brain following published conventions (Engel et al. 1997; Larsson and Heeger 2006; Liu, Larsson, et al. 2007; Wandell et al. 2007).
Stimulus localizer
The stimuli were the same size, spatial frequency, and location as those in the main experiment, but they were at full contrast, and their orientation and phase were randomly changed every 200 ms to avoid adaptation. A localizer run consisted of a two-condition block alternation protocol with 16 cycles of 17.5 s each (8.75 s with the stimuli on, 8.75 s with the stimuli off). Observers completed one localizer run in each scanning session of the main experiment (6 runs overall, 4 min each). Observers were instructed to fixate a cross at the center of the screen throughout each run. Data were averaged across the 6 runs and analyzed using the same methods as for the retinotopic mapping scans, to define the cortical representation of the gratings. Each retinotopic ROI was further restricted to voxels that responded positively during the blocks when the grating stimuli were presented. A sinusoid was fit to the fMRI time series from each voxel. To be conservative, voxels were included in the ROI if the best-fit sinusoid had a phase value between 0 and π, and if the correlation (technically, coherence) between the best-fit sinusoid and the time series was greater than 0.2 (Figure 2, right panel). In addition, we conducted the analysis without restricting the ROI to this coherence level, yielding similar results that supported the same conclusions.
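The voxel-selection rule can be sketched as follows. This Python example assumes a common definition of coherence (the Fourier amplitude at the stimulus frequency divided by the square root of the summed squared amplitudes over all non-zero frequencies) and a phase convention in which the best-fit sinusoid is cos(2πft − phase). Both are assumptions for illustration, as is the requirement that the series contain a whole number of stimulus cycles; the function names are hypothetical.

```python
import cmath
import math


def fourier_component(series, k):
    """Complex DFT coefficient of the series at k cycles per scan."""
    n = len(series)
    return sum(y * cmath.exp(-2j * math.pi * k * t / n) for t, y in enumerate(series))


def coherence_and_phase(series, n_cycles):
    """Coherence and phase (radians, in [0, 2*pi)) at the stimulus frequency."""
    n = len(series)
    mean = sum(series) / n
    centered = [y - mean for y in series]
    amplitudes = [abs(fourier_component(centered, k)) for k in range(1, n // 2 + 1)]
    total = math.sqrt(sum(a * a for a in amplitudes))
    coefficient = fourier_component(centered, n_cycles)
    coherence = abs(coefficient) / total if total > 0 else 0.0
    # For cos(2*pi*k*t/n - phase) the DFT coefficient has angle -phase.
    phase = (-cmath.phase(coefficient)) % (2 * math.pi)
    return coherence, phase


def include_voxel(series, n_cycles=16, min_coherence=0.2):
    """Apply the inclusion rule from the text: 0 < phase < pi and coherence > 0.2."""
    coherence, phase = coherence_and_phase(series, n_cycles)
    return coherence > min_coherence and 0 < phase < math.pi
```

A voxel whose response lags the stimulus by less than half a cycle has a phase between 0 and π under this convention and is retained if its coherence exceeds the threshold.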
Event-related analysis
fMRI time series in the main experiment were averaged across voxels in each ROI (separately for each hemisphere) and then concatenated across runs. The data were denoised using GLMDenoise (Kay et al. 2013). fMRI response amplitudes were then computed from the denoised data using linear regression, with twelve regressors: right and left valid pre-cue, right and left invalid pre-cue, right and left valid post-cue, right and left invalid post-cue, right and left cue-only, blank (neither cue nor stimulus) and eye-movements (trials in which observers broke fixation or blinked). The resulting fMRI response amplitudes (for correct trials only) were then averaged across observers, separately for each ROI and separately for each hemisphere.
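The regression step can be illustrated with a reduced example. The sketch below fits two event regressors plus a constant by ordinary least squares; the actual analysis used twelve regressors and GLMDenoise, neither of which is reproduced here. The response kernel, onsets and function names are hypothetical and chosen only to keep the example self-contained.

```python
# Hypothetical response kernel in units of volumes; NOT the HRF used in the study.
KERNEL = [0.0, 0.5, 1.0, 0.5, 0.0]


def event_regressor(onsets, kernel, n_volumes):
    """Sum a copy of the kernel at each event onset (given in volumes)."""
    column = [0.0] * n_volumes
    for onset in onsets:
        for lag, weight in enumerate(kernel):
            if onset + lag < n_volumes:
                column[onset + lag] += weight
    return column


def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a square system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_glm(columns, series):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(a * y for a, y in zip(ci, series)) for ci in columns]
    return solve_linear_system(xtx, xty)
```

Each returned coefficient is the response amplitude for one regressor; in the full design, one such amplitude would be obtained per trial type, ROI and hemisphere.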
RESULTS
Endogenous and exogenous attention improve performance
Attention improved orientation discrimination (both accuracy and reaction time), with no evidence of a speed-accuracy trade-off (Figure 3). Observers performed a 2-AFC orientation-discrimination task under two attentional conditions (exogenous or endogenous attention), when the cue was presented either before (pre-cue) or after (post-cue) the grating stimuli (see Methods, Figure 1), and while their brain activity was measured with fMRI. The cue was either valid or invalid (50/50% of the time in the exogenous condition and 75/25% in the endogenous condition, respectively).
We calculated performance accuracy (d’) in each condition, for each observer separately (Figure 3, top row). We conducted a three-way repeated measures 2×2×2 ANOVA (exogenous/endogenous × valid/invalid × pre/post-cue). There was better performance for valid than invalid cues (F(1,4)=23.6, p=0.008). Exogenous versus endogenous cues were statistically indistinguishable (F(1,4)<1), and there was no evidence for a difference between pre- versus post-cues (F(1,4)<1). None of the two- or three-way interactions were significant. Reaction time was also calculated in each condition, for each observer separately (Figure 3, bottom row). The corresponding three-way ANOVA revealed that reaction times were faster for valid than invalid cues (F(1,4)=62.3, p=0.001). There was no evidence for a difference between exogenous and endogenous cues (F(1,4)=2.7, p=0.17), nor between pre-cues and post-cues (F(1,4)=1.7, p=0.27). There were two significant interactions, indicating that the differences between pre-cues and post-cues (F(1,4)=8.1, p=0.047) and between valid cues and invalid cues (F(1,4)=16.2, p=0.02) were more pronounced for endogenous than for exogenous attention.
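For readers unfamiliar with d’, a minimal sketch of its computation follows. It assumes the usual convention for a two-choice orientation discrimination, in which one response category (e.g. clockwise) is arbitrarily treated as the signal, and it applies a log-linear correction to avoid infinite values at rates of 0 or 1. Both choices are assumptions of this example and are not stated in the text.

```python
from statistics import NormalDist


def d_prime(hits, n_signal, false_alarms, n_noise):
    """d' = z(hit rate) - z(false-alarm rate), with a log-linear correction.

    hits: 'clockwise' responses on clockwise trials (assumed signal category).
    false_alarms: 'clockwise' responses on counter-clockwise trials.
    """
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

Because d’ separates sensitivity from response bias, it is a more suitable accuracy index than percent correct when observers may favor one response.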
Attention modulated activity in visual cortex
For each ROI, i.e. V1, V2, V3 (for V2 and V3, ventral and dorsal ROIs were averaged), V3A, hV4 and LO1, we measured the fMRI response amplitudes for each attentional condition – exogenous and endogenous – and each cueing condition – pre- and post-cueing, for the contralateral and ipsilateral side to the cued location (Figure 4). fMRI responses were significantly larger for valid than invalid cues, for both endogenous and exogenous cues, and for both pre- and post-cues (Figures 4 and 5).
We first analyzed the fMRI responses evoked by each type of attention as a function of the contralateral and ipsilateral brain activity relative to the cue location (Figure 5). ANOVAs indicated that there was higher contralateral than ipsilateral activity across brain areas (endogenous: F(1,4)=59.9, p=0.0015; exogenous: F(1,4)=218.8, p=0.0001). For both types of attention, this difference was more pronounced for valid than invalid cues (endogenous: F(1,4)=8.6, p=0.04; exogenous: F(1,4)=21.1, p=0.01).
We then considered the same data, restricted to the fMRI activity in the ROI contralateral to the cued location, for each experimental condition (Figure 5). ANOVAs conducted separately for each ROI revealed a main effect of validity, i.e. higher activity for valid than invalid cued locations in all areas (p<.05), except V1.
Endogenous and exogenous attention differed in terms of their respective effects across the hierarchy of visual cortical areas. This is illustrated in Figure 6, in which we plot the differences between valid and invalid cueing for each type of attention. This figure shows the areas for which activity differed: t-tests revealed that for both endogenous (p < 0.05 for V3A, hV4 and LO1; trend (p = 0.077) for V3) and exogenous (p < 0.05 for V3, V3A, hV4 and LO1; trend (p = 0.074) for V2) attention, pre-cues elicited greater fMRI activity for valid than invalid cues. In the endogenous pre-cueing condition, the activity difference evoked by valid and invalid cues increased along the hierarchy of the visual areas (top, left panel). This was not the case for exogenous attention; there were no reliable differences across the visual hierarchy in the enhanced activity caused by pre-cueing (top, right panel). Furthermore, t-tests revealed that for endogenous attention (p < 0.05 for V2, V3, V3A, hV4 and LO1; trend (p = 0.079) for V1; bottom, left panel), but not for exogenous attention (all p > 0.1; bottom, right panel), post-cues elicited greater fMRI activity for valid than invalid cues in occipital areas.
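The valid-versus-invalid comparison within one ROI can be sketched as a paired t-test across the five observers. The text does not specify whether the tests were paired or one- or two-tailed, so the paired, two-tailed form below is an assumption, and the amplitudes used in the usage note are placeholders rather than data from the study.

```python
import math
from statistics import mean, stdev

# Two-tailed critical value of Student's t for alpha = 0.05 and 4 degrees of freedom.
T_CRIT_DF4_TWO_TAILED_05 = 2.776


def paired_t(valid, invalid):
    """Paired t statistic and degrees of freedom for valid minus invalid."""
    diffs = [v - i for v, i in zip(valid, invalid)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_value, n - 1


def is_significant(valid, invalid):
    """Two-tailed test at alpha = 0.05; only valid for five observers (df = 4)."""
    t_value, df = paired_t(valid, invalid)
    if df != 4:
        raise ValueError("critical value hard-coded for df = 4")
    return abs(t_value) > T_CRIT_DF4_TWO_TAILED_05
```

As a usage example with made-up amplitudes, `paired_t([0.5, 0.6, 0.7, 0.8, 0.9], [0.3, 0.5, 0.4, 0.6, 0.8])` returns a t statistic with 4 degrees of freedom that can be compared against the critical value above.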
Lastly, we evaluated the degree to which inter-individual variability in behavioral responses co-varied with variability in fMRI responses. We found a positive correlation between fMRI activity and behavioral performance (Figure 7). We computed the correlation, across observers, between the fMRI responses (percent change in image intensity) and behavioral performance accuracy (d’). For both types of attention, we found a positive correlation between fMRI activity and d’ (endogenous: Pearson correlation r = 0.3, p = 0.003; exogenous: r = 0.2, p = 0.02).
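The brain-behavior relationship is summarized by a Pearson correlation, which can be sketched as below. This is a generic implementation for paired lists of fMRI response amplitudes and d’ values; how the data points were pooled across observers, ROIs and conditions is not restated here, and the variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A positive coefficient indicates that larger fMRI responses tend to accompany higher discriminability, which is the pattern reported for both types of attention.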
DISCUSSION
This is the first study to compare endogenous and exogenous attention in visual cortex, while concurrently assessing visual performance on a discrimination task using well-established psychophysical protocols to manipulate attention. The fact that the same observers performed the same task with the same stimuli under different attentional manipulations enabled us to isolate the fMRI activity induced by each type of attention.
The few previous studies that have compared endogenous and exogenous attention have focused on parietal and frontal areas. It is often assumed that the effects of these types of attention are the same in striate and extra-striate areas (Corbetta and Shulman 2002; Peelen et al. 2004; Corbetta et al. 2008; Beck and Kastner 2014). The conclusions that can be drawn from the few studies that have compared them are limited because they have used different stimuli and/or tasks for each type of attention and have not assessed behavioral performance while measuring fMRI activity. For instance, Kincade et al. (2005) used reaction time detection tasks, in which the attention effects could have affected discriminability, speed of processing or criterion (Reed 1973; Wickelgren 1977; Carrasco and McElree 2001), and they did not monitor eye position while observers performed the task in the scanner (e.g. Mayer et al. 2004; Peelen et al. 2004; see Dugué, Merriam, et al. 2017 who published a table summarizing these methodological problems for studies regarding TPJ activation).
To further our knowledge of the neural correlates of attention, we investigated both attentional orienting (valid cueing) and reorienting (invalid cueing), critical in an ever-changing environment (Dugué et al. 2016; Dugué, Xue, et al. 2017). Furthermore, given ubiquitous performance tradeoffs between attended (benefits) and unattended (costs) locations (e.g. Pestilli and Carrasco 2005; Giordano et al. 2009; Montagna et al. 2009; Herrmann et al. 2010), we assessed activity at both attended (contralateral ROI) and unattended (ipsilateral ROI) locations. Finally, we investigated how attentional effects varied as a function of pre- and post-cueing, thus contrasting the neural correlates of perception with those of post-perceptual processing of information.
There was an overall positive correlation between performance in the orientation discrimination task and the degree of attentional modulation indexed by modulation in fMRI activity. This pattern, indicating that attentional modulation of fMRI activity increases with discriminability, is expected, but very few studies have reported such a correlation (e.g. Liu et al. 2005). The enhanced performance brought about by the valid, uninformative peripheral pre-cue is consistent with an automatic, bottom-up involuntary capture of exogenous attention, which has been shown in several psychophysical studies (e.g. Dosher and Lu 2000; Carrasco et al. 2004; Pestilli and Carrasco 2005; Giordano et al. 2009; Herrmann et al. 2010). The enhanced performance brought about by the valid, informative central pre-cue is consistent with a top-down, voluntary deployment of endogenous attention (e.g. Dosher and Lu 2000; Ling and Carrasco 2006; Giordano et al. 2009; Liu et al. 2009; Herrmann et al. 2010).
In the endogenous attention condition, there was an increase in attentional modulation of stimulus-evoked activity along the hierarchy of visual areas. Such a pattern is consistent with previous studies suggesting that endogenous attention is a top-down modulation from frontal and parietal areas feeding back to visual cortex, with diminishing effects in earlier visual areas (Kastner et al. 1999; Maunsell and Cook 2002; Kastner and Pinsk 2004; Chica et al. 2013). Contrary to previous studies (e.g. Boynton et al. 1999; Brefczynski and DeYoe 1999; Somers et al. 1999; Herrmann et al. 2010; Pestilli et al. 2011), no attentional modulation was evident in V1. It might be that attentional modulation of V1 activity is more variable than other visual cortical areas, making it harder to detect (see also Kastner et al. 1999; Liu et al. 2005). Methodological differences between this and previous studies may have contributed to weakening the effect of attention in V1. The accrual time in the current endogenous condition was relatively short (1300 ms in the valid condition and 500 ms in the invalid condition) compared to previous studies investigating endogenous, voluntary attention, in which the cue and/or stimuli were presented for a long duration to maximize BOLD measurements (e.g. Boynton et al. 1999; Brefczynski and DeYoe 1999; Somers et al. 1999; Pestilli et al. 2011). This short accrual time may have limited the effects of attentional feedback to V1.
In the exogenous attention condition, in contrast to the endogenous attention condition, the attentional modulation did not increase across the visual hierarchy. Some previous studies have reported a similar effect (Müller and Kleinschmidt 2007; Müller and Ebeling 2008), others a decrease (Heinen et al. 2011), and yet others an increase (Liu et al. 2005; Mulckhuyse et al. 2011) across the visual areas. This difference might be explained by different task parameters. For example, in the Liu et al. (2005) study, observers knew which of the two stimuli was the target they had to discriminate as soon as the stimuli were displayed; one stimulus was vertical and the other was tilted to the left or the right. In the present study, both stimuli were independently tilted and observers did not know which one was the target and which one was the distractor until later, when the response cue appeared.
A previous study comparing endogenous and exogenous attention conditions to a neutral condition claims that the effect on early visual areas, specifically in right LO, was stronger in the endogenous than the exogenous condition (Kincade et al. 2005). However, this comparison is problematic because the stimuli used in the three conditions differed, and observers did not perform the task during the scanning sessions. Another study also reported a stronger effect for endogenous than exogenous attention in early visual areas, specifically in the cuneus and the middle and superior occipital gyri (Mayer et al. 2004). However, the dependent, behavioral variable was reaction times (RT) in a detection task. Yet, any differences in RT can be due to changes in speed of processing, discriminability, and/or decision criteria (Reed 1973; Wickelgren 1977; Carrasco and McElree 2001).
Unlike in the pre-cueing condition, in which the endogenous attention effect increased along the processing stream, the endogenous post-cueing effect remained constant across these visual areas. The constant effect in the post-cue condition could be due to the contribution of two factors: (1) the fMRI response evoked by the stimulus in early visual areas may decrease along the visual hierarchy; (2) the top-down modulations from frontal and parietal areas feed back to visual cortex with diminishing effects in earlier visual areas. In the exogenous condition, there was no significant post-cueing effect on early visual areas. This result is consistent with that of Liu et al. (2005), who, while evaluating exogenous attention effects on occipital cortex, included a post-cue condition to rule out sensory contamination of the cue contributing to the enhanced BOLD activity found in their pre-cue condition.
The ROI-based analysis that we followed here enabled us to compare contralateral and ipsilateral modulation of BOLD activity, thus providing additional information regarding the differences in processing dynamics between the two types of attention. We observed a larger difference between contralateral and ipsilateral areas for the valid than for the invalid cueing condition. This effect could be due to the fact that, for the former, observers were attending to the same location throughout the trial, whereas for the latter, when the response cue did not match the pre-cue, observers had to switch their spatial attention to the opposite stimulus location, so that activity at that new location would be accumulated for less time. For instance, for endogenous attention in the valid condition, observers had been processing the target for almost 500 ms before the response cue appeared (see Figure 1). When the response cue matched the pre-cue, observers continued processing and reading out the signal from that location for up to 800 ms (they were not allowed to give an answer before the end of the response cue period). But when the response cue did not match the pre-cue, observers had to switch after 500 ms to the other (ipsilateral) location, thus accumulating less activity. Similarly, for exogenous attention in the invalid cue condition, the accumulation time is only about 300 ms. This accrual time explanation could also account for the larger difference between contralateral and ipsilateral areas for pre-cues than for post-cues, i.e. there is a 300 ms accumulation when the exogenous pre-cue is invalid, but only 100 ms when the post-cue is invalid. Likewise, the larger modulatory effect for endogenous relative to exogenous attention is consistent with the difference in accrual time.
The results of the present study complement those of a recent study (Dugué, Merriam, et al. 2017), based on the same data set acquired simultaneously, in which we demonstrated that subregions of the temporo-parietal junction (TPJ) that respond specifically to visual stimuli are more active when attention needs to be spatially reoriented (invalid cueing) than when attention remains at the cued location (valid cueing), and that partially overlapping specific visual subregions mediate reorienting after orienting of endogenous or exogenous attention. Together, these two studies provide a comprehensive investigation of endogenous and exogenous attention, and pave the way for rigorous, psychophysics-informed neuroimaging studies of covert spatial attention. Here, because the slice prescription covered only a limited portion of the brain, we concentrated the analysis on visual cortical areas in the occipital lobe. In the future, we plan to perform a single fMRI study in which both the early visual areas and the parieto-frontal areas are simultaneously measured with high resolution, to evaluate the relative contribution of each region to attentional orienting and reorienting, as well as possible interactions among regions in the network. The present findings further our knowledge of the neurophysiological bases of covert attention and have implications for models of visual attention, which should take into account not only the similarities, but also the differences reported here.
In conclusion, the present results show some similarities and reveal important differences in the specific neural correlates of endogenous and exogenous attention in early vision: an increasing modulation of fMRI activity for endogenous attention, but constant modulation for exogenous attention, along the hierarchy of visual occipital areas. We also found reliable and constant modulation of fMRI activity in occipital areas for endogenous post-cueing, but not for exogenous post-cueing, which suggests that endogenous attention facilitates both the encoding and the readout of visual information, whereas exogenous attention facilitates only the encoding of information.
Acknowledgments
This work was supported by: NIH RO1-EY019693 to MC and DJH; NIH RO1-EY016200 to MC; the FYSSEN Foundation and the Philippe Foundation to LD; and the Center for Brain Imaging of New York University. We want to thank members of the Carrasco lab for constructive comments on the manuscript. The authors declare no conflicts of interest.