ABSTRACT
Recently, several studies have demonstrated that visual stimulus routing is subserved by inter-areal gamma-band synchronization, whereas top-down influences are mediated by alpha-beta band synchronization. These processes may implement top-down control if top-down and bottom-up mediating rhythms are coupled via cross-frequency interaction. To test this possibility, we investigated Granger-causal influences among awake macaque primary visual area V1, higher visual area V4 and parietal control area 7a during attentional task performance. Top-down 7a-to-V1 beta-band influences enhanced visually driven V1-to-V4 gamma-band influences. This enhancement was spatially specific and largest for a beta-to-gamma delay of ~100 ms, suggesting a causal relationship. We propose that this cross-frequency interaction mechanistically subserves the attentional control of stimulus selection.
Many cognitive effects in vision can only be explained by invoking the concept of top-down influences (Gilbert & Sigman, 2007). For example, when top-down influences can pre-allocate attention to specific spatial locations, stimulus processing is more accurate and/or faster, even when stimuli remain unaltered. Correspondingly, neurons in early and mid-level visual cortex show enhanced firing rates when processing attended stimuli. These neurophysiological consequences of top-down control need to be mediated by corresponding projections. Indeed, anatomical studies have documented projections in the top-down direction that are at least as numerous as bottom-up projections. Bottom-up and top-down projections show different characteristic laminar patterns of origin and termination, and the pattern of pairwise projections abides by a global hierarchy, in which each area occupies a particular hierarchical level (Felleman & Van Essen, 1991; Hilgetag, O’Neill, & Young, 1996; Markov et al., 2014). Recently, it has been shown that the pattern of anatomical projections is closely correlated to a pattern of frequency-specific directed inter-areal influences. Influences mediated by bottom-up projections are primarily carried by gamma-band synchronization in both macaque and human visual cortex; by contrast, influences mediated by top-down projections are primarily carried by alpha-beta-band synchronization (Bastos, Vezoli, et al., 2015b; Michalareas et al., 2016; van Kerkoerle et al., 2014). A similar association of higher frequency bands with bottom-up and lower frequency bands with top-down projections has also been found in the auditory system (Fontolan, Morillon, Liegeois-Chauvel, & Giraud, 2014) and the hippocampus (Bieri, Bobbitt, & Colgin, 2014; Colgin et al., 2009).
These relations between anatomy and physiology suggest that gamma rhythms are more involved in the bottom-up routing of stimulus information, whereas alpha-beta rhythms play a key role in modulating this process through top-down control (Bastos et al., 2012). In support of this, inter-areal gamma-band coherence is enhanced during tasks that involve primarily bottom-up stimulus routing (like pop-out), whereas inter-areal alpha-beta band coherence is enhanced during tasks that require top-down control (like visual search) (Buschman & Miller, 2007; von Stein, Chiang, & König, 2000). During sustained visual attention tasks, and probably also during much of natural viewing, top-down and bottom-up influences are both present simultaneously, with attentional top-down control persistently influencing bottom-up stimulus processing. The coexistence of both rhythms suggests the following scenario for attentional influences on visual processing: 1) Visual stimulation induces bottom-up gamma-band influences among visual areas; 2) Top-down beta-band influences are endogenously generated and thereby stimulus independent; 3) Attention enhances both top-down beta-band and bottom-up gamma-band influences; 4) Cross-frequency interactions result in top-down beta-band influences enhancing subsequent bottom-up gamma-band influences in a spatially specific manner. There is substantial evidence for points 1) to 3), and this paper provides evidence for point 4). Regarding point 1): Gamma-band activity is induced by visual stimulation in lower visual areas (Gray, König, Engel, & Singer, 1989) and it entrains gamma in higher visual areas in a bottom-up manner (Bastos, Vezoli, et al., 2015b; Bosman et al., 2012; Grothe, Neitzel, Mandon, & Kreiter, 2012; Jia, Tanabe, & Kohn, 2013).
Note that in higher visual areas, attention alone can induce gamma among local interneurons, and the projecting pyramidal cells join when visual stimulation commences (Vinck, Womelsdorf, Buffalo, Desimone, & Fries, 2013). Regarding point 2): Beta-band influences among visual areas are present before stimulus onset, and are typically stronger in the top-down direction (Bastos, Vezoli, et al., 2015b). Regarding point 3): Inter-areal gamma-band coherence and bottom-up gamma-band influences are enhanced by attention (Bosman et al., 2012; Grothe et al., 2012), and so are top-down beta-band influences (Bastos, Vezoli, et al., 2015b). In the present study, we will confirm points 1) – 3) for ECoG data recorded from two macaque monkeys, when they direct attention to the contralateral versus ipsilateral hemifield. To investigate point 4), we then analyze spontaneous fluctuations in top-down beta-band and bottom-up gamma-band influences and their mutual interplay. We find that on a moment-by-moment basis, top-down beta-band influences correlate positively with bottom-up gamma-band influences. Spontaneous enhancements in top-down beta-band influences were followed ~100 ms later by enhancements in bottom-up gamma-band influences, suggestive of a causal relation. Consistent with a role in selective attention, this effect was spatially specific, i.e. bottom-up gamma-band influences depended most strongly on the top-down beta-band influences, that were directed to the origin of the bottom-up influence.
RESULTS
Top-down versus Bottom-up Spectral Asymmetries and their Stimulus and Task Dependence
To assess the individual frequency bands for each monkey, we first computed the power spectral densities (PSD) during stimulation in two awake monkeys (monkey K and monkey P) for each region of interest (ROI) of two selected ROI pairs: 7A-V1 and V1-V4 (Fig. 1a, b). The ROI pair 7A-V1 was selected, because it constitutes a clear top-down pathway with documented projections from a very high-level control area to primary visual cortex (Bastos, Vezoli, et al., 2015b; Markov et al., 2014; Michalareas et al., 2016). The ROI pair V1-V4 was selected, because it constitutes a clear bottom-up pathway emerging from V1, i.e. the area targeted by the top-down 7A→V1 influence. For both ROI pairs, the ECoG provided good coverage. Area 7A shows strong beta-band peaks in both monkeys (monkey K: ≈17 Hz; monkey P: ≈13 Hz) (Fig. 2e, f). Areas V1 and V4 show gamma frequency peaks (monkey K: ≈76 Hz; monkey P: ≈60 Hz) (Fig. 2a-d). Beta activity is visible in V4 and V1 of both monkeys at their matching peak frequencies found in area 7A. In area V4 of both monkeys, there are distinct beta peaks. In area V1, monkey K shows a distinct beta peak, and monkey P shows a shoulder in the power spectrum, at the respective beta frequency. We determined the dominant inter-areal communication frequencies for each monkey by calculating the pair-wise phase consistency (PPC), a frequency-resolved measure of synchronization (Vinck, van Wingerden, Womelsdorf, Fries, & Pennartz, 2010), between the V1-V4 and 7A-V1 ROI pairs (Fig. 2g-j). Gamma-band synchronization was present for both ROI pairs in both monkeys with peaks at ≈76 Hz in monkey K and in a range of 58–65 Hz in monkey P. Beta peaks were present between both ROI pairs: at ≈17 Hz in monkey K and at ≈12 Hz in monkey P. Some of the power and PPC spectra also showed a theta-band peak, which is not further investigated, because the focus of this study is on the interaction between beta and gamma rhythms.
For the further analyses, data from both monkeys were combined, by aligning their individual beta and gamma peaks ±10 Hz and averaging across monkeys.
To demonstrate that inter-areal gamma-band synchronization is stimulus driven (Bosman et al., 2012; Grothe et al., 2012), we contrasted PPC between the fixation and stimulation conditions. Fig. 3a, b shows significantly enhanced gamma-band synchronization between ROI pairs V1-V4 and 7A-V1 once the stimulus has appeared, in contrast to an almost flat spectrum when no stimulus is present. This finding is consistent with gamma-band oscillations occurring as a result of stimulus drive. In contrast, beta-band synchronization for both ROI pairs is present already during the pre-stimulus fixation period, suggesting an endogenous origin (Fig. 3a, b). Beta synchronization is maintained during the stimulation period, consistent with an ongoing top-down influence.
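As an illustration of the PPC measure used above, the following minimal sketch computes PPC from a list of per-epoch relative phases at one frequency, following the definition of PPC as the average cosine of all pairwise phase differences (Vinck et al., 2010). The function name and the toy inputs are assumptions made for illustration; the analysis in this study was carried out in FieldTrip, not with this code.

```python
import cmath
import math

def pairwise_phase_consistency(phase_diffs):
    """PPC over a list of relative phases (radians), one value per epoch."""
    n = len(phase_diffs)
    if n < 2:
        raise ValueError("PPC needs at least two epochs")
    # Sum of unit vectors pointing at the observed relative phases.
    resultant = sum(cmath.exp(1j * p) for p in phase_diffs)
    # Equivalent to averaging cos(theta_j - theta_k) over all pairs j != k.
    return (abs(resultant) ** 2 - n) / (n * (n - 1))

# Toy inputs (hypothetical): perfectly locked phases vs. evenly spread phases.
locked = pairwise_phase_consistency([0.3] * 50)
spread = pairwise_phase_consistency([2 * math.pi * k / 50 for k in range(50)])
```

Because the self-pairs are excluded, the expected value of this estimator does not depend on the number of epochs, which is why PPC can be compared between conditions with very different epoch counts.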
We next assessed the dominant directionality of inter-areal synchronization and its attentional modulation. We quantified directionality of synchronization by means of Granger causality (GC) (Bressler & Seth, 2011; Ding, Chen, & Bressler, 2006; Granger, 1969). As shown by Bastos et al. (2015b), and extended to humans by Michalareas et al. (2016), the top-down beta-band influence of area 7A to V1 is significantly greater than the bottom-up beta-band influence of V1 to 7A (Fig. 3e). This top-down beta-band influence is significantly increased when attention is directed to the visual hemifield contralateral to the recording grid (Fig. 3c), consistent with an earlier report (Bastos, Litvak, et al., 2015a). Between V1 and V4, the gamma-band influence is stronger in the bottom-up direction from area V1 to V4 (Fig. 3f). The bottom-up gamma-band influence of V1 to V4 was significantly increased with attention (Fig. 3d).
Top-down Beta Influences and Bottom-up Gamma Influences are Correlated Across Time
Endogenous increases in top-down beta GC may lead to an increase in stimulus-driven bottom-up gamma GC; therefore, we sought to determine whether the observed top-down beta GC and bottom-up gamma GC are consistent with such a scenario. Quantification of correlation between moment-by-moment co-fluctuations in two GC influences is normally precluded by the fact that GC influences are not defined per single data epoch (without substantially sacrificing spectral resolution and/or signal-to-noise ratio). To surmount this problem, we used the recently developed method of Jackknife Correlation (JC), which quantifies the correlation by first calculating GC influences for all leave-one-out subsamples (i.e. the jackknife replications of all epochs) and then correlating these values (Richter, Thompson, Bosman, & Fries, 2015).
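The leave-one-out logic of JC can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of Richter et al. (2015): the `statistic` argument is a hypothetical placeholder for the GC estimate at a given frequency, and a plain mean is used in the example only because it makes the result easy to check.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def jackknife_replications(epochs, statistic):
    """Evaluate `statistic` on every leave-one-out subsample of `epochs`."""
    return [statistic(epochs[:i] + epochs[i + 1:]) for i in range(len(epochs))]

def jackknife_correlation(epochs_a, epochs_b, statistic):
    # Correlate the two sets of jackknife replications across left-out epochs.
    reps_a = jackknife_replications(epochs_a, statistic)
    reps_b = jackknife_replications(epochs_b, statistic)
    return pearson(reps_a, reps_b)

def mean(values):
    return sum(values) / len(values)

# Toy per-epoch values (hypothetical), standing in for two influences.
a = [0.2, 1.1, 0.7, 1.9, 1.4, 0.3]
b = [0.5, 0.9, 1.0, 1.6, 1.2, 0.1]
jc = jackknife_correlation(a, b, mean)
```

For a statistic that is defined per epoch, such as the mean used here, the JC equals the ordinary correlation across epochs. The approach becomes useful for statistics such as GC that can only be estimated from many epochs at once.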
Here we used JC to correlate the top-down GC from a 7A site to a V1 site with the bottom-up GC from the same V1 site to a V4 site. We refer to these configurations of three sites as ‘triplets’, and the JC was determined for all possible 7A→V1→V4 triplets (N=10664, Monkey K: 3944, Monkey P: 6720). The JC was calculated between all possible combinations of top-down frequencies and bottom-up frequencies, both ranging from 1-100 Hz. Fig. 4a shows the average over all triplets from each monkey averaged after alignment of their respective beta and gamma GC peaks. It reveals that top-down beta GC is correlated with bottom-up beta GC, and the same holds for the respective gamma GCs to a lesser extent. Crucially, top-down beta GC also shows a significant positive correlation with bottom-up gamma GC. The peak of this cross-frequency interaction is well aligned with the average 7A→V1 beta and V1→V4 gamma GC peak frequencies (Fig. 4a, line plots – intersection of the dashed lines). Importantly, there is no significant JC between 7A→V1 gamma and V1→V4 beta GC, even though 7A→V1 gamma GC is significantly correlated to V1→V4 gamma GC and 7A→V1 beta GC is significantly correlated to V1→V4 beta GC. Fig. 4b shows a specific triplet selected from monkey K with a sizeable correlation coefficient of rho(2653) = 0.16 between 7A→V1 beta and V1→V4 gamma GC. Again, the area of maximal correlation is well aligned with the top-down beta and bottom-up gamma peak frequencies for this particular triplet. Thus, though the average level of correlation across triplets is relatively low, specific triplet correlations may range from small to moderate in size (Cohen, 1988). We tested for differences in the correlation coefficient between top-down beta and bottom-up gamma GC for the contralateral and ipsilateral conditions separately (Fig. 4c).
There was no significant difference between these conditions, consistent with a mechanism that exists under both attention conditions, such that attentional increases in top-down beta-band influences, as shown above, lead to increased bottom-up gamma-band influences, as we also show.
We next investigated whether the JC between 7A→V1 beta and V1→V4 gamma GC depended on involving the same V1 site, which would demonstrate spatial specificity at the level of recording sites. We tested this spatial specificity by pairing 7A→V1 beta GC to a specific V1 site, with V1→V4 gamma GC from a different V1 site, where the distance that separated the two V1 sites was parametrically varied. Fig. 5a shows for monkey K the boundaries of the recording grid and prominent sulci based on the monkey’s MRI and surgical photographs. For each V1 site, 5 sets of other V1 sites were defined that fell into pre-specified distance intervals (1 cm per interval, stepped by 2.5 mm, between 0 and 2 cm). Fig. 5a shows one example V1 site (arrow) and illustrates with 5 colored lines the five distance intervals (colored lines were slightly displaced for illustration purposes). The mean V1 distance for each distance interval is marked with a filled circle. Fig. 5b shows the resulting JC computed for a ± 5 Hz frequency window around the peak 7A→V1 beta and V1→V4 gamma GC frequencies averaged over triplets and monkeys. It can be seen that as the distance between the two V1 sites increases from zero, there is a monotonic falloff of the correlation coefficient between 7A→V1 beta and V1→V4 gamma GC. This indicates that top-down beta and bottom-up gamma GC influences are not global, but rather are spatially specific, such that the correlation is maximal when the top-down beta GC influence is targeting the same V1 region that is projecting the bottom-up gamma GC influence. This is not trivially explained by GC calculation using the same V1 site, as explained in more detail in the discussion section.
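The overlapping distance intervals described above (1 cm wide, stepped by 2.5 mm, between 0 and 2 cm) can be enumerated as in the following minimal sketch. The function names are hypothetical, and treating each interval as half-open (lower bound included, upper bound excluded) is an assumption, since the text does not specify how boundary cases were handled.

```python
def distance_intervals(width_mm=10.0, step_mm=2.5, max_mm=20.0):
    """Sliding distance intervals between 0 and max_mm, in millimetres."""
    intervals = []
    lower = 0.0
    while lower + width_mm <= max_mm:
        intervals.append((lower, lower + width_mm))
        lower += step_mm
    return intervals

def sites_in_interval(distances_mm, interval):
    """Indices of V1 sites whose distance falls in [lower, upper)."""
    lower, upper = interval
    return [i for i, d in enumerate(distances_mm) if lower <= d < upper]

intervals = distance_intervals()
# Hypothetical distances of three V1 sites from a reference V1 site.
far_sites = sites_in_interval([0.0, 2.5, 12.5], intervals[-1])
```

With these parameters the enumeration yields five intervals, matching the five sets of V1 sites described for each reference site.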
Top-down Beta Influences Lead Bottom-up Gamma Influences in Time
We have established that spontaneous fluctuations in endogenous top-down beta GC are correlated with fluctuations in stimulus-driven bottom-up gamma GC. To investigate whether the data contain evidence in support of a causal relation, we assessed whether top-down beta GC is predictive of subsequent bottom-up gamma GC. To accomplish this, we extended the JC by adding a temporal dimension not dissimilar from time-lagged cross-correlation. We compute the JC on time-frequency data, where we systematically offset the data by positive or negative lags. We call this procedure lagged jackknife correlation (LJC). This quantifies at what time delay between the top-down beta-band influence and the bottom-up gamma-band influence the JC between them is largest. We computed LJC for each triplet. Fig. 6 shows the LJC averaged over triplets and monkeys and exhibits a peak at 0.105 s indicating that top-down beta GC leads bottom-up gamma GC by 0.105 s (t(10663) = −7.576, p<<0.001, two-tailed jackknife-based t-test).
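The idea of scanning over positive and negative lags can be illustrated with an ordinary lagged correlation between two time courses. This is a deliberately simplified sketch: it omits the jackknife step and the time-frequency decomposition of the LJC, and the function names, the sign convention (positive lag means the first series leads) and the toy data are all assumptions made for illustration.

```python
import math
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag]; positive lag means x leads y."""
    if lag >= 0:
        xs, ys = x[:len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[:len(y) + lag]
    return pearson(xs, ys)

def peak_lag(x, y, max_lag):
    """Lag (in samples) at which the lagged correlation is largest."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: lagged_correlation(x, y, lag))

# Toy data (hypothetical): y is a copy of x delayed by 3 samples.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0, 0.0, 0.0] + x[:-3]
best = peak_lag(x, y, max_lag=10)
```

In this toy case the peak falls at a lag of 3 samples, i.e. the first series leads the second, which is the pattern reported above for top-down beta GC relative to bottom-up gamma GC.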
DISCUSSION
We used LFP recordings from 252-channel ECoG arrays covering large parts of the left hemispheres of two macaques to analyze the interaction between top-down and bottom-up GC influences. Top-down influences were quantified between area 7a at the top of the visual hierarchy (Bastos, Vezoli, et al., 2015b; Markov et al., 2014) and V1 at the bottom. Bottom-up GC influences were quantified between V1 and V4, a known feedforward pathway carrying stimulus driven input (Bosman et al., 2012). Inter-areal top-down influences showed a beta-band peak, that was independent of visual stimulation and therefore endogenously generated, that was significantly larger in the top-down than the bottom-up direction, and that increased with selective attention. Bottom-up influences showed a gamma-band peak, that was stimulus driven, that was significantly larger in the bottom-up than the top-down direction, and that also increased with selective attention. Jackknife Correlation between top-down beta-band influences and bottom-up gamma-band influences revealed a cross-frequency interaction. This interaction was spatially specific, as it was maximal between top-down and bottom-up inter-areal influences that shared the same V1 site. Finally, top-down beta-band influences best predicted bottom-up gamma-band influences ~100 ms later, suggesting that the cross-frequency interaction is causal.
There are potential concerns related to the influence of noise. Noise can affect GC and it could thereby in principle affect the JC between GC fluctuations that we analyze here. One relevant scenario concerns noise that is shared between two signals, which can lead to artifactual GC (Nalatore et al., 2007; Vinck et al., 2015). Shared noise is typically due to volume conduction, which is strongly attenuated in our signals due to the bipolar derivation (Trongnetrpunya et al., 2015). Furthermore, artifactual GC alone does not lead to JC between GC, because the latter requires correlated GC fluctuations. GC fluctuations could in principle be due to fluctuations in shared noise. However, such noise influences on the two GC metrics would occur simultaneously and therefore, the lagged JC would peak at zero lag, whereas we found a lag of ~100 ms. Other influences of noise can be envisaged, and with sufficiently complex assumptions on multiple noise sources, essentially any correlation can be explained. The set of observations presented here, namely the fact that GC is correlated between different frequency bands, that this correlation shows high spatial specificity and in particular that the peak correlation occurs at a lag, would require assumptions that appear extremely implausible.
Whereas noise is unlikely to explain the overall pattern of results, it might well influence the magnitude of observed correlations. The magnitude of correlations might be relevant to assess their functional significance. The JC for some triplets was at the level of rho ≈ 0.2, i.e. of moderate strength. However, the mean JC over all triplets tended towards a value an order of magnitude lower. This relatively low magnitude might be explained by uncorrelated noise. Uncorrelated physiological, measurement, and estimation noise will invariably lead to an underestimation of the true JC. These detrimental effects of noise are maximal for short data epochs and for correlations based on single trials, i.e. JC. Short data epochs, necessary to capture rapid fluctuations as the basis of our JC and LJC results, do not provide for the averaging out of stochastic physiological or measurement noise (Richter et al., 2015). JC across single trials, necessary for maximal sensitivity and fidelity, similarly reduces the correlation magnitude as compared to alternative approaches like sorting-and-binning that artificially inflate correlation magnitude (Richter et al., 2015).
Numerous studies in visual cortex have reported gamma-band synchronization within and between visual areas (Bichot, Rossi, & Desimone, 2005; Engel, Kreiter, König, & Singer, 1991; Fries, 2001; Fries, Roelfsema, Engel, König, & Singer, 1997; Gray & Singer, 1989; Hoogenboom, Schoffelen, Oostenveld, Parkes, & Fries, 2006; Kreiter & Singer, 1996; Tallon-Baudry, Bertrand, Delpuech, & Pernier, 1996; Taylor, Mandon, Freiwald, & Kreiter, 2005; Womelsdorf, Fries, Mitra, & Desimone, 2006; Wyart & Tallon-Baudry, 2008), and numerous studies in parietal cortex have reported beta-band synchronization within parietal areas and between parietal and frontal areas (Buschman & Miller, 2007; Dotson, Salazar, & Gray, 2014; Salazar, Dotson, Bressler, & Gray, 2012; Stetson & Andersen, 2014). Recent ECoG recordings covering both visual and parietal areas revealed that inter-areal beta-band influences predominate in the top-down and inter-areal gamma-band influences predominate in the bottom-up direction (Bastos, Vezoli, et al., 2015b). These findings link parietal beta-band activity with visual gamma-band activity and suggest a concrete case of cross-frequency interaction (Bressler & Richter, 2015). In the present paper, we have tested some of the resulting predictions and found direct experimental support for such a cross-frequency interaction that allows top-down beta-band influences to enhance bottom-up gamma-band influences.
Cortical anatomy has revealed a distinct laminar pattern of top-down and bottom-up projections (Felleman & Van Essen, 1991; Markov et al., 2014). Bottom-up projections originate predominantly in superficial layers, and this predominance increases with the number of hierarchical levels bridged by the bottom-up projection. Furthermore, bottom-up projections terminate predominantly in layer 4. Top-down projections originate predominantly in deep layers, and this predominance increases with the number of hierarchical levels bridged by the top-down projection. Furthermore, top-down projections terminate predominantly outside layer 4, primarily in layers 1 and 6. Determining how the respective top-down influences interact with local processing and thereby ultimately with bottom-up influences remains a central neuroscientific quest. One potential mechanism has been proposed in a model that entails details of both layer-specific anatomy and cellular biophysics (Lee, Whittington, & Kopell, 2013), and that replicates effects of top-down selective attention on bottom-up gamma-band coherence. The model implicates a subclass of inhibitory interneurons, the slow-inhibitory (SI) interneurons, as targets of top-down modulation. These cells may span multiple cortical laminae and thus are suitably situated for integration of neuronal activity across layers. A subpopulation of these cells, low-threshold spiking (LTS) cells, are found in deep layers of the cortex and are: 1) hypothesized to receive top-down input, 2) implicated in the generation of beta oscillations and in the resonant response to beta-rhythmic top-down input and 3) selectively modulate gamma band activation in layer 2/3, leading to an enhanced gamma band output. Our present analysis confirms the central prediction of the Lee et al. (2013) paper, namely that specifically top-down beta-band influences enhance stimulus-driven gamma-band processes. Lee et al.
show how this mechanism can support the implementation of attentional stimulus selection. The current results, which mechanistically link the previously reported attentional enhancements of top-down beta and bottom-up gamma influences, provide the hitherto missing experimental bridge. Together, experiments, modeling and model-testing data analysis have led to an intriguingly coherent understanding of the neuronal processes behind the implementation of attentional stimulus selection.
METHODS
Visual Stimulation and Behavioral Task
The experiment was approved by the ethics committee of the Radboud University Nijmegen (Nijmegen, The Netherlands). Two adult male macaque monkeys (monkey K and monkey P, both Macaca mulatta) were used in this study. During experiments, monkeys were placed in a dimly lit booth facing a CRT monitor (120 Hz non-interlaced). When they touched a bar, a fixation point was presented, and gaze had to remain within the fixation window throughout the trial (monkey K: 0.85 deg radius, monkey P: 1 deg radius), otherwise the trial would be terminated and a new trial would commence. Once central fixation had been achieved and a subsequent 0.8 s pre-stimulus interval had elapsed, two isoluminant and isoeccentric drifting sinusoidal gratings were presented, one in each visual hemifield (diameter: 3 deg, spatial frequency: ~1 cycle/deg, drift velocity: ~1 deg/s, resulting temporal frequency: ~1 cycle/s, contrast: 100%). Blue and yellow tints were randomly assigned to each of the gratings on each trial (Fig. 1a). Following a random delay interval (monkey K: 1–1.5 s; monkey P: 0.8–1.3 s), the central fixation point changed color to match one of the drifting gratings, indicating that this grating was the target stimulus, i.e. the fixation point color was the attentional cue. When the target stimulus was positioned in the visual hemifield contralateral to the recorded hemisphere, we refer to this condition as attend contra, whereas when the target was in the ipsilateral hemifield with respect to the ECoG grid, this condition is labeled attend ipsi. Either the target or distracter stimulus could undergo a subtle change in shape consisting of a transient bending of the bars of the grating (0.15 s duration of the full bending cycle). This change could occur at any monitor refresh from 0.75 s to 5 s (monkey K) or 4 s (monkey P) after stimulus onset. Bar releases within 0.15–0.5 s after target changes were rewarded.
If stimulus changes occurred before the cue indicated which stimulus was the target, reports were rewarded in a random half of trials. Bar releases after distracter changes terminated the trial without reward. Trials were pooled from both contra and ipsi conditions, except where explicit comparisons of these conditions were made.
Neurophysiological Recordings
LFP recordings were made via a 252-channel electrocorticographic grid (ECoG) implanted subdurally over the left hemisphere (Rubehn, Bosman, Oostenveld, Fries, & Stieglitz, 2009). Data from the same animals, partly overlapping with the data used here, have been used in several previous studies (Bastos, Litvak, et al., 2015a; Bastos, Vezoli, et al., 2015b; Bosman et al., 2012; Brunet et al., 2014; 2015; Lewis, Bosman, Womelsdorf, & Fries, 2016; Pinotsis et al., 2014; Richter et al., 2015). Recordings were sampled at approximately 32 kHz with a passband of 0.159 – 8000 Hz using a Neuralynx Digital Lynx system. The raw recordings were low-pass filtered to 250 Hz, and downsampled to 1 kHz. The electrodes were distributed over eight 32-channel headstages, and referenced against a silver wire implanted onto the dura overlying the opposite hemisphere. The electrodes were re-referenced via a bipolar scheme to achieve 1) greater signal localization, 2) cancellation of the common reference, which could corrupt the validity of connectivity metrics, and 3) rejection of headstage-specific noise. The bipolar derivation scheme subtracted the recordings from neighboring electrodes (spaced 2.5 mm) that shared a headstage, resulting in 218 bipolar derivations, henceforth referred to as “sites” (see Bastos et al. (2015b) for a detailed discussion of the re-referencing procedure). The site locations are shown as spheres in Fig. 1b (monkey K: white, monkey P: black).
Three ROIs were selected for the current study: V1, V4, and area 7A (referred to simply as “7A”). ROIs were defined based on comparison of the electrode locations (co-registered to each monkey’s anatomical MRI and warped to the F99 template brain in CARET (Van Essen, 2012)) with multiple cortical atlases of the macaque (see Bastos et al. (2015b) for a detailed discussion). Recording sites composing each ROI were co-registered to a common template (INIA19, (Rohlfing et al., 2012)), as were the Paxinos ROI definitions (Paxinos, Huang, & Toga, 1999). The V1/V2 combined definition of Paxinos et al. (1999) is shown in Fig. 1b, 3g (red) for simplicity due to uncertainty across atlases of the V1/V2 border, though recording site selection was based on multiple atlases with no recording sites selected that were believed to belong to area V2. Based on these ROI definitions, 77 recording sites were selected from area V1 (monkey K: 29, monkey P: 48), 31 from area V4 (monkey K: 17, monkey P: 14), and 18 from area 7A (monkey K: 8, monkey P: 10).
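The site counts per ROI determine the number of 7A→V1→V4 triplets analyzed in the Results. The short check below assumes that a triplet is any combination of one 7A site, one V1 site and one V4 site within a monkey; under that assumption the products reproduce the triplet counts reported for each monkey. The variable and function names are illustrative only.

```python
# Site counts per ROI and monkey, as reported above.
site_counts = {
    "K": {"7A": 8, "V1": 29, "V4": 17},
    "P": {"7A": 10, "V1": 48, "V4": 14},
}

def n_triplets(counts):
    # Every 7A site is combined with every V1 site and every V4 site.
    return counts["7A"] * counts["V1"] * counts["V4"]

triplets = {monkey: n_triplets(c) for monkey, c in site_counts.items()}
total = sum(triplets.values())
```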
Preprocessing and Spectral Analysis General
Signal processing was conducted using the FieldTrip toolbox (Oostenveld, Fries, Maris, & Schoffelen, 2011). The raw data was line noise rejected via the subtraction of 50, 100, and 150 Hz components fit to the data using a discrete Fourier transform. Following trial epoching, specific to each analysis, epochs for each site were de-meaned. Epochs exceeding 5 standard deviations of all data from the same site in the same session were rejected. In addition, epochs were manually inspected and epochs with artifacts were rejected. The remaining epochs were normalized by the variance across all data in all epochs from the same site in the same recording session. Subsequently, all epochs were combined across sessions.
Spectral analysis was performed via the fast Fourier transform (FFT) on 0.5 s epochs. For frequencies from 0-50 Hz, a Hann taper was utilized, whereas for frequencies above 50 Hz, the multitaper method (MTM) was used to improve the spectral concentration of the gamma rhythm (Percival & Walden, 1993; Thomson, 1982). We applied 5 tapers, resulting in a spectral smoothing of +/− 6 Hz. All epochs were zero-padded to 1 s resulting in a spectral resolution of 1 Hz. The coefficients resulting from the FFT were used to determine the cross spectral density, which is the basis for two connectivity metrics employed: pairwise phase consistency (PPC) (Vinck et al., 2010), and Granger causality (GC) (Bressler & Seth, 2011; Granger, 1969). When GC is computed from the cross-spectral density, this is known as a non-parametric approach in contrast to the traditional parametric method based on autoregressive modeling (Dhamala, Rangarajan, & Ding, 2008). Connectivity metrics were computed between all inter-areal pairings of sites between ROIs: V1-V4, and V1-7A.
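The relation between epoch length, number of tapers and spectral smoothing can be checked with a short calculation. The sketch assumes the common multitaper convention that the number of tapers K, the epoch length T and the half-bandwidth W satisfy K = 2TW − 1; the function names are hypothetical.

```python
def half_bandwidth_hz(epoch_s, n_tapers):
    # Common convention: n_tapers = 2 * T * W - 1, solved here for W.
    return (n_tapers + 1) / (2.0 * epoch_s)

def frequency_spacing_hz(padded_s):
    # Zero-padding to `padded_s` seconds sets the spacing of FFT frequency bins.
    return 1.0 / padded_s

smoothing = half_bandwidth_hz(epoch_s=0.5, n_tapers=5)
spacing = frequency_spacing_hz(padded_s=1.0)
```

Under this convention, 5 tapers on 0.5 s epochs correspond to ±6 Hz smoothing, and zero-padding to 1 s gives frequency bins spaced 1 Hz apart, in line with the values stated above.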
High Resolution Spectral Comparisons
For the analyses of Fig. 2 and Fig. 3, we used all 0.5 s epochs that could be defined with 60% overlap. This overlap allows for the application of Welch’s method (Welch, 1967) and was selected as an optimal overlap for the multitaper method, while maintaining a reasonable computational load (Percival & Walden, 1993; Thomson, 1977). For the analysis of stimulation and fixation periods, each trial was segmented into a fixation and stimulation segment. The fixation period was defined as the 0.5 s prior to stimulus onset, while the stimulation period was defined from 0.3 s post-stimulus onset (to avoid onset related transients) until the first of three possible events: 1) the onset of the cue, 2) a change in the distracter stimulus, or 3) a change in the target stimulus. This resulted in 6822 fixation epochs (Monkey K: 3384, Monkey P: 3438) and 13675 stimulation epochs (Monkey K: 8109, Monkey P: 5566). PSDs (Fig. 2a-f) were divided by 1/f to reduce the 1/f component. Fixation versus stimulation was only compared for PPC, which inherently controls for sample size bias, thus no adjustments needed to be made to the large disparity in the number of trials between the fixation and stimulation conditions. Statistical assessment of the within-subject difference between fixation and stimulation conditions employed a non-parametric technique (Maris & Oostenveld, 2007) that inherently controls for multiple comparisons, where epochs were pooled and then randomly distributed between the fixation and stimulation conditions. FFT was then performed, followed by PPC. The difference between conditions was then computed, with the maximum absolute difference across the frequencies retained. This critical step ensures that the statistical test is corrected for multiple comparisons across the frequency dimension. The procedure was repeated 1000 times, forming a distribution of surrogate values against which the empirical values at each frequency may be compared.
Group statistics were performed on the peak-aligned data by first averaging over all site pairs within-subject, and then averaging these results across the monkeys. This ensured that each monkey contributed equally to the overall result, despite having different numbers of site pairs. This averaging procedure was performed for each permutation, followed by the same max-based multiple-comparison correction. This gives rise to a surrogate distribution of data against which the group data may be assessed. Multiple comparisons correction was performed across all frequencies tested (1-45 Hz and 55-100 Hz in steps of 1 Hz).
To compare attention to the hemifield contralateral (“attend contra”) versus ipsilateral (“attend ipsi”) to the recorded hemisphere, we defined the post-cue period as the time from 0.3 s after cue presentation (to avoid cue presentation transients) until the first change in either the target or distracter stimulus. This period was segmented into as many 0.5 s epochs as possible with 60% overlap. This resulted in 8313 attend contra epochs (Monkey K: 3819, Monkey P: 4494) and 7899 attend ipsi epochs (Monkey K: 3456, Monkey P: 4443), i.e. a total of 16212 attend epochs (Monkey K: 7275, Monkey P: 8937). Non-parametric GC is known to be biased by sample size (Bastos & Schoffelen, 2015), so the number of epochs per attention condition needed to be balanced for each monkey. This was accomplished by finding the condition with the fewest epochs, and randomly selecting this number of epochs from the other condition. The statistical difference between conditions was also assessed using a non-parametric statistical framework as described for the fixation versus stimulation contrast.
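The segmentation and balancing steps can be illustrated with a short sketch. The helper names `epoch_starts` and `balance_conditions` are hypothetical, and lengths are expressed in samples under the assumption of a 1 kHz sampling rate (so that a 0.5 s epoch spans 500 samples); neither detail is taken from the analysis code.

```python
import numpy as np

def epoch_starts(n_samples, epoch_len, overlap):
    """Start indices of all full epochs that fit into a segment.

    With epoch_len=500 and overlap=0.6, successive epochs are shifted
    by 200 samples (40% of the epoch length).
    """
    step = int(round(epoch_len * (1.0 - overlap)))
    return np.arange(0, n_samples - epoch_len + 1, step)

def balance_conditions(n_a, n_b, seed=0):
    """Random epoch indices that equalize the epoch count of two conditions.

    The condition with fewer epochs keeps all of them; the other is
    randomly subsampled (without replacement) to the same count.
    """
    rng = np.random.default_rng(seed)
    n_min = min(n_a, n_b)
    idx_a = np.sort(rng.choice(n_a, size=n_min, replace=False))
    idx_b = np.sort(rng.choice(n_b, size=n_min, replace=False))
    return idx_a, idx_b
```

For example, `balance_conditions(3819, 3456)` would keep all 3456 attend ipsi epochs of Monkey K and draw a random subset of 3456 from the 3819 attend contra epochs.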
The comparison of top-down versus bottom-up GC was performed on the pooled data from the attend contra and ipsi conditions. Since this was a within-condition comparison, no balancing of epoch numbers was needed, and all epochs from both attend conditions were used. The statistical analysis of the difference between top-down and bottom-up GC could not be obtained using a non-parametric randomization framework, because top-down and bottom-up GC are not properties of specific sets of epochs, but rather are expressed by all trials simultaneously. Therefore, an alternative statistical approach was used, namely the bootstrap (Efron & Tibshirani, 1994). As with the randomization approach, the statistic of interest – in this case the top-down/bottom-up GC difference – is recomputed on each bootstrap resample, giving rise to a distribution of surrogate values. Following Efron and Tibshirani (1994), a confidence interval can be constructed from the surrogate distribution. To assess the statistical significance at p=0.05 (two-tailed), we find the 2.5th and 97.5th percentile values from the surrogate distribution of differences between top-down and bottom-up GC. This naturally forms the 95% confidence interval such that if zero lies outside of this interval, we may conclude that the result is significant at a level of p=0.05. This method does not control for multiple comparisons, but it can easily be modified to do so using the same logic employed by Maris and Oostenveld (2007). We performed 1000 bootstrap resamples. For each resample, we determined the absolute difference across frequencies between the bootstrap resample spectrum and the average of all bootstrap resamples, and retained the maximum of this value across frequencies. Thus we are guaranteed to form the largest confidence interval possible across frequencies and in so doing construct an omnibus confidence interval that controls for the multiple comparisons.
This confidence interval is applied to each frequency, and where it does not contain zero, the result is significant at p=0.05. To conduct group-level statistics, the omnibus statistic is derived from the mean of each bootstrap resample of the difference between top-down and bottom-up spectra across both monkeys (first averaged within-subject across pairs), such that the mean of the empirical difference across the monkeys can be assessed for significance.
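A minimal sketch of the omnibus bootstrap confidence interval follows. It makes several assumptions that are not stated in the text: the function name `omnibus_bootstrap_ci` and the callable `stat_fn` are hypothetical, `stat_fn` stands in for the recomputation of the top-down minus bottom-up GC difference spectrum from a resampled set of epochs, and the interval is centered on the mean of the bootstrap resamples.

```python
import numpy as np

def omnibus_bootstrap_ci(epochs, stat_fn, n_boot=1000, alpha=0.05, seed=0):
    """Bootstrap confidence interval corrected across frequencies.

    epochs:  array of shape (n_epochs, ...), resampled along axis 0.
    stat_fn: maps a resampled epochs array to a 1-D spectrum (placeholder
             for the top-down/bottom-up GC difference).
    """
    rng = np.random.default_rng(seed)
    n = epochs.shape[0]
    # Resample epochs with replacement and recompute the statistic each time
    boot = np.stack([stat_fn(epochs[rng.integers(0, n, size=n)])
                     for _ in range(n_boot)])
    center = boot.mean(axis=0)
    # Per resample: maximum absolute deviation from the bootstrap mean
    max_dev = np.abs(boot - center).max(axis=1)
    # One half-width for all frequencies (assumed centering on the bootstrap mean)
    half_width = np.quantile(max_dev, 1.0 - alpha)
    lower, upper = center - half_width, center + half_width
    significant = (lower > 0) | (upper < 0)  # interval excludes zero
    return lower, upper, significant
```

Using a single half-width derived from the per-resample maxima makes the interval equally wide at every frequency, which is what allows it to be read as an omnibus interval.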
Jackknife Correlation
We aimed at quantifying the correlation between moment-by-moment co-fluctuations in two GC influences. This is normally precluded by the fact that GC influences are not defined per single data epoch (without substantially sacrificing spectral resolution and/or signal-to-noise ratio). Therefore, we used the Jackknife Correlation (JC), which quantifies the correlation by first calculating GC influences for all leave-one-out subsamples (i.e. the jackknife replications of all epochs) and then correlating these values (Richter et al., 2015). For each leave-one-out subsample, the GC or any other smooth function F of the data can be defined as follows: $F_{x,j} = F(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n)$, where x specifies the recording site and j specifies the index of the left-out observation, here the epoch. JC requires independent epochs. Thus, we followed the same segmentation strategy as for the comparison of the attend conditions, but with zero overlap, which resulted in fewer epochs, totaling 6414 (Monkey K: 2655, Monkey P: 3759). Contra and ipsi conditions were combined for the JC analysis. The JC is defined using the following formula: $JC_{F_x,F_y} = \frac{1}{n-1} \sum_{j=1}^{n} \left( \frac{F_{x,j} - \bar{F}_x}{s_{F_x}} \right) \left( \frac{F_{y,j} - \bar{F}_y}{s_{F_y}} \right)$, where n is defined as the number of jackknife replications and is equal to the total number of epochs, $F_{x,j}$ and $F_{y,j}$ are the jackknife replications, $\bar{F}_x$ and $\bar{F}_y$ are the means of the jackknife replications, and $s_{F_x}$ and $s_{F_y}$ are the standard deviations of the jackknife replications. To use the JC with the Spearman correlation metric, we applied the above formula on the ranks of $F_{x,j}$ and $F_{y,j}$.
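The JC procedure can be sketched in a few lines. This is an illustration under assumptions rather than the implementation used here: the helper names are hypothetical, `func` is a placeholder for the GC computation (any smooth function of the epochs would serve), and the rank transform ignores ties, which is adequate only for continuous-valued replications.

```python
import numpy as np

def jackknife_replications(epochs, func):
    """Evaluate func on every leave-one-out subsample of the epochs.

    epochs: array of shape (n_epochs, ...); func maps such an array to a scalar.
    """
    n = epochs.shape[0]
    return np.array([func(np.delete(epochs, j, axis=0)) for j in range(n)])

def jackknife_correlation(fx, fy, spearman=False):
    """Correlation across jackknife replications of two quantities."""
    fx = np.asarray(fx, dtype=float)
    fy = np.asarray(fy, dtype=float)
    if spearman:
        # Spearman variant: apply the same formula to the ranks (ties not averaged)
        fx = np.argsort(np.argsort(fx)).astype(float)
        fy = np.argsort(np.argsort(fy)).astype(float)
    n = fx.size
    zx = (fx - fx.mean()) / fx.std(ddof=1)
    zy = (fy - fy.mean()) / fy.std(ddof=1)
    return float(np.sum(zx * zy) / (n - 1))
```

For a function that is a simple average over epochs, each leave-one-out replication is a linear function of the left-out epoch's value, so the JC equals the ordinary correlation of the single-epoch values. This makes a convenient sanity check before applying the procedure to a metric such as GC, for which single-epoch values are not available.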
For statistical testing, we created a distribution of 1000 JC values under a realization of the null hypothesis of independence between 7A→V1 and V1→V4 GC influences. This was realized by calculating JC between randomly permuted jackknife replications of 7A→V1 and V1→V4 GC influences. This is equivalent to calculating the JC between GC influences after leaving out a random epoch for the 7A→V1 GC and a random epoch for the V1→V4 GC without replacement. To control for multiple comparisons across the frequency-frequency combinations, the max-based approach (see above) was again employed, where for each permutation the maximum absolute Spearman’s rho value was selected, giving rise to an omnibus distribution of surrogate correlation coefficients for each triplet. For maps showing the average correlation across triplets, this max-based method was performed on the mean over triplets, where for a given permutation each triplet was randomized in the same order as all others. This was done individually per monkey and after averaging across the two monkeys. When monkeys were combined, spectra were aligned to the beta- and gamma-peak frequencies and averages were first taken across all triplets of each monkey to weight both monkeys equally.
For testing spatial specificity, we analyzed recording site triplets, which did not share the same V1 site (see main text for details): 7A→V1aV1b→V4. Since a vast number of such triplets exist, yet we wished to select a number equal to the original number of triplets to control potential statistical bias, we selected a unique number of 7A→V1aV1b→V4 triplets that matched the original number of 7A→V1→V4 triplets evaluated for each monkey and then performed the same JC procedure. To smooth the result statistically (exploring the large space of possibilities), we repeated this procedure 100 times, resulting in unique selections of V1 sites for each distance interval, and averaged the outcomes. Results were plotted against the average distance obtained for each distance interval.
Lagged Jackknife Correlation (LJC)
We used the jackknife correlation (JC) to quantify the correlation between top-down beta GC and bottom-up gamma GC. To this end, we left out one data epoch at a time, calculated GC influences, and determined the correlation between the resulting GC influences across all leave-one-out (jackknife) replications. Since a given jackknife replication eliminated the same epoch for the calculation of both GC influences, this established the correlation at zero time lag. Next, we were interested in whether the correlation depends on the time lag. To test this, we computed JC between GC influences calculated from epochs offset by a variable lag. The epochs were stepped at intervals, t, of 5 ms. The offsets were stepped at intervals, τ, of 5 ms. Note that stepping of intervals and offsets was in principle independent and could have been different, but it was chosen to be identical to speed up computation. We refer to this as lagged JC (LJC). The lag τ was chosen to cover a range from −500 ms to 500 ms. The GC calculation itself was as in the previous zero-lag JC, using 500 ms and the tapering specified above. We used data from 0.3 s post-cue to 2 s post-cue, eliminating shorter trials so that longer lags could be tested (878 trials used, Monkey K: 398, Monkey P: 480). LJC was calculated across trials, i.e. leaving out an entire trial at a time (this is different from the previous zero-lag JC, which used multiple non-overlapping epochs per trial if available). The data that was available per trial allowed for multiple realizations of the two epochs with a particular lag. For each lag, LJC was calculated separately for all possible realizations and averaged. The number of possible realizations decreases as the lag between top-down beta GC and bottom-up gamma GC increases, resulting in fewer LJC computations that are averaged. This results in a noisier estimate at larger lags, but no systematic bias in the mean JC value.
The number of epochs that each LJC is computed upon always equals the number of trials. Formally, this implementation of the LJC is defined as: , where m is the number of 500 ms windows, stepped at 5 ms, that fitted into the trial length of 1.7 s.
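The lagged procedure can be illustrated with the sketch below. It is not the implementation used here and rests on several assumptions: the inputs `fx` and `fy` are hypothetical arrays of precomputed leave-one-trial-out GC values (one row per left-out trial, one column per window position), lags are expressed in window steps of 5 ms, and the sign convention (a positive lag pairs each `fx` window with a later `fy` window) is chosen for illustration only.

```python
import numpy as np

def _ranks(a):
    """Ordinal ranks of a 1-D array (ties not averaged)."""
    return np.argsort(np.argsort(a)).astype(float)

def lagged_jc(fx, fy, lag):
    """Average Spearman JC over all window pairs separated by `lag` steps.

    fx, fy: arrays of shape (n_trials, n_windows) with jackknife replications.
    lag:    offset in window steps; fx at window t is paired with fy at t + lag.
    """
    m = fx.shape[1]
    if abs(lag) >= m:
        raise ValueError("lag exceeds the number of available windows")
    # Fewer realizations are available as the absolute lag grows
    t_start, t_stop = max(0, -lag), min(m, m - lag)
    rhos = [np.corrcoef(_ranks(fx[:, t]), _ranks(fy[:, t + lag]))[0, 1]
            for t in range(t_start, t_stop)]
    return float(np.mean(rhos))

# 500 ms windows stepped at 5 ms within the 1.7 s segment (0.3 s to 2 s post-cue)
n_windows = (1700 - 500) // 5 + 1
# Lags from -500 ms to 500 ms in 5 ms steps, expressed in window steps
lag_steps = np.arange(-100, 101)
```

Evaluating `lagged_jc` for every entry of `lag_steps` yields the correlation as a function of lag. Each value is computed across the same number of trials, while the number of averaged window pairs shrinks with increasing absolute lag.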
Statistical significance was assessed using the same logic as used for the JC, where the epoch order of one member of the JC was permuted with respect to the other. For the LJC, the permutation was identical for each time step and lag, to be conservative. Multiple comparison correction must take place over the multiple lags, which is achieved by taking the maximum absolute Spearman’s rho value across lags for each permutation. The resulting distribution is used to assess the probability that the empirical result at each lag occurred by chance. The empirical and the permutation metrics were first averaged over all triplets per monkey and then averaged over the two monkeys, to give equal weight to both subjects.
We wished to assess whether the LJC peak lag of −105 ms was significantly different from a lag of zero. We did so using a jackknife method to determine the standard error of the peak position in milliseconds (Efron, 1981). In this case, we leave out a specific triplet to assess the variability of the peak. The jackknife procedure causes a compression of the variance (Richter et al., 2015), so the 5 ms sampling grid would not be sufficient to represent the peak positions of the jackknife replications. To account for this, we interpolated each replication with a cubic spline to a resolution of 0.001 ms, which proved adequate to represent the variance of the peak. The peak of each jackknife replication was found using a Gaussian fit of the smoothed correlation as a function of lag (findpeaksG.m by T. C. O’Haver). We then derived the standard error of the estimator, and converted this to a t-score by dividing the mean peak lag value of the jackknife replications by the estimated standard error. The significance of this t-value was then assessed against Student’s t-distribution. At the group level, this procedure entails concatenating the data from both monkeys, and leaving out each triplet once. Based on this group estimate of the standard error, a t-value is derived, as above, and assessed for statistical significance.
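The standard error and t-score computation can be sketched as follows, assuming the standard jackknife variance formula, in which the summed squared deviations of the replications are scaled by (n − 1)/n to undo the variance compression. The function name `jackknife_peak_t` is hypothetical, and the input is taken to be the peak lags (in ms) already extracted from each leave-one-triplet-out replication; the spline interpolation and Gaussian fit are not reproduced.

```python
import numpy as np

def jackknife_peak_t(peak_lags):
    """Jackknife standard error and t-score of the mean peak lag.

    peak_lags: one peak position (ms) per leave-one-triplet-out replication.
    """
    theta = np.asarray(peak_lags, dtype=float)
    n = theta.size
    # The (n - 1) / n factor corrects for the reduced spread of replications
    se = np.sqrt((n - 1) / n * np.sum((theta - theta.mean()) ** 2))
    # t-score: mean peak lag of the replications divided by its standard error
    t_score = theta.mean() / se
    return float(se), float(t_score)
```

The resulting t-score would then be compared against Student’s t-distribution, as described above.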
AUTHOR CONTRIBUTIONS
Conceptualization, C.G.R., W.H.T., C.A.B., and P.F.; Methodology, C.G.R., W.H.T., C.A.B., and P.F.; Software, C.G.R. and W.H.T.; Formal Analysis, C.G.R. and W.H.T.; Investigation, C.A.B. and P.F.; Writing – Original Draft, C.G.R. and P.F.; Writing – Review & Editing, C.G.R., W.H.T., C.A.B., and P.F.; Visualization, C.G.R.; Supervision, P.F.; Funding Acquisition, P.F.
ACKNOWLEDGMENTS
We would like to thank J. Vezoli for his assistance in ROI definition, co-registration and construction of anatomical renderings. This work was supported by DFG (SPP 1665, FOR 1847, FR2557/5-1-CORNET), EU (HEALTH-F2-2008-200728-BrainSynch, FP7-604102-HBP), a European Young Investigator Award, National Institutes of Health (1U54MH091657-WU-Minn-Consortium-HCP), the LOEWE program (NeFF) and an NWO-FLAG-ERA Joint Transnational Call 2015 (CAB).
Footnotes
5 Co-first authors