Abstract
Our behavior entails a flexible and context-sensitive interplay between brain areas to integrate information according to goal-directed requirements. However, the neural mechanisms governing the entrainment of functionally specialized brain areas remain poorly understood. In particular, it is unclear whether observed changes in regional activity across cognitive conditions are explained by modifications of the inputs or of the recurrent connectivity. We observe that fMRI transitions over successive time points convey information about the task performed by 19 subjects, namely watching a movie as opposed to a black screen (rest). We use a theoretical framework that decomposes this spatiotemporal functional connectivity pattern into local variability received by the 66 cortical regions and recurrent effective connectivity between them. We find that, among the estimated model parameters, movie viewing affects the local excitabilities to a larger extent, which we interpret as extrinsic changes related to the increased stimulus load. However, detailed changes in the effective connectivity preserve a balance in the propagating activity and select specific pathways so as to integrate visual and auditory information to high-level brain regions and across the two brain hemispheres. These findings speak to a dynamic coordination underlying the functional integration in the brain.
1 Introduction
The brain comprises a large number of functionally distinct areas in which information and computational processes are both segregated and integrated [1, 2]. A fundamental question in systems neuroscience is how information can be processed in a distributed fashion by the neuronal architecture. Brain regions exhibit a high degree of functional diversity, with a massive number of connections that coordinate their activity. Accordingly, empirical evidence from functional magnetic resonance imaging (fMRI), electro-encephalography (EEG) and magneto-encephalography (MEG) in humans, as well as cell recordings in animals, supports the notion that brain functions involve multiple brain areas [3]. Long-range synchronization of brain activity has been proposed as a dynamical mechanism for mediating the interactions between distant neuronal populations at the cellular level [4, 5], as well as within large-scale cortical subnetworks both at rest [6, 7, 8, 9] and when performing a task [10, 11].
Depending on the task, cortical dynamics reshape the global pattern of correlated activity observed using neuroimaging - denoted by functional connectivity (FC) [9, 12]. Presumably, both sensory-driven and cognitive-driven processes are involved in shaping FC from its resting state [13, 14]. Recently, the temporal aspect of fMRI signals has been much studied - in relation to tasks performed by subjects or their behavioral conditions - via the concept of ‘dynamic FC’ that evaluates the fluctuations of fMRI correlation patterns over time [15], the fractal aspect of fMRI time series [16, 17] or transitions in fMRI activity across successive TRs [18]. The present study builds upon a recently developed whole-cortex dynamic model [19], which extracts this functionally relevant information via the BOLD covariances evaluated with and without time shifts, which relate to BOLD transition statistics.
The proposed modeling allows us to examine the respective roles played by the local variability of each brain area and the long-range neuro-anatomical projections between them in shaping the cortical communication, which results in the measured FC. We rely on the well-established hypothesis that the activity and coordination of different regions depend on both the local activity and intracortical connectivity [20]. Based on dynamic models for blood oxygen level dependent (BOLD) activity at the level of a cortical region, techniques have been developed to estimate the connectivity strengths: the notion of ‘effective connectivity’ (EC) describes causal pairwise interactions at the network level [21, 22, 23, 24]. The distinction between functional and effective connectivities is crucial here: FC is defined as the statistical dependence between distant neuro-physiological activities, whereas EC is defined as the influence one neural system exerts over another [25]. In the present study, the definition of EC is actually close to its original formulation in neurophysiology [26]: estimated weights in a circuit diagram that replicate observed patterns of functional connectivity. Importantly, the network effective connectivity is inferred for individual links here (1180 connections between 66 cortical regions) and thus forms a directed graph, contrasting with previous studies that directly use structural connectomes as “equivalent EC” [27, 28, 29].
After describing the changes observed in empirical FC between subjects at rest and watching a movie, we examine whether these FC alterations are well captured by the proposed model. Considering the network parameter estimates as fingerprints of the brain dynamics, we seek a mechanistic explanation for the observed FC changes by disentangling significant changes in local variability and in intracortical connectivity.
2 Material and Methods
2.1 Study design for empirical fMRI data during rest and passive movie viewing
As detailed in our previous papers [30, 31], 24 right-handed young, healthy volunteers (15 females, 20-31 years old) participated in the study. They were informed about the experimental procedures, which were approved by the Ethics Committee of the Chieti University, and gave written informed consent. Only 22 participants had recordings for both a resting state with eyes open and a natural viewing condition; 2 subjects with recordings at rest only were discarded. In the resting state, participants fixated a red target with a diameter of 0.3 visual degrees on a black screen. In the natural viewing condition, subjects watched and listened to 30 minutes of the movie ‘The Good, the Bad and the Ugly’ in a window of 24 × 10.2 visual degrees. Visual stimuli were projected on a translucent screen using an LCD projector, and viewed by the participants through a mirror tilted by 45 degrees. Auditory stimuli were delivered using MR-compatible headphones.
2.2 Data acquisition
Functional imaging was performed with a 3T MR scanner (Achieva; Philips Medical Systems, Best, The Netherlands) at the Institute for Advanced Biomedical Technologies in Chieti, Italy. The functional images were obtained using T2*-weighted echo-planar images (EPI) with BOLD contrast using SENSE imaging. EPIs comprised 32 axial slices acquired in ascending order and covering the entire brain (230 × 230 in-plane matrix, TR/TE = 2 s/3.5 s, flip angle = 90°, voxel size = 2.875 × 2.875 × 3.5 mm³). For each subject, 2 and 3 scanning runs of 10 minutes duration were acquired for resting state and natural viewing, respectively. Only the first 2 movie scans are used here, to have the same number of time points for the two conditions (i.e., 20 minutes each). Each run included 5 dummy volumes, allowing the MRI signal to reach steady state, and an additional 300 functional volumes that were used for analysis. Eye position was monitored during scanning using a pupil-corneal reflection system at 120 Hz (Iscan, Burlington, MA, USA). A three-dimensional high-resolution T1-weighted image, for anatomical reference, was acquired using an MP-RAGE sequence (TR/TE = 8.1 s/3.7 s, voxel size = 0.938 × 0.938 × 1 mm³) at the end of the scanning session.
2.3 Data processing
Data were preprocessed using SPM8 (Wellcome Department of Cognitive Neurology, London, UK) running under MATLAB (The Mathworks, Natick, MA). The preprocessing steps involved: (1) correction for slice-timing differences, (2) correction of head motion across functional images, (3) coregistration of the anatomical image and the mean functional image, and (4) spatial normalization of all images to a standard stereotaxic space (Montreal Neurological Institute, MNI) with a voxel size of 3 × 3 × 3 mm³. The mean framewise displacement [32] was measured from the fMRI data to estimate head movements. These estimates do not show any significant difference between the rest and movie recordings (p > 0.4). Furthermore, the BOLD time series in MNI space were subjected to spatial independent component analysis (ICA) for the identification and removal of artifacts related to blood pulsation, head movement and instrumental spikes [33]. This BOLD artifact removal procedure was performed by means of the GIFT toolbox (Medical Image Analysis Lab, University of New Mexico). No global signal regression or spatial smoothing was applied. For each recording session (subject and run), we extracted the mean BOLD time series from the N = 66 regions of interest (ROIs) of the brain atlas used in [34]; see Table 1 for the complete list.
2.4 Structural connectivity
Anatomical connectivity was estimated from Diffusion Spectrum Imaging (DSI) data collected in five healthy right-handed male participants [34, 23]. The gray matter was first parcellated into the N = 66 ROIs, using the same low-resolution atlas as for the FC analysis. For each subject, we performed white matter tractography between pairs of cortical areas to estimate a neuro-anatomical connectivity matrix. In our method, the DSI values are only used to determine the skeleton: a binary matrix of structural connectivity (SC) obtained by averaging the matrices over subjects and applying a threshold for the existence of connections. The strengths of individual intracortical connections do not come from DSI values, but are optimized as explained below. It is known that DSI underestimates inter-hemispheric connections [34]. Homotopic connections between mirrored left and right ROIs are important in order to model whole-cortex BOLD activity [29]. Here we add all such possible homotopic connections, which are tuned during the optimization like the other existing connections. This slightly increases the SC density from 27% to 28%.
2.5 Empirical functional connectivity
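For each of the two sessions of 10 minutes of rest and movie, the BOLD time series is denoted by $s_i^t$ for each region $1 \le i \le N$, with time indexed by $1 \le t \le T$ (T = 300 time points separated by a TR = 2 seconds). We denote by $\bar{s}_i$ the mean signal: $\bar{s}_i = \frac{1}{T} \sum_{t} s_i^t$ for all i. Following [19], the empirical FC corresponds to covariances calculated as:
$$\mathrm{FC}^0_{ij} = \frac{1}{T-2} \sum_{1 \le t \le T-1} (s_i^t - \bar{s}_i)(s_j^t - \bar{s}_j), \qquad \mathrm{FC}^1_{ij} = \frac{1}{T-2} \sum_{1 \le t \le T-1} (s_i^t - \bar{s}_i)(s_j^{t+1} - \bar{s}_j). \qquad (1)$$
For each individual and session, we calculate the time constant τx (in units of TR) associated with the exponential decay of the autocovariance averaged over all ROIs:
$$\tau_x = -\frac{N}{\sum_{1 \le i \le N} \left[ \log \mathrm{FC}^1_{ii} - \log \mathrm{FC}^0_{ii} \right]}. \qquad (2)$$
This is used to “calibrate” the model, before its optimization. Similar calculations are done for a time shift of 2 TR.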
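As a rough illustration of the calculations in this subsection, the following Python sketch computes the zero-lag and time-shifted covariance matrices from a (T, N) array of BOLD signals, and then a decay time constant from the log-autocovariances. The function names, the normalization by the number of usable time points minus one, and the two-point log-slope estimate are assumptions made for this example; the toy AR(1) data merely stand in for real recordings.

```python
import numpy as np

def empirical_fc(ts, shift=1):
    """Return zero-lag and time-shifted covariance matrices.

    ts: array of shape (T, N), one column per ROI.
    Element [i, j] of the shifted matrix pairs ROI i at time t
    with ROI j at time t + shift.
    """
    T = ts.shape[0]
    centered = ts - ts.mean(axis=0)   # remove the mean of each ROI
    n = T - shift                     # time points usable for both matrices
    fc0 = centered[:n].T @ centered[:n] / (n - 1)
    fc_shift = centered[:n].T @ centered[shift:] / (n - 1)
    return fc0, fc_shift

def time_constant(fc0, fc_shift, shift=1):
    """Decay time (in TR) from the log-autocovariance slope averaged over ROIs."""
    slopes = (np.log(np.diag(fc_shift)) - np.log(np.diag(fc0))) / shift
    return -1.0 / slopes.mean()

# Toy data (assumption): N independent AR(1) signals with a decay time of 3 TR.
rng = np.random.default_rng(0)
T, N, tau_true = 20000, 5, 3.0
a = np.exp(-1.0 / tau_true)
ts = np.zeros((T, N))
for t in range(1, T):
    ts[t] = a * ts[t - 1] + rng.standard_normal(N)

fc0, fc1 = empirical_fc(ts)
tau_est = time_constant(fc0, fc1)
```

On such synthetic signals the estimate should land close to the decay time used to generate them; with only 300 time points per session, as in the recordings, the estimate is naturally noisier.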
2.6 Model of cortical dynamics
The model uses two sets of parameters to generate the spatiotemporal FC:
the local variability embodied in the matrix Σ, input individually to each of the N = 66 ROIs (see Table 1 for the complete list) or jointly to ROI pairs (only for bilateral CUN, PCAL, ST and TT);
the network effective connectivity between these ROIs embodied by the matrix C, whose skeleton is determined by DSI (see details for structural connectivity above).
The rationale behind the use of spatially cross-correlated inputs (off-diagonal elements of Σ) in the model is to account for common sensory inputs to homotopic visual and auditory ROIs. Ideally, the model should be extended to incorporate subcortical areas and the existence of input cross-correlations should be evaluated for all ROI pairs. However, this level of detail is beyond the scope of the present work and we constrain such input cross-correlations to 4 pairs of ROIs. Another point concerns the use of individual EC skeletons or refinements of SC using graph theory for individual groups [35], but we leave this for later work.
Formally, the network model is a multivariate Ornstein-Uhlenbeck process, where the activity variable $x_i$ of node i decays exponentially with time constant τx - estimated using Eq. (2) - and evolves depending on the activity of other populations:
$$dx_i = \Big( -\frac{x_i}{\tau_x} + \sum_{j \neq i} C_{ij} x_j \Big) dt + dB_i.$$
Here, $dB_i$ is spatially colored noise with covariance matrix Σ, with the variances of the random fluctuations on the diagonal and cross-correlated inputs corresponding to off-diagonal elements for CUN, PCAL, ST and TT (see Table 1). In the model, all variables $x_i$ have zero mean and their spatiotemporal covariances $Q^\tau_{ij}$, where τ indicates the time shift, can be calculated by solving the Lyapunov equation $J Q^0 + Q^0 J^\dagger + \Sigma = 0$ for τ = 0, and then $Q^\tau = Q^0 \mathrm{expm}(J^\dagger \tau)$ for τ > 0. Here J is the Jacobian of the dynamical system and depends on the time constant τx and the network effective connectivity:
$$J_{ij} = -\frac{\delta_{ij}}{\tau_x} + C_{ij},$$
where $\delta_{ij}$ is the Kronecker delta, the superscript † denotes the matrix transpose and expm denotes the matrix exponential. In practice, we use two time shifts: τ = 0 on the one hand and either τ = 1 or 2 TR on the other hand, as this is sufficient to characterize the network parameters.
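The forward step of the model, from parameters to covariances, can be sketched in a few lines of Python with SciPy. This is only an illustrative sketch: the function name and the 3-node toy values of C and Σ are invented for the example, and the Jacobian is assumed to be the connectivity matrix with the leak term −1/τx on its diagonal.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

def model_covariances(C, Sigma, tau_x, shift=1.0):
    """Zero-lag and time-shifted covariances of the Ornstein-Uhlenbeck model."""
    N = C.shape[0]
    J = -np.eye(N) / tau_x + C                  # Jacobian: leak plus connectivity
    # solve_continuous_lyapunov(a, q) solves a X + X a^H = q,
    # so q = -Sigma gives J Q0 + Q0 J^T + Sigma = 0.
    Q0 = solve_continuous_lyapunov(J, -Sigma)
    Q_shift = Q0 @ expm(J.T * shift)            # covariance with time shift
    return J, Q0, Q_shift

# Toy 3-node network (hypothetical values, weak enough for J to be stable).
C = np.array([[0.00, 0.10, 0.00],
              [0.00, 0.00, 0.20],
              [0.05, 0.00, 0.00]])
Sigma = np.diag([1.0, 0.5, 0.8])
J, Q0, Q1 = model_covariances(C, Sigma, tau_x=2.0)
residual = J @ Q0 + Q0 @ J.T + Sigma            # should vanish numerically
```

A useful sanity check is the uncoupled case C = 0, for which the Lyapunov equation reduces to Q0 = Σ τx / 2 and the shifted covariance is Q0 scaled by exp(−τ/τx).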
2.7 Parameter estimation procedure
We tune the model such that its covariance matrices $Q^0$ and $Q^\tau$ reproduce the empirical FC, namely $Q^0 \approx \mathrm{FC}^0$ and $Q^\tau \approx \mathrm{FC}^\tau$, with τ being either 1 or 2 TR. The uniqueness of this estimation follows from the bijective mapping from the model parameters C and Σ to the FC pair (FC0, FC1). Apart from the estimation of input cross-correlations, the essential steps are similar to the iterative optimization procedure described previously [19] to tune the network parameters C and Σ. At each step, the Jacobian J is calculated from the current value of C. Then, the model FC matrices $Q^0$ and $Q^\tau$ are calculated from the consistency equations, using the Bartels-Stewart algorithm to solve the Lyapunov equation. The difference matrices $\Delta Q^0 = \mathrm{FC}^0 - Q^0$ and $\Delta Q^\tau = \mathrm{FC}^\tau - Q^\tau$ determine the model error E, which is the matrix distance between the model and the data observables. The desired Jacobian update is the matrix $\Delta J^\dagger = (Q^0)^{-1} [\Delta Q^0 + \Delta Q^\tau \mathrm{expm}(J^\dagger \tau)]$, which decreases the model error E at each optimization step, similar to a gradient descent. The best fit corresponds to the minimum of E. Finally, the connectivity update is $\Delta C_{ij} = \eta_C \Delta J_{ij}$ for existing connections only; other weights are kept at 0. We also impose non-negativity for the EC values during the optimization. To properly take the effect of cross-correlated inputs into account, we adjust the Σ update from the heuristic update in [19]: $\Delta \Sigma = -\eta_\Sigma (J \Delta Q^0 + \Delta Q^0 J^\dagger)$. As with weights for non-existing connections, Σ elements distinct from the diagonal and cross-correlated inputs are kept equal to 0 at all times. In numerical simulations, we use $\eta_C = 0.0005$ and $\eta_\Sigma = 0.05$.
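Two ingredients of this procedure lend themselves to a short sketch: the model error and the constraints applied to the connectivity after each update. The Python below is a minimal illustration, not the published optimization code. The function names are hypothetical, the error is assumed to be a sum of relative Frobenius distances (the text only specifies a matrix distance), and the Jacobian update itself is taken as a given input rather than recomputed here.

```python
import numpy as np

def model_error(Q0, Q_shift, FC0, FC_shift):
    """Sum of relative Frobenius distances between model and empirical matrices.

    The normalization by the norm of the empirical matrix is an assumption.
    """
    e0 = np.linalg.norm(FC0 - Q0) / np.linalg.norm(FC0)
    e1 = np.linalg.norm(FC_shift - Q_shift) / np.linalg.norm(FC_shift)
    return e0 + e1

def constrained_update(C, delta_J, mask, eta_C=0.0005):
    """Apply Delta C = eta_C * Delta J, then enforce the two constraints.

    mask: boolean (N, N) skeleton, True where a connection exists.
    """
    C_new = C + eta_C * delta_J
    C_new[~mask] = 0.0                      # absent connections stay at zero
    np.clip(C_new, 0.0, None, out=C_new)    # non-negative EC weights
    return C_new

# Toy usage with hypothetical 2-node matrices.
FC0 = np.array([[2.0, 0.5], [0.5, 1.0]])
FC1 = 0.5 * FC0
mask = np.array([[False, True], [False, False]])
delta_J = np.array([[1.0, 2.0], [3.0, 4.0]])
C_new = constrained_update(np.zeros((2, 2)), delta_J, mask, eta_C=0.1)
```

Applying the mask after every step keeps the number of free parameters fixed by the structural skeleton, which is what distinguishes this scheme from an unconstrained autoregressive fit.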
To verify the robustness of the optimization with respect to the choice of ROIs with (spatially) cross-correlated inputs, we compared the tuned models with input cross-correlation for 1) CUN, PCAL, ST and TT; 2) CUN, PCAL, LING, LOCC, ST, TT and MT; 3) none. Although detailed estimates differ, the results presented in this paper are qualitatively observed for all three models. In practice, the model compensates for the absence of input cross-correlations by overestimating the connections between the corresponding ROIs. For simplicity, we only consider such inputs for putative primary sensory ROIs involved in the task here.
The optimization code is available with the empirical data on github.com/MatthieuGilson/EC_estimation. The discarded subjects in the present study are 1, 11 and 19, among the 22 subjects available (numbered from 0 to 21).
2.8 Normalized statistical scores and effective drive (ED)
We define the following z-scores for X being C or Σ with respect to the whole distribution over all connections/ROIs and subjects as
$$z_X = \frac{\mathrm{mean}(X) - l_X}{\mathrm{std}(X)}, \qquad (3)$$
where mean and std correspond to the mean and standard deviation over subjects for the considered matrix element, while $l_X$ is the median of all relevant non-zero elements of C or Σ, as illustrated by the dashed-dotted line in Fig. 4A. We also define the effective drive as $\mathrm{ED}_{ij} = C_{ij} \sqrt{Q^0_{jj}}$, with the corresponding median $l_{ED}$. It measures how the fluctuating activity at region j, with amplitude corresponding to the standard deviation $\sqrt{Q^0_{jj}}$, propagates to region i.
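Both scores are simple array operations, as the following Python sketch suggests. It assumes that the effective drive scales each EC weight by the model standard deviation of the source ROI, and that the z-score is the subject mean minus the global median, divided by the standard deviation over subjects; the function names and toy values are made up for the example.

```python
import numpy as np

def effective_drive(C, Q0):
    """ED[i, j] = C[i, j] * sqrt(Q0[j, j]): weight times source amplitude."""
    return C * np.sqrt(np.diag(Q0))[np.newaxis, :]

def z_scores(X, relevant):
    """Normalized score per matrix element.

    X: array (n_subjects, N, N) of estimates (C, Sigma or ED).
    relevant: boolean (N, N) mask of the elements that are estimated.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    values = X[:, relevant]                   # (n_subjects, n_relevant)
    median = np.median(values[values != 0])   # median of non-zero estimates
    z = np.full(mean.shape, np.nan)           # NaN for non-estimated elements
    z[relevant] = (mean[relevant] - median) / std[relevant]
    return z

# Toy example with 2 ROIs and 2 subjects (hypothetical numbers).
C = np.array([[0.0, 2.0], [3.0, 0.0]])
Q0 = np.diag([4.0, 9.0])
ED = effective_drive(C, Q0)

X = np.array([[[0.0, 1.0], [3.0, 0.0]],
              [[0.0, 3.0], [5.0, 0.0]]])
relevant = ~np.eye(2, dtype=bool)
z = z_scores(X, relevant)
```

In this toy case the connection from ROI 2 to ROI 1 has weight 2 and source standard deviation 3, giving a drive of 6, and the two z-scores come out as −1 and +1 around the global median of 3.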
2.9 Louvain community detection method
We identify communities in networks based on the modularity of a partition of the network [36]. The modularity measures the excess of connections between ROIs compared to the expected values estimated from the sum of incoming and outgoing weights for the nodes (targets and sources, respectively). The Louvain method [37] iteratively aggregates ROIs to maximize the modularity of a partition of the ROIs into communities. Designed for large networks, it performs a stochastic optimization, so in practice we repeat the detection 10 times for each subject and calculate the average participation index of each pair of ROIs - that is, how often the two ROIs fall in the same community - over the subjects and the 10 trials for each of the two conditions (rest and movie).
To test the significance of the differences between the estimated communities of each condition, we generate 1000 surrogate communities where the conditions are chosen randomly with equal chance for each subject. This gives a null distribution of 1000 participation indices for each pair of ROIs, whose upper 5% tail is used to determine significance.
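The pooling and surrogate steps can be sketched as follows in Python. The sketch assumes that community labels have already been obtained for each subject and trial (the Louvain detection itself is not reproduced here); the function names, the array layout and the toy labels are hypothetical.

```python
import numpy as np

def participation_index(labels):
    """Fraction of runs in which two ROIs share a community.

    labels: int array (n_subjects, n_trials, n_rois) of community labels.
    """
    same = labels[..., :, np.newaxis] == labels[..., np.newaxis, :]
    return same.mean(axis=(0, 1))

def surrogate_threshold(labels_rest, labels_movie, n_surr=1000, seed=0):
    """Upper 5% tail of the null participation index for each ROI pair."""
    rng = np.random.default_rng(seed)
    n_sub, _, n_rois = labels_rest.shape
    null = np.empty((n_surr, n_rois, n_rois))
    for k in range(n_surr):
        use_movie = rng.random(n_sub) < 0.5   # random condition per subject
        mixed = np.where(use_movie[:, None, None], labels_movie, labels_rest)
        null[k] = participation_index(mixed)
    return np.quantile(null, 0.95, axis=0)

# Toy labels: 4 subjects, 10 trials, 3 ROIs (hypothetical).
labels_rest = np.tile(np.array([0, 0, 1]), (4, 10, 1))   # ROIs 0 and 1 together
labels_movie = np.tile(np.array([0, 1, 1]), (4, 10, 1))  # ROIs 1 and 2 together
p_rest = participation_index(labels_rest)
threshold = surrogate_threshold(labels_rest, labels_movie, n_surr=200)
```

An observed participation index for a given condition would then be called significant for a ROI pair when it exceeds the corresponding entry of the threshold matrix.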
3 Results
3.1 Changes in spatiotemporal FC between rest and movie viewing
We re-analyzed previously reported BOLD imaging data, recorded in 22 healthy volunteers when watching either a black screen - referred to as rest - or a movie (2 sessions of 10 minutes for each condition). Here these signals are aggregated according to a parcellation of N = 66 cortical regions, or regions of interest (ROIs), listed in Table 1. Firstly, we examine the changes in BOLD statistics up to the second order between the two conditions, since these functional observables are typically used to tune whole-brain dynamic models: BOLD correlations [27, 28, 29] and time-shifted covariances [19]. Doing so, we address the question of which statistics of the BOLD time series discriminate between the two behavioral conditions. As shown in Fig. 1A, the BOLD signals do not exhibit consistent changes in their means (circles) between rest and movie at the subject level. In contrast, the BOLD variances (squares) increase by about 50% on average when watching the movie; the black lines indicate a perfect match. The right panel of Fig. 1A displays time constants τx (triangles) estimated from BOLD autocovariance functions. They indicate the “memory depth” of the corresponding time series, quantifying how much the BOLD activity at a given TR influences the successive TRs; see Eq. (2) in Methods. Here we find no significant change in these temporal statistics, unlike the reduction of long-range temporal correlations measured by the Hurst exponent [16]. From the plots in Fig. 1A, we discard three individuals (in red) with extreme values: two for the variances (excessive variance for movie) and one for τx (small values for both conditions). From the original 22 subjects, this leaves 19 for the following analysis.
Now considering ROIs individually and the variability of the BOLD means and variances over the subjects in Fig. 1B, we observe as before significant changes only for the variances in some ROIs (blue crosses). Considering BOLD covariances for pairs of ROIs, we calculate the significance for each matrix element using Welch's t-test: both FC0 with zero time shift and FC1 with a shift of 1 TR are displayed in Fig. 1C; see Eq. (1) in Methods for a formal definition of FC. As a comparison, we also show the BOLD correlations in Fig. 1D: distinct matrix elements appear to be the most significant, but the comparison of the corresponding p-value distributions in Fig. 1E shows that variances (in cyan) discriminate best between rest and movie, followed by correlations (black), then FC0 (dark blue) and finally FC1 (green).
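An element-wise Welch test of this kind is a one-liner with SciPy, as the sketch below illustrates on synthetic data. The array layout (subjects along the first axis), the function name and the toy effect placed on one matrix element are assumptions of the example and do not reproduce the empirical values.

```python
import numpy as np
from scipy import stats

def welch_pvalues(rest, movie):
    """Welch's t-test (unequal variances) for each matrix element.

    rest, movie: arrays (n_subjects, N, N), e.g. FC0 per subject.
    Returns an (N, N) array of two-sided p-values.
    """
    return stats.ttest_ind(rest, movie, axis=0, equal_var=False).pvalue

# Synthetic example: 19 subjects, 3 ROIs, one element shifted for movie.
rng = np.random.default_rng(1)
rest = rng.standard_normal((19, 3, 3))
movie = rng.standard_normal((19, 3, 3))
movie[:, 0, 0] += 5.0                 # strong toy effect on one element
pvals = welch_pvalues(rest, movie)
```

The resulting p-value matrix can then be compared across observables (variances, correlations, FC0, FC1) through the distributions of its entries.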
3.2 The noise-diffusion network model captures well the changes in spatiotemporal FC across conditions
In order to make sense of the collective changes observed in the spatiotemporal FC and move beyond a phenomenological description, the present study draws upon our recent modeling study for resting-state fMRI data [19]. The dynamic network model aims to reproduce the empirical BOLD covariances, both with and without time shift. This generative model is schematically represented in Fig. 2 with only a few cortical regions in the diagrams, while the matrices involve all N = 66 ROIs that cover the whole cortex. Fig. 2A shows the structural connectivity (SC), which is determined by DSI data, measuring the density of white-matter fibers between the ROIs; gray pixels indicate homotopic connections that are added post-hoc, as explained in Methods. The model comprises two sets of parameters: local variability corresponding to the input covariance matrix Σ (purple noisy inputs in Fig. 2B) and recurrent effective connectivity (EC) between ROIs (matrix C with directional connections represented by the uneven red arrows). The skeleton of EC is determined by SC, assuming the existence of connections in both directions; the weights for absent connections are always zero. Here we include input cross-correlations for homotopic regions (anti-diagonal of Σ) in the visual and auditory ROIs: CUN, PCAL, ST and TT (see Table 1). The rationale is to account for binocular visual and binaural auditory inputs related to the movie stimulus, whose corresponding strengths are estimated like the other parameters. The fluctuating activity of each ROI due to the input covariance Σ is shaped by the recurrent EC to generate the network pattern of correlated activity. The latter is measured by the pair of covariance matrices FC0 and FC1 (see Fig. 2C).
The parameters for existing connections in C and input (co)variances in Σ are optimized iteratively such that the model FC0 and FC1 best fit their empirical counterparts, as illustrated in Fig. 2C. In practice, the model is calibrated by the estimated time constants τx in Fig. 1A, with a single value for all ROIs for each subject and condition. This choice is motivated by our previous results for resting-state fMRI where no difference across ROI time constants was observed. An improvement would consist in estimating individual values for each ROI/subject/condition, but this is left for later work. From an initial homogeneous diagonal matrix Σ and effective connectivity C = 0, each optimization step aims to reduce the model error, defined as the matrix distance between the model and empirical FC0, plus the same matrix distance for FC1. The best fit corresponds to the minimum of the model error, which gives the estimated C and Σ for each subject and condition. In summary, the model inversion explains the observed spatiotemporal FC by means of Σ and C.
The precision of the estimated parameters is limited by the number of time points in the BOLD signals, but this procedure unambiguously retrieves the model parameters for accurate empirical FC0 and FC1 observables. The iterative approach provides an advantage compared to multivariate autoregressive models applied directly to the data: it enhances the robustness of the estimation by reducing the number of estimated parameters (absent connections are kept equal to 0) and imposing constraints (non-negativity for C and Σ). Importantly, the model optimization takes network effects into account: EC weights are tuned together such that their joint update best drives the model toward the empirical FC matrices. It is also worth noting that we only retain information about the existence of connections from DSI; individual DSI values do not influence the corresponding values in C. In practice, EC directionality depends mainly on the time-shifted covariance FC1. Further details about the model and the optimization procedure are given in Methods.
The qualitative fit of the model is displayed in Fig. 3A (left panel) for FC0 and a single subject at rest. Quantified by the Pearson correlation coefficients between the model and empirical FC matrix elements, the model goodness of fit is summarized in the right panel of Fig. 3A for all subjects and the two conditions; it is very good for almost all cases, with plotted values larger than 0.7 [29]. Importantly, we verify that the model captures the change in FC between the two conditions, as illustrated in Fig. 3B: the left panel provides the example for a subject and the right panel the summary for all subjects, as in Fig. 3A. Once again, the Pearson correlation between the model and empirical ΔFC (movie minus rest) is larger than 0.6 for most subjects. Moreover, the parametric p-values for the changes in FC0 matrix elements are in good agreement with their empirical counterparts in Fig. 3C, with an overall Pearson correlation coefficient of 0.8 with p < $10^{-10}$. Only elements corresponding to absent EC connections (in black) are not in good agreement; correcting SC with the addition of missing edges could improve this aspect, but this requires individual DSI data instead of the generic SC used here. To further verify the robustness of estimated parameters, we repeat the same estimation procedure using FC0 and FC2 with a time shift of 2 TR instead of FC0 and FC1 (with a shift of 1 TR) as done so far. We found nearly identical Σ estimates and very similar C estimates (Fig. 3D), which agrees with our previous results for resting-state fMRI data [19].
Finally, to characterize how the model parameters respectively capture the FC statistics, we compare in Fig. 3E the model error for the four models combining the model estimates of C and Σ for the two conditions, rest (R) and movie (M). The model error corresponding to the movie FC is decomposed into three components: the FC0 variances (on the matrix diagonal) and covariances (off-diagonal elements), as well as FC1 elements. The horizontal dotted line indicates the error for the M/M model, with the two estimates from the movie data. When the EC is changed for rest (R/M model), the fit for FC0 variances and covariances becomes worse, but only dramatically for a few subjects. However, the FC1 fit worsens for many subjects. In contrast, changing Σ from movie to rest (M/R model) is particularly dramatic for FC0 variances; it also increases the error - compared to M/M - for FC0 covariances and FC1 for all subjects. Last, the R/R model with both rest estimates appears worse for FC1 and equally bad for FC0 elements. This illustrates that C and Σ are combined in shaping FC, so observed changes in FC require a proper model inversion to interpret their “causes”, here local variability and network connectivity. In particular, a phenomenological analysis of empirical FC0 is not sufficient to estimate the change in cortical interactions, within the limits of the proposed dynamic model.
3.3 Movie viewing induces greater changes in local variability than network effective connectivity
From Fig. 3, we conclude that the tuned model satisfactorily captures the changes in empirical FC between rest and movie. Now we examine how the estimated parameters discriminate between the two conditions, to verify whether C and Σ are useful fingerprints for the cortical dynamics. Fig. 4A displays the global distributions for C and Σ over all subjects: the Kolmogorov-Smirnov distance between the rest and movie distributions is 0.20 for Σ, to be compared with only 0.04 for C. At the level of individual parameters, Fig. 4B displays the significance (same parametric t-test as in Fig. 1) for the changes in C (in bright red) and Σ (purple) across conditions: local variability is more affected by movie viewing than EC. The significance for changes in the sum of incoming and outgoing weights is also plotted: it shows that outgoing connections (thin solid dark red curve) exhibit more significant changes than incoming ones (thin dotted line) and than connections taken individually. From Fig. 4C, changes in Σ are mainly increases, whereas those for C are distributed around 0.
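The Kolmogorov-Smirnov distance quoted here is the largest gap between the two empirical cumulative distributions, which SciPy exposes through `scipy.stats.ks_2samp`. The sketch below shows the call on made-up samples; the helper name and the toy values are assumptions of the example, and pooling all estimated parameters over subjects into one flat sample per condition is one plausible reading of the global distributions.

```python
import numpy as np
from scipy import stats

def ks_distance(a, b):
    """Kolmogorov-Smirnov distance between two samples (arrays are flattened)."""
    return stats.ks_2samp(np.ravel(a), np.ravel(b)).statistic

# Toy checks: identical samples give 0, fully separated samples give 1.
same = ks_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
apart = ks_distance([0.0, 1.0], [5.0, 6.0])

# Hypothetical pooled estimates for two conditions.
rng = np.random.default_rng(2)
sigma_rest = rng.gamma(shape=2.0, scale=1.0, size=500)
sigma_movie = 1.5 * rng.gamma(shape=2.0, scale=1.0, size=500)
d_sigma = ks_distance(sigma_rest, sigma_movie)
```

Because the statistic only depends on the ranks of the pooled values, it is insensitive to any monotonic rescaling applied to both conditions.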
We now examine in Fig. 4D which connections and ROIs experience the most significant changes in parameters. For Σ (middle panel), the significance limit with p < 0.01 (uncorrected) is displayed as a dotted line, while the dashed line corresponds to a Bonferroni correction with family-wise error rate equal to 0.05 (that is, p < 0.05/m with total number of parameters m = N + 4 = 70 for Σ). We identify 5 parameters that pass the Bonferroni threshold, which all concern the bilateral ST and LOCC ROIs (7% of all parameters, in bright red); 4 more ROIs (in darker red) pass the uncorrected threshold (13% in total). In contrast, 4 EC connections pass the Bonferroni threshold (0.3%) and 82 more the uncorrected threshold (7% in total). These changes concern 55 ROIs among the 66; moreover, 31 are EC increases (including the 4 passing the Bonferroni threshold) versus 55 decreases. For outgoing weights, only the right LOCC shows a significant increase passing the Bonferroni threshold (2%) and 8 more ROIs are above the uncorrected threshold (12%). This also raises the issue of comparing the significance of EC changes (Bonferroni correction with m = 1180 EC parameters) versus that for Σ (m = 70 parameters, more than one order of magnitude lower). Non-parametric permutation testing points to the same ROIs - with slightly higher significance - but this does not solve the problem of family-wise error control. This motivates a complementary analysis based on network dynamics and graph theory (now with non-parametric significance testing) to interpret the changes in EC.
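The thresholds and percentages in this paragraph follow from simple arithmetic, spelled out in the short Python sketch below using only the counts given in the text (variable names are ours).

```python
# Bonferroni thresholds for a family-wise error rate of 0.05.
alpha = 0.05
n_rois = 66
m_sigma = n_rois + 4       # 66 variances plus 4 cross-correlated input pairs
m_ec = 1180                # number of estimated EC connections

thr_sigma = alpha / m_sigma    # about 7.1e-4
thr_ec = alpha / m_ec          # about 4.2e-5

# Fractions of parameters passing each threshold, in percent.
pct_sigma_bonf = 100 * 5 / m_sigma           # 5 Sigma parameters
pct_sigma_all = 100 * (5 + 4) / m_sigma      # plus 4 uncorrected
pct_ec_bonf = 100 * 4 / m_ec                 # 4 EC connections
pct_ec_all = 100 * (4 + 82) / m_ec           # plus 82 uncorrected

ratio = m_ec / m_sigma     # how much stricter the EC threshold is
```

The EC threshold is about 17 times stricter than the one for Σ, which is why the two sets of corrected p-values are not directly comparable.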
3.4 Dynamical balance in the integration of visual and auditory inputs
The power of our model-based approach lies in disentangling local from network contributions to the observed changes in FC. LOCC belongs to the visual cortex - albeit not the primary visual cortex - and ST hosts the primary auditory cortex. Therefore we interpret the increase of local variability for those sensory ROIs as an extrinsic increase of the stimulus load in the movie viewing condition. Interestingly, changes in BOLD variances that also pass the Bonferroni threshold - namely rIP, rFUS and lBSTS - are not straightforwardly explained by a corresponding change in the local variability Σ in Fig. 4D. This suggests that their increase could arise instead from the propagation of activity from other ROIs, as a network effect. In other words, even though changes in local activity between rest and movie are stronger both in magnitude and in significance, they do not explain all changes in FC.
To address this question, we examine the propagation of sensory information from the visual and auditory bilateral ROIs in our parcellation: CUN, PCAL, LOCC and LING on the one hand; ST, TT and MT on the other hand. We also focus on the above-mentioned FUS and BSTS, which are known to be involved downstream in visual and auditory processing [38, 39]. Fig. 5A shows the SC density between those visual (located in the lower left side and indicated by the red bars), auditory (upper right side in blue) and so-called ‘integration’ ROIs (center in purple). The dark pixels along the diagonal of the SC matrix hint at the hierarchy from visual ROIs (‘VIS’) and auditory ROIs (‘AUD’) to integration ROIs (‘INT’), corresponding to the solid arrows in the diagram on the top. In addition, there are fewer direct connections between VIS and AUD, namely those on the matrix sides (dotted arrow in the diagram).
The estimated EC weights in Fig. 5B suggest that - for both rest and movie - direct anatomical connections between VIS and AUD are not “used” in practice, but fluctuating activity propagates between VIS and AUD via INT, back and forth. Note that plotted values in Fig. 5B correspond to z-scores averaged over the subjects and normalized over the distribution for all connections; see Eq. 3 in Methods and Fig. 4A with the median value indicated by the dashed-dotted line. To further quantify this hierarchical propagation, we use the effective drive (ED), which is a canonical measure for the noise-diffusion network used here. As illustrated in the left diagram, it measures the amount of fluctuating activity at ROI j (standard deviation of the BOLD signal $\sqrt{Q^0_{jj}}$, where $Q^0_{jj}$ is the model variance on the diagonal of FC0) that is sent to ROI i (multiplied by the EC weight $C_{ij}$) and contributes to its activity. Although Fig. 5B shows a globally similar picture for rest and movie, the difference of ED z-scores in Fig. 5C indicates an increase to FUS from all VIS ROIs, as well as an increase from PCAL and LOCC to LING and LOCC. In addition, LOCC sends stronger activity to PCAL and CUN. On the auditory side, ST increases its effect on TT, MT, BSTS and FUS. Together, most increases are along the diagonal of the hierarchical integration mentioned above and the bridge between VIS and AUD is mainly ensured by LOCC, FUS, BSTS and MT.
Thanks to our model-based approach, we can decompose the change in ED into two components related to the changes in C and Σ between rest and movie. If we retain only the increase in local variability Σ for the movie condition, ED increases almost everywhere, and by a particularly large amount for direct connections from ST to visual ROIs (left panel); note that visual ROIs also increase their direct effect on ST. However, negative changes in C nullify this increase, as shown in the right panel. In addition, increased C values boost EC along the diagonal. This means that the mixed positive and negative changes in C select pathways to preserve the hierarchical integration of sensory influx. To check whether this yin-yang effect is significant, we calculate the asymmetry between the left and right matrices in Fig. 5C; in practice, this is given by the scalar product of the two vectors obtained by stacking the columns of each matrix, normalized by the total ED changes in absolute value (matrix in center). The asymmetry corresponding to the 18 bilateral ROIs in Fig. 5C is represented by a diamond in Fig. 5D and compared to a surrogate distribution for the same number of randomly chosen ROIs, while preserving the hemispheric symmetry. The significance for the observed asymmetry in the VIS-INT-AUD subnetwork is p = 0.04 and the 10⁴ surrogate values are distributed around zero, confirming that this yin-yang effect does not come artificially from particular properties of the model, but from the estimated parameters instead.
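The asymmetry test can be sketched as follows, under several stated assumptions: the normalization divides the scalar product by the summed absolute total ED change (one possible reading of the description above), the surrogate ROI subsets are drawn uniformly without enforcing hemispheric symmetry, and the p-value is one-sided. All names and the toy matrices are hypothetical; in particular, the toy total change is simply taken as the sum of the two components.

```python
import random

def asymmetry(d_sigma, d_c, d_total, rois):
    """Scalar product of the Sigma-only and C-only ED changes over the ROI
    subset, divided by the summed absolute total change (assumed
    normalization)."""
    pairs = [(i, j) for i in rois for j in rois if i != j]
    dot = sum(d_sigma[i][j] * d_c[i][j] for i, j in pairs)
    norm = sum(abs(d_total[i][j]) for i, j in pairs)
    return dot / norm if norm > 0 else 0.0

def surrogate_test(d_sigma, d_c, d_total, rois, n_surr=1000, seed=0):
    """Compare the observed asymmetry with random ROI subsets of equal size."""
    rng = random.Random(seed)
    n = len(d_total)
    observed = asymmetry(d_sigma, d_c, d_total, rois)
    surrogates = [
        asymmetry(d_sigma, d_c, d_total, rng.sample(range(n), len(rois)))
        for _ in range(n_surr)
    ]
    # One-sided p-value: fraction of surrogates at least as negative.
    p = sum(s <= observed for s in surrogates) / n_surr
    return observed, p

# Toy data: 10 ROIs with unrelated changes, except in the subnetwork
# {0, 1, 2} where the C-induced change opposes half of the Sigma-induced one.
toy_rng = random.Random(1)
n = 10
d_sigma = [[toy_rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
d_c = [[toy_rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
subnet = [0, 1, 2]
for i in subnet:
    for j in subnet:
        d_c[i][j] = -0.5 * d_sigma[i][j]
d_total = [[d_sigma[i][j] + d_c[i][j] for j in range(n)] for i in range(n)]
observed, p_value = surrogate_test(d_sigma, d_c, d_total, subnet)
```

A negative asymmetry indicates that the two components oppose each other within the subnetwork, which is the signature of the compensation described above; for unrelated components the surrogate values scatter around zero.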
3.5 Path selection in the whole cortical network
Now we analyze the ED changes in a more global manner to measure the effect of integration path selection at the whole-cortex level. Using the Louvain method from graph theory [36, 37], we estimate communities with higher-than-chance exchange of fluctuating activity between them, as measured by ED. We perform community analysis for each subject in each condition and pool the results over the subjects to obtain a participation index of ROI pairs, which measures the probability for them to be in the same community. To test the significance of these indices, we repeat the same procedure 10³ times while mixing the labels (rest and movie) among the subjects to generate surrogate participation indices. This gives an individual null distribution of 10³ values per ROI pair; details are provided in Methods.
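The pooling and label-mixing steps can be sketched as follows, assuming that the community labels of each subject have already been obtained (the Louvain step itself is not reproduced here). The label mixing is implemented as an independent swap of the rest and movie partitions within each subject, which is one plausible reading of the procedure; the function names and the toy partitions are hypothetical.

```python
import random

def participation_index(partitions):
    """Fraction of subjects in which each ROI pair shares a community.

    partitions: one list of community labels (one label per ROI) per subject.
    """
    n_roi = len(partitions[0])
    counts = [[0] * n_roi for _ in range(n_roi)]
    for labels in partitions:
        for i in range(n_roi):
            for j in range(n_roi):
                counts[i][j] += labels[i] == labels[j]
    return [[c / len(partitions) for c in row] for row in counts]

def index_change(rest, movie):
    """Participation index in movie minus participation index at rest."""
    p_rest, p_movie = participation_index(rest), participation_index(movie)
    n = len(p_rest)
    return [[p_movie[i][j] - p_rest[i][j] for j in range(n)] for i in range(n)]

def permutation_p_values(rest, movie, n_surr=1000, seed=0):
    """Two-sided p-value per ROI pair from shuffled condition labels."""
    rng = random.Random(seed)
    observed = index_change(rest, movie)
    n = len(observed)
    exceed = [[0] * n for _ in range(n)]
    for _ in range(n_surr):
        # Swap the two conditions within each subject with probability 1/2.
        swapped = [(m, r) if rng.random() < 0.5 else (r, m)
                   for r, m in zip(rest, movie)]
        surr = index_change([s[0] for s in swapped], [s[1] for s in swapped])
        for i in range(n):
            for j in range(n):
                exceed[i][j] += abs(surr[i][j]) >= abs(observed[i][j])
    return observed, [[e / n_surr for e in row] for row in exceed]

# Toy data: 4 subjects, 4 ROIs, with a different partition in each condition.
rest = [[0, 0, 1, 1]] * 4
movie = [[0, 1, 0, 1]] * 4
change, p_values = permutation_p_values(rest, movie, n_surr=200)
```

Each ROI pair thus obtains its own null distribution, so a change in participation index is judged against what the same subjects would produce if the condition labels carried no information.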
Fig. 6A displays increases and decreases (dark pixels) of participation indices for all ROI pairs in movie compared to rest. For illustration purposes, the ROIs are grouped into 6 groups: somatosensory-motor (SMT), frontal (FRNT) and so-called ‘central’ ROIs (CTRL) in addition to the visual, auditory and integration regions examined in Fig. 5; note that the integration group includes more ROIs than before, see Table 1. Most decreases concern ROIs in the same hemisphere (bottom left and top right in the right panel), whereas increases involve interhemispheric ROI pairs (top left and bottom right in the left panel). These increases especially concern AUD, INT and FRNT. Last, interactions between VIS and INT strongly increase both within and between hemispheres.
To understand the changes in ED from a feedforward-feedback perspective, Fig. 6B summarizes the percentage of increase and decrease compared to rest between the groups, where the two hemispheres are taken together. The largest increases concern VIS and AUD internally, as well as feedforward projections from VIS to INT. Globally, all contributions from VIS, AUD and INT increase, with the notable exception of direct interactions between AUD and VIS. Once again, the feedback to VIS and AUD comes from INT. Fig. 6C summarizes the changes: at rest, AUD is strongly tied to INT, SMT and part of FRNT. This cluster is decoupled in the movie condition such that part of INT binds to VIS. Meanwhile, INT remains linked to FRNT, whose interhemispheric interactions are strongly boosted. This underlines a selective coordination of cortical paths to implement a distributed and hierarchical processing of sensory information.
4 Discussion
Our results shed light on a fundamental question in neuroscience: how do inputs and connectivity locally interact to generate large-scale integration of information in the brain? To address this question, we use a recently developed model-based approach to provide an interpretation of task-evoked BOLD activity in terms of cortical communication, by decomposing it into local variability and network effective connectivity. The main finding of this study concerns the reorganization of the cortical connectivity during movie viewing: although changes in EC appear at first sight smaller than those in local variability, they induce strong changes in communication across the cortex.
First, they are involved in a down-regulation of forward connections in a compensatory manner, such that regional inputs do not saturate the network; meanwhile, some specific backward connections are boosted to enable efficient transmission, for example of top-down signals from integration to sensory areas, despite the activity increase of the latter (Fig. 5C). The dynamic balance is expected to be task dependent - with regard to extrinsic stimulus inputs - and to result in complex patterns of changes in functional synchronization at the network level (Fig. 1C). Our results are in line with previous studies and suggest a continuum of the balanced-activity principle from the neuronal level [40] to the cortical level [41].
Second, specific pathways are actually selected almost everywhere in the cortex to shape the integration of sensory information. Our results suggest a reorganization of the functional communities in terms of propagation of BOLD activity (Fig. 6A): during the movie condition, homotopic areas increase their information exchange via inter-hemispheric connections - especially parietal and temporal areas related to multimodal integration [42, 39], as well as frontal areas. This illustrates how the cortex becomes specialized when engaging in a task, while specific high-level ROIs remain relatively stable and keep listening to the whole cortex. Because of the movie-viewing task considered here and its stark difference compared to the rest condition, we observe increases of BOLD variances, presumably related to the increase in stimulus load and similar to previous MEG measurements for the same experiment [11]. Carefully designed experiments that controlled for the change in stimuli showed instead a decrease of variance when a subject engages in a visual recognition task, as opposed to passive viewing [16]. In line with other studies that point to decreased brain interactions during task [43, 44], the reconfiguration compared to the resting state may consist in shutting down more pathways than opening new ones (cf. balanced EC changes in Fig. 4C). This could explain the reduction of global cortical activity, once the stimulus-related changes are taken into account.
Beyond the task analyzed here, our study demonstrates that spatiotemporal BOLD (co)variances convey important information about the cognitive state of subjects, as was previously reported [16, 17]. Our spatiotemporal FC corresponds to transitions of fMRI activity between successive TRs; this statistic is averaged over the whole recording period, in contrast to other time-dependent measures such as inter-subject correlations [45], metastability [46] or measures of dynamic FC averaged over 1-2 minutes, corresponding to periods of more than 30 TRs [15]. This was already suggested by our previous analysis of resting state [19] and is in line with recent results that focused on the temporal component of BOLD signals [47, 18]. Moving beyond the analysis of spatial FC, namely covariances without time shift (FC0) or BOLD correlations, is thus a crucial step toward a better interpretation of fMRI measurements. The proposed model uses an exponential approximation of BOLD autocovariance (locally over a few TRs) and discards slow-frequency variations. In contrast, previous studies highlighted that, when considering a broader spectrum with slow frequencies, BOLD time series have long-range temporal interactions [48, 16, 49]. This multifractal property of BOLD signals has been analyzed to describe undirected interactions between ROIs [17]; it would be interesting to compare them to the directed effective connectivity estimated here, which is left for future work.
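To illustrate what an exponential approximation of the autocovariance over a few TRs amounts to, the sketch below fits a decay time constant from the slope of the log-autocovariance against the time lag. It is not the fitting procedure of the Methods, which is not reproduced in this section; the function names, the choice of lags 0 to 2 and the synthetic first-order autoregressive signal are assumptions made for the example, and the time constant is expressed in units of TR.

```python
import math
import random

def autocovariance(x, lag):
    """Empirical autocovariance of the sequence x at a non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean)
               for t in range(n - lag)) / (n - lag)

def fit_time_constant(x, max_lag=2):
    """Fit acov(k) ~ acov(0) * exp(-k / tau) by least squares on the log."""
    lags = list(range(max_lag + 1))
    acov = [autocovariance(x, k) for k in lags]
    if min(acov) <= 0:
        raise ValueError("log-linear fit needs positive autocovariances")
    logs = [math.log(a) for a in acov]
    lag_mean = sum(lags) / len(lags)
    log_mean = sum(logs) / len(logs)
    slope = (sum((k - lag_mean) * (y - log_mean) for k, y in zip(lags, logs))
             / sum((k - lag_mean) ** 2 for k in lags))
    return -1.0 / slope

# Synthetic signal with a known time constant (in units of TR): an AR(1)
# process has an exactly exponential autocovariance.
true_tau = 2.0
a = math.exp(-1.0 / true_tau)
rng = random.Random(0)
x = [0.0]
for _ in range(5000):
    x.append(a * x[-1] + rng.gauss(0.0, 1.0))
tau_hat = fit_time_constant(x)
```

Restricting the fit to the first few lags is what makes the approximation local: slower components of the signal, such as the long-range temporal interactions mentioned above, barely affect the estimated slope.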
Following resting-state studies [6, 7, 8, 9], the focus on second-order BOLD statistics allows for stepping from a structure-centric [50] to a network-oriented analysis. The input noise in Σ plays a functional role in our model, which we interpret as spontaneous activity. The comparison of Σ across conditions allows for a quantification of intrinsic and extrinsic local activity for each ROI [13, 14]. Taking the intrinsic noisy nature of brain circuits [51, 52, 53] properly into account in models is a challenge. Therefore, it has initially been done for models with complex local dynamics, such as the dynamic causal model (DCM), only for a small number of ROIs [54], even though recent efforts aim to extend it to the whole brain. Here the proposed modeling makes use of the putative diffusion of this noisy activity in the cortical network via the long-range projections to tune the model and interpret the model estimates. In a way, it extends analyses based on partial correlations that measure undirected interactions [44]. Unlike DCM, we do not model self connectivity within ROIs, but directly estimate the amplitude of random fluctuations within each region and how it is transferred via the effective drive (Fig. 5).
Although it does not focus on the mechanisms underlying the dynamic regulation of EC [55, 54], our model provides a fingerprint of the brain dynamics that we expect to be discriminative between tasks and behavioral conditions, as shown here for passive vision and audition. We stress the importance of considering the whole cortex - or better the whole brain with subcortical regions - to generate the estimated fingerprint: changes across conditions reported in Fig. 6 concern many areas distributed all over the brain. At this level, C and Σ lie in a high-dimensional space (one dimension per connection and about one per node, respectively), so statistical analysis of the estimated parameters across conditions may suffer from an approach based on family-wise error correction (e.g., Bonferroni). Graph theory is then a useful complement, as was done here using community analysis.
Acknowledgements
This work was supported by the Human Brain Project (grant FP7-FET-ICT-604102 and H2020-720270 HBP SGA1 to MG and GD) and the Marie Sklodowska-Curie Action (grant H2020-MSCA-656547 to MG). This work was partly supported by the 7th Framework Programme of the European Commission (grant PCIG12-334039 to DM) and the KU Leuven Special Research Fund (grant C16/15/070 to DM). VB was supported by a Post-Doctoral Fellowship grant from the University of Chieti.
References
- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].