Abstract
There is growing interest in the rich temporal and spectral properties of the brain's functional connectome that are provided by Electro- and Magnetoencephalography (EEG/MEG). However, the problem of leakage between brain sources that arises when reconstructing brain activity from EEG/MEG recordings outside the head makes it difficult to distinguish true connections from spurious connections, even when connections are based on measures that ignore zero-lag dependencies. In particular, standard anatomical parcellations for potential cortical sources tend to over- or under-sample the real spatial resolution of EEG/MEG. By using information from the cross-talk functions (CTFs) that objectively describe leakage for a given sensor configuration and distributed source reconstruction method, we introduce methods for optimising the number of regions of interest (ROIs) while simultaneously minimising the leakage between them. More specifically, we compare two image segmentation algorithms: 1) a split-and-merge (SaM) algorithm based on standard anatomical parcellations and 2) a region growing (RG) algorithm based on all the brain vertices with no prior parcellation. Interestingly, when applied to minimum-norm reconstructions of data from 102 magnetometers, 204 planar gradiometers and 70 EEG sensors, both algorithms yielded approximately 70 ROIs despite their different starting points, suggesting that this reflects the resolution limit of this particular sensor configuration and reconstruction method. Importantly, when compared against standard anatomical parcellations, we found significant improvements in both sensitivity and distinguishability of the ROIs. Furthermore, by simulating a realistic connectome with a single hub, we show that the choice of parcellation can have significant impact on the outcome of graph theoretical analysis of the source-reconstructed EEG/MEG. 
Thus, CTF-informed, adaptive parcellations allow a more accurate reconstruction of functional connectomes from EEG/MEG data.
1 Introduction
Connectivity analyses of source estimated Electro- and Magnetoencephalography (EEG/MEG) can provide a millisecond-by-millisecond map of functional and effective interactions (Bastos & Schoffelen 2016; Greenblatt et al. 2012) among multiple brain areas in resting state as well as during task performance (Brookes et al. 2016; Colclough et al. 2016; Palva et al. 2010). Consequently, there has been growing interest in reconstructing the human brain connectome to obtain time- and frequency-resolved whole-brain networks (Palva & Palva 2012). Studies on structural and functional MRI connectomics have revealed important properties of the brain in health and disease, particularly concerning changes in “hubs” and the associated “rich club” of highly-connected regions (Bullmore & Sporns 2009; Crossley et al. 2014; van den Heuvel & Sporns 2011). The growing field of EEG/MEG connectomics is anticipated to take this approach further by vastly increasing the temporal and spectral resolution of the human connectome (Brookes et al. 2011; de Pasquale et al. 2010). However, the spatial resolution of EEG/MEG data is seriously limited, because several thousand sources of activation in the brain must be estimated from maximally a few hundred sensor recordings.
The limited spatial resolution causes the so-called leakage or cross-talk problem for linear and linearly constrained distributed EEG/MEG source estimation: activity estimated in one region of interest (ROI) can be affected by leakage from locations outside this ROI, possibly including locations at large distances (Lachaux et al. 1999; Schoffelen & Gross 2009; Hauk et al. 2011). This poses serious challenges for the interpretation of connectivity results, since increased connectivity between two ROIs may not only be caused by true connections between the time courses of these ROIs, but also by signals leaked into these ROIs from other brain locations, thus leading to spurious connectivity findings (Colclough et al. 2015). This is particularly important for the estimation of whole-brain connectivity and applications of graph theoretical measures. For example, one ROI in a network may be identified as a hub (i.e. showing strong connections to several other ROIs) if it receives strong leakage from multiple other ROIs.
Most previous EEG/MEG studies have adopted parcellations from structural or fMRI research for whole-brain connectivity analysis (Colclough et al. 2016; Brookes et al. 2016; Tewarie et al. 2016). Some studies have orthogonalised source-reconstructed timeseries across ROIs, in order to remove any zero-lag correlation, such as that induced by leakage (Brookes et al. 2012; Hipp et al. 2012; Colclough et al. 2015). While this may be suitable if connectivity is estimated from more slowly-varying amplitude envelopes of ongoing oscillatory activity, it also potentially removes true zero-lag connectivity that is not an artefact of cross-talk. Additionally, considering the spatial resolution of EEG/MEG, anatomical parcellations may not be optimal and recent studies have suggested that EEG/MEG-based parcellations can be more informative (Brookes et al. 2016). The ideal parcellation should be sensitive to as much of the cortex as possible, with each ROI having high sensitivity to activity arising from itself, and low leakage from other ROIs. CTFs can be used to characterise leakage among different brain areas (Liu et al. 1998; Hauk et al. 2011). Some previous studies have suggested using CTFs to minimise leakage between a small number of ROIs. Wakeman (2013), for example, sub-selected a number of vertices as representative for each of a few ROIs that had minimal cross-talk with the other ROIs, while Hauk and Stenroos (2014) proposed a method that optimises spatial filters for source reconstruction in order to produce zero cross-talk among a small set of brain sources and minimal cross-talk from other sources.
While these methods are optimised for the case of a few spatially distinct sources, their extension to whole-brain connectivity analysis is limited. Palva et al. (2010) introduced a parcellation for graph theoretical analysis of single subject data by taking into account the source-sensor geometry of EEG/MEG. They used a clustering algorithm to parcellate the cortex into 365 (i.e. equal to the number of sensors) patches, based on phase synchrony patterns estimated from simulated data generated from white noise in source space. Korhonen et al. (2014) introduced sparse weights to collapse the source space based on the forward and inverse modelling of simulated noise in the source space. Their method aims at assigning optimum vertices to a fixed set of ROIs and extracting the ROI time course as a weighted sum of the assigned vertices. This method utilises phase coherence between the true and estimated sources in order to maximise the fidelity of assigned vertices to the recipient ROI. Unlike the method of Palva et al. (2010), the sparse weights approach is suitable for group as well as single subject analysis and is based on anatomical parcellations. The sparse weights approach provides a novel way of extracting ROI time courses based on the spatial limitations of EEG/MEG. However, obtaining an adaptive parcellation that can optimise both the number and location of ROIs, as well as vertex selection within those ROIs, with respect to EEG/MEG spatial limitations has remained a challenge (Korhonen et al. 2014; Bullmore & Bassett 2011).
Here, we utilise CTFs as a direct measure of spatial leakage to address the aforementioned limitations systematically. For this purpose, we have implemented two CTF-informed image segmentation algorithms (Gonzalez & Woods 2007) that parcellate the cortical surface into the maximum number of distinguishable ROIs. In the first approach, we have started from standard anatomical parcellations and modified the ROIs using a CTF-informed split-and-merge (SaM) algorithm. The main idea is to merge ROIs that produce highly overlapping CTFs, split ROIs that produce distinguishable patterns of cross-talks, remove ROIs to which EEG/MEG show low sensitivity, and for each ROI identify a group of representative vertices that show high sensitivity and specificity to that particular ROI as compared to the rest of the brain. This approach is suitable for studies that require a particular anatomical labelling of ROIs. In the second approach, we start from all the brain vertices with no prior parcellation. A CTF-informed region growing algorithm is used to create ROIs around the vertices that show highest sensitivity and specificity of CTFs on the cortex. These ROIs are then optimised with respect to specificity and sensitivity using an SaM algorithm. This approach should prove useful for studies where no strict anatomical labels are required.
Both algorithms yield adaptive parcellations since CTF patterns may change depending on the choice of head models, inverse operators, measurement configurations (i.e. EEG, MEG or their combination) and signal-to-noise ratios (SNR) of the data. Additionally, the proposed algorithms can use data from multiple subjects and yield parcellations suitable for group analysis through morphing the cortical surfaces from single subjects to a standard average space (e.g. MNI space). We evaluate the performance of the proposed algorithms by measuring the sensitivity and specificity of the CTFs of the final ROIs to themselves as compared to the rest of the brain, and comparing performance to those of two standard structural atlases in the Freesurfer software (Desikan-Killiany (Desikan et al. 2006) and Destrieux (Destrieux et al. 2010)). We further validate the performance of our approaches for spectral connectivity and graph theoretical analyses of simulated event-related data with realistic levels of noise in source space. We show that an EEG/MEG-adaptive parcellation results in a more accurate network reconstruction for both zero-lag and non-zero-lag connectivity metrics.
2 Theory
2.1 EEG/MEG source estimation and spatial resolution
In this section we introduce the concepts of the resolution matrix and cross-talk functions, which are the basis for the parcellation algorithms described in the Methods section.
2.1.1 EEG/MEG forward and inverse solution
In forward modelling of EEG/MEG data, assuming a linear relationship between data and sources, the leadfield matrix (G) maps the dipolar sources of activity on the cortex to the electric and magnetic signals measured using EEG and MEG sensors (Hämäläinen & Ilmoniemi 1994). Therefore, the signal at each sensor is modelled as a weighted sum of the activities of all the sources in the brain:
$$Y = GS \qquad (1)$$
where Y is an Nch × Nt matrix of the measured signal at the sensor locations, the time-invariant matrix G denotes the leadfield of size Nch × Ns and S denotes the source activity matrix, which is of size Ns × Nt (Nch: number of recording channels, Nt: number of time points, Ns: number of sources/vertices/voxels).
For EEG/MEG, linear source estimation methods are often employed in order to obtain a solution for S in Equation 1, i.e. if D is the matrix of the measured data (which contains activity from brain sources in Equation 1 plus noise), the source activity is estimated as:
$$\hat{S} = WD = W(GS + \varepsilon) = RS + W\varepsilon \qquad (2)$$
where W is the inverse operator of size Ns × Nch that maps measurements to the sources, Ŝ is the matrix of estimated sources of size Ns × Nt, ε denotes the measurement noise matrix of size Nch × Nt and R = WG is the resolution matrix.
2.1.2 Resolution matrix and CTFs
In Equation 2, the resolution matrix R = WG can be used to quantify the relationship between true and estimated sources. The diagonal elements of R indicate the sensitivity of each estimated source to itself, and off-diagonal elements quantify the degree to which estimated sources are affected by the signal from all other sources in the brain (Grave De Peralta Menendez et al. 1997; Liu et al. 1998). An accurate estimation of source activity in the brain is only possible if G is a full-rank square matrix (i.e. equal number of sensors and sources) and in the absence of measurement noise. In such an ideal scenario W would be the inverse of G, R = G⁻¹G = I would be an identity matrix and the estimated sources would precisely match the true sources. However, the EEG/MEG inverse problem is highly underdetermined and the resolution matrix has non-zero off-diagonal elements. These off-diagonal elements introduce the leakage or cross-talk in the EEG/MEG inverse solutions.
More specifically, the ith row of R describes the cross-talk from all sources in the brain into the estimate for activity of the ith source. These rows have therefore been called cross-talk functions (CTFs) (Liu et al. 1998; Hauk et al. 2011). The cross-talk that the ith source receives from the jth source is thus defined as:
$$\mathrm{CTF}_{ij} = R_{ij}, \quad i, j \in \{1, \dots, n\} \qquad (3)$$
where n is the number of sources/vertices in the brain. As explained above, ideally Rij should be 0 for any i≠j and 1 for i=j. If an element Rij is zero, there is no cross-talk from the jth source into the estimate for the ith source. If two CTFs are largely non-overlapping, this means they are sensitive to different areas of the brain. If Rij is much larger than the value of Rik (k being a third source in the brain), this means that the estimator is more prone to receive cross-talk from the jth source than from the kth source. Note that a CTF is necessarily a linear combination of the leadfields (i.e. rows of G). CTFs therefore cannot be designed to take on any arbitrary shape, but are constrained by the measurement configuration. At the same time, CTFs offer a direct way of quantifying the cross-talk problem for linear estimation of a given measurement configuration, which can be used to find an optimal parcellation of the source space based on objective criteria.
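The relationship between the leadfield, the inverse operator and the CTFs can be illustrated with a minimal NumPy sketch. It is not the pipeline used in this study: the leadfield is random, the function name `minimum_norm_inverse` and the regularisation parameter `lam` are assumptions made for illustration, and identity source and noise covariances are assumed.

```python
import numpy as np

def minimum_norm_inverse(G, lam=0.1):
    """Regularised L2 minimum-norm inverse W = G^T (G G^T + reg * I)^-1.

    Assumes identity source and noise covariances (a simplification).
    """
    n_ch = G.shape[0]
    gram = G @ G.T
    reg = lam * np.trace(gram) / n_ch   # scale regularisation to mean sensor power
    return G.T @ np.linalg.inv(gram + reg * np.eye(n_ch))

rng = np.random.default_rng(0)
n_ch, n_src = 20, 200                   # far fewer sensors than sources
G = rng.standard_normal((n_ch, n_src))  # toy leadfield, not a real head model
W = minimum_norm_inverse(G)
R = W @ G                               # resolution matrix (n_src x n_src)

ctf_5 = R[5]                            # CTF of source 5: cross-talk it receives from all sources
print(R.shape, np.linalg.matrix_rank(R))
```

Because every row of R is a weighted sum of the rows of G, the rank of R cannot exceed the number of sensors, which is why R cannot be an identity matrix in the underdetermined case.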
2.1.3 Using CTFs to modify structural atlases
Two main problems can arise from utilising anatomical parcellations with EEG/MEG, which we illustrate in Fig. 1:
1) Sensitivity Problem: EEG/MEG might not be sensitive to activity from some ROIs:
a. While for superficial ROIs CTFs may peak within the ROI (e.g. Supramarginal Gyrus, Fig. 1a left), deeper ROIs may receive much larger cross-talk from areas close to the sensors than from themselves (e.g. Insula, Fig. 1a right).
2) Specificity Problem: Structural boundaries might not correspond to the spatial resolution of EEG/MEG:
a. Large ROIs may be split into sub-regions with distinguishable CTFs (e.g. postcentral gyrus, Fig. 1b).
b. Some distinct anatomical ROIs may produce highly similar CTFs, and are therefore indistinguishable from one another due to the limited spatial resolution of EEG/MEG measurements (e.g. Pars Orbitalis and Pars Triangularis, Fig. 1c).
The examples in Fig. 1 also highlight the usefulness of CTFs for the evaluation - and possible construction - of cortical parcellations for EEG/MEG connectivity analysis.
2.1.4 Both zero-lag and non-zero-lag connectivity are affected by leakage
Signal leakage causes activity in one area to be estimated in nearby areas with no time delay; thus there will be a zero-lag phase difference between the actual activity and the “leaked” activity (Brookes et al. 2012; Hipp et al. 2012). Therefore, connectivity methods that are insensitive to zero-lag correlations, such as the phase lag index (PLI) or the imaginary part of coherency (ImCOH), have been suggested to overcome the leakage problem to some extent (Stam et al. 2007; Nolte et al. 2004). Here we show that even though insensitivity to zero-lag connections can alleviate the problem, non-zero-lag methods are still affected by leakage.
The principle of this problem is illustrated using CTFs in Fig. 1d. Let us consider a case where activity in rostral middle frontal (RMF) cortex and middle temporal gyrus (MTG) show non-zero-lag connectivity. In an ideal scenario with no leakage, the whole-brain seed-based connectivity with the seed in the RMF should only produce connectivity with MTG (blue area in Fig. 1d). However, in a realistic scenario with leakage, two outcomes are possible: 1) If a connectivity measure that is sensitive to zero-lag connections, such as Pearson Correlation or Coherence, is used, high connectivity will be found between the active sources as well as their leakage domain (Fig. 1d middle); 2) If a non-zero-lag connectivity measure such as ImCOH is used, the spurious connectivity between the RMF seed and its surrounding areas (i.e. RMF “realm”) will be resolved but results will still be affected by the “blurring” around the MTG source (Fig. 1d right). This is due to the fact that the whole neighbourhood of MTG is in non-zero-lag connection to the RMF. It is worth noting that the same argument can be made for bivariate directed connectivity methods such as Granger Causality (GC); i.e. if RMF Granger-causes activity in MTG, it will show spurious GC to the neighbourhood of the MTG too. However, generalisation to multivariate connectivity methods is less straightforward, and is discussed in Appendix A.
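The argument above can be checked on a toy simulation. The sketch below is illustrative only: the signal model, the noise levels and all variable names (`seed`, `target`, `near_seed`, `near_target`) are assumptions. Leakage is modelled as an instantaneous, attenuated copy of a source plus independent noise, and the target lags the seed by a quarter cycle.

```python
import numpy as np

def coherency(x, y):
    """Complex coherency per frequency bin for epoched data (n_epochs x n_times)."""
    X, Y = np.fft.rfft(x, axis=1), np.fft.rfft(y, axis=1)
    sxy = (X * np.conj(Y)).mean(axis=0)       # cross-spectrum averaged over epochs
    sxx = (np.abs(X) ** 2).mean(axis=0)
    syy = (np.abs(Y) ** 2).mean(axis=0)
    return sxy / np.sqrt(sxx * syy)

rng = np.random.default_rng(1)
n_epochs, n_times, k = 200, 256, 10           # k: cycles per epoch (frequency bin)
t = np.arange(n_times)
phase = rng.uniform(0, 2 * np.pi, (n_epochs, 1))

def noise(scale):
    return scale * rng.standard_normal((n_epochs, n_times))

seed = np.cos(2 * np.pi * k * t / n_times + phase) + noise(0.5)
target = np.cos(2 * np.pi * k * t / n_times + phase - np.pi / 2) + noise(0.5)
near_seed = 0.5 * seed + noise(0.5)           # zero-lag leakage of the seed
near_target = 0.5 * target + noise(0.5)       # zero-lag leakage of the target

coh_near_seed = coherency(seed, near_seed)[k]
coh_near_target = coherency(seed, near_target)[k]
print(abs(coh_near_seed), abs(coh_near_seed.imag), abs(coh_near_target.imag))
```

In this toy setting the magnitude of coherency between the seed and its own leakage domain is high while its imaginary part is close to zero, whereas the imaginary part between the seed and the leakage domain of the lagged target remains high.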
3 Materials and Methods
3.1 EEG/MEG data acquisition and pre-processing
We used real datasets collected from 17 healthy subjects who participated in an event-related visual word recognition experiment to obtain head-models and noise covariance matrices of pre-stimulus baseline intervals for source estimation. EEG and MEG data were acquired at the MRC Cognition and Brain Sciences Unit, Cambridge, UK, using a Neuromag Vectorview system (Elekta AB, Stockholm, Sweden), which contained 204 planar gradiometers, 102 magnetometers, and a 70-channel EEG cap (EasyCap GmbH, Herrsching, Germany). Individual structural T1 MRI scans were acquired using a 3T Siemens Tim Trio scanner at the MRC Cognition and Brain Sciences Unit, using a 3D MPRAGE sequence. A 3Space Isotrak II System (Polhemus, Colchester, Vermont, USA) was used to digitise the positions of 5 Head Position Indicator (HPI) coils that were attached to the EEG cap, 3 anatomical landmark points (left and right ears and nasion), and 50-100 additional points, in order to ensure an accurate co-registration with MRI data. The pre-processing steps for EEG/MEG data (used for the computation of noise covariance matrices) included Neuromag maxfilter (Version 2.0), bad channel interpolation, band-pass filtering between 1 and 48 Hz, and ICA for EOG and ECG artefact removal. MRI preprocessing was performed in the Freesurfer software (Version 5.3; http://surfer.nmr.mgh.harvard.edu/) and EEG/MEG analyses were performed in the MNE python software package (version 0.9; http://martinos.org/mne/stable/mne-python.html).
3.2 Head model and source estimation
Boundary element models (BEMs) were derived from structural MRIs for each subject. The 50-100 additional digitised points on the scalp surface were matched with the reconstructed scalp surface from the FreeSurfer software in order to co-register EEG/MEG sensor configurations with MRIs. FreeSurfer was used for segmentation and the results were further processed using the MNE software package (Version 2.7.3). The original cortical surface (consisting of more than 160,000 vertices) was down-sampled to a tessellated grid where the average edge of each triangle was approximately 2.5 mm, resulting in 20484 vertices in the downsampled cortex (Segonne et al. 2004). For combined EEG/MEG, a three-layer BEM consisting of 5120 triangles per layer was created from the scalp, outer skull and inner skull surfaces. The noise covariance matrices for each dataset were computed and regularised in a single framework which computes the covariance using empirical, diagonal and shrinkage techniques and selects the best fitting model by log-likelihood and three-fold cross-validation on unseen data (Engemann & Gramfort 2015). Pre-stimulus baseline intervals of 500 ms duration were used for the estimation of noise covariance matrices. The resulting regularised noise covariance matrices were used to assemble the inverse operators for each subject using an L2 minimum norm (MNE) estimator with a loose orientation constraint of 0.2 and no depth weighting.
3.3 EEG/MEG-adaptive parcellations
We used two CTF-informed image segmentation algorithms (Gonzalez & Woods 2007) to parcellate the cortical surface. In the first approach, starting from standard structural parcellations, we applied a modified split and merge (SaM) algorithm to the CTFs. In the second approach, we start from all brain vertices with no prior parcellation and use CTFs together with a region growing algorithm to create ROIs and a SaM algorithm to modify the created ROIs. A flowchart of different steps is shown in Fig. 2.
3.3.1 Leakage and ROI resolution matrices (RRmat)
As an initial step, we defined an ROI Resolution matrix (RRmat) and used it in addition to the original resolution matrix (R) to quantify the leakage patterns as the building block for the parcellation algorithms. RRmat describes normalised cross-talks among ROIs (rather than vertices for the original resolution matrix R in section 2.1.2). RRmats were computed in the following steps:
First, the unsigned CTFs (i.e. absolute values) of each ROI at all the brain vertices (hereafter referred to as rvCTF) are computed by taking the first principal component of the CTFs of all vertices within that ROI. This yields an Nvx × NROI matrix whose columns quantify the leakage of each ROI at the vertices in the brain. Here, x denotes the xth vertex, Ki is the number of vertices for ROIi, ri are the columns of the resolution matrix R, USTᵀ denotes singular value decomposition, T is the matrix of spatial principal components and T(1) represents the first principal component that explains the maximum variance of the data. This procedure reduces the matrix R of size n × n to R’ of size n × N, where N is the number of ROIs.
Second, we define the ROI Resolution matrix (RRmat), which quantifies the normalised leakage that each ROI receives from all other ROIs, where RRmatij describes leakage from ROIi to ROIj. Here, i and j denote the ith and jth ROIs out of N ROIs in the brain, x denotes vertices inside ROIj and Kj is the number of vertices inside ROIj. Note that the RRmat is normalised so that we can obtain the relative influence of each ROI on any vertex as compared to the rest of the ROIs in the brain. Because the SVD used for the ROI-to-vertex resolution matrix can yield different scales for different ROIs, this final normalisation ensures that RRmat values are limited between zero and one.
As pointed out before, an ideal RRmat is an identity matrix and our purpose is to obtain parcellations for which the similarities between the actual and an ideal RRmat are maximised.
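One possible reading of the two steps above is sketched below. Several details are assumptions rather than statements about the original implementation: the CTFs of the ROI's vertices are stacked as rows before the SVD, the first right singular vector is taken as the spatial principal component, and the normalisation divides each rvCTF value by the sum over all ROIs at that vertex before averaging within the receiving ROI. The function names are hypothetical.

```python
import numpy as np

def roi_vertex_ctf(R, rois):
    """Unsigned ROI-to-vertex CTFs (rvCTF), one column per ROI."""
    columns = []
    for idx in rois:
        # CTFs of the ROI's vertices stacked as rows (K_i x n_vertices)
        _, _, vt = np.linalg.svd(R[idx, :], full_matrices=False)
        columns.append(np.abs(vt[0]))          # first spatial principal component
    return np.column_stack(columns)            # n_vertices x n_rois

def roi_resolution_matrix(rv, rois, eps=1e-12):
    """RRmat[i, j]: mean relative leakage from ROI i at the vertices of ROI j."""
    share = rv / np.maximum(rv.sum(axis=1, keepdims=True), eps)
    rrmat = np.empty((len(rois), len(rois)))
    for j, idx in enumerate(rois):
        rrmat[:, j] = share[idx].mean(axis=0)
    return rrmat

# Toy example: 60 vertices on a line, Gaussian leakage, six contiguous ROIs
pos = np.arange(60)
R = np.exp(-(pos[:, None] - pos[None, :]) ** 2 / (2 * 2.0 ** 2))
rois = [np.arange(k, k + 10) for k in range(0, 60, 10)]
rv = roi_vertex_ctf(R, rois)
rrmat = roi_resolution_matrix(rv, rois)
print(np.round(np.diag(rrmat), 2))
```

With this normalisation every column of the RRmat sums to one, so each entry can be read as the share of the activity estimated in ROI j that is attributable to ROI i.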
3.3.2 A CTF and neuro-anatomy based split-and-merge segmentation algorithm
We examined both the Desikan-Killiany Atlas (68 ROIs) and the Destrieux Atlas (148 ROIs), in order to observe the effect of the initial ROI size. Thereafter, we modified the structural ROIs based on CTFs using an algorithm similar to the split-and-merge algorithm in the digital image processing literature (Haralick & Shapiro 1985; Gonzalez & Woods 2007). Split-and-merge algorithms, e.g. used for image segmentation, typically start from a whole image and utilise an iterative process to divide the image into as many “homogeneous” segments as possible. The homogeneity criterion is defined based on the image properties, such as constant standard deviation inside a segment. If the homogeneity criterion is not satisfied inside a segment, that segment is split into several equal-sized sub-segments and the homogeneity criterion is checked inside each of these new segments; the same procedure is iterated until no further splitting is possible. At this point, the merging procedure starts, in which homogeneous segments are merged using some predefined criterion (e.g. pixel colour or intensity) in an iterative procedure until no more merging is possible.
Here, we have adapted a similar idea together with the limitations imposed by the CTFs to define the split, merge and homogeneity criteria. As described in the theory section, on the one hand, if an ROI is too large it will produce several separate CTF patterns. On the other hand, if CTFs of two ROIs overlap substantially, those ROIs cannot be distinguished using EEG/MEG (Fig. 1b, c). Additionally, if an ROI is located in deeper structures of the brain it is likely showing low sensitivity to the recorded EEG/MEG signals (Fig. 1a). Therefore, as will be elaborated in the following sub-sections, we have defined split, merge and homogeneity criteria to parcellate the cortex into as many distinguishable ROIs as possible by assigning constraints on the CTFs, R and RRmat.
3.3.2.1 Splitting criterion
1. As the first step, we identified the structural ROIs that are too big (e.g. as in Fig. 1b) and split them into sub-ROIs. We used RROIi (Equation 5) in order to determine whether or not a single ROI is producing several distinguishable CTFs. An ROI was split along its longest axis into a number of sub-ROIs equal to the number of principal components that explained 90% of the variance of its RROIi. In order to have a fixed number of sub-ROIs across hemispheres in one subject as well as across subjects in the experiment, we used the following two steps:
1.1 To obtain consistency across hemispheres, if EL eigenvectors were required to explain 90% of the variance of the RROIi of a particular ROI1 in the left hemisphere and ER eigenvectors for the mirror ROI1 in the right hemisphere, the minimum of EL and ER was assigned to both left and right ROI1 in order to assure no over-splitting for smaller ROIs;
1.2 To obtain consistency across subjects, the mode of the number of eigenvalues across subjects (i.e. the number of sub-ROIs that was found for the majority of subjects) was assigned to that ROI.
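The counting rule in steps 1, 1.1 and 1.2 can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the SVD is applied without mean-centring (an assumption), and ties in the mode are resolved in favour of the value encountered first.

```python
import numpy as np
from collections import Counter

def n_components_for_variance(roi_ctfs, threshold=0.9):
    """Number of principal components needed to explain `threshold` of the variance."""
    s = np.linalg.svd(roi_ctfs, compute_uv=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # first index at which the cumulative variance reaches the threshold
    return int(np.searchsorted(explained, threshold) + 1)

def consistent_split_count(left_counts, right_counts):
    """Per-subject minimum across hemispheres, then the mode across subjects."""
    per_subject = [min(l, r) for l, r in zip(left_counts, right_counts)]
    return Counter(per_subject).most_common(1)[0][0]

# Toy CTF matrices with known singular values
two_patterns = np.diag([3.0, 1.0, 0.1])    # first component explains just under 90%
one_pattern = np.diag([10.0, 1.0, 1.0])    # first component explains about 98%
print(n_components_for_variance(two_patterns), n_components_for_variance(one_pattern))

# Four hypothetical subjects, component counts for a left/right ROI pair
n_sub_rois = consistent_split_count([2, 3, 2, 1], [3, 2, 2, 2])
print(n_sub_rois)
```

Taking the minimum across hemispheres is the conservative choice described in step 1.1, since it avoids over-splitting the smaller of the two mirror ROIs.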
3.3.2.2 Homogeneity criterion
2. The second step was to assign the vertices to ROIs. Each of the vertices in the brain was assigned to only one split ROI or no split ROIs. A vertex was assigned to an ROI only if it was: firstly, sensitive to that ROI (sensitivity) and secondly, significantly more sensitive to that ROI compared to all other ROIs in the brain (specificity).
2.1 To satisfy the sensitivity condition, we removed the vertices that were not very sensitive to any ROIs: For every vertex, we tested for every ROI whether the ROI’s rvCTF value at this vertex was equal to or greater than half of the maximum of the ROI’s rvCTF values anywhere in the brain. If this was the case, that vertex was considered sensitive to that ROI. If a vertex was not sensitive to any ROIs in the brain it was removed from further analysis.
2.2 To evaluate the specificity criterion, the values of the CTFs of ROIs at each vertex were converted to z-scores:
$$z_{ix} = \frac{rvCTF_{ix} - \frac{1}{N}\sum_{j=1}^{N} rvCTF_{jx}}{\sigma_{rvCTF_x}}$$
where x denotes a single vertex in the brain, rvCTFix is the value of the CTF of the ith ROI at vertex x, N is the number of ROIs in the brain and σrvCTFx is the standard deviation across rvCTF values from all ROIs at vertex x. Based on these z-scores, we classified vertices into one of three categories:
2.2.1 Declined vertices: If no ROIs showed a z-score above 3 for a vertex, it indicated that the vertex was equally influencing several ROIs and hence was not specifically sensitive to the rvCTFs of any of the ROIs. These vertices were removed from further analysis.
2.2.2 Assigned vertices: Using a winner-takes-all approach, if an ROI indicated the highest z-score above 3 for a vertex and the z-score was at least 1 standard deviation higher than the runner-up ROI, that vertex was assigned to the winner ROI.
2.2.3 Merge candidate vertices: If the difference between the z-scores of the winner and runner-up ROIs for a vertex was less than one standard deviation, that vertex was marked for the merging procedure (see sub-section 3.3.2.3 below).
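The sensitivity and specificity rules of step 2 can be sketched as a single classification function. This is an illustrative reading, not the original code: the label codes `DECLINED` and `MERGE`, the function name and the requirement that an assigned vertex be sensitive to its winner ROI specifically are assumptions. Note that with N ROIs the largest possible z-score is sqrt(N − 1), so a threshold of 3 can only be reached with more than 10 ROIs.

```python
import numpy as np

DECLINED, MERGE = -1, -2   # hypothetical label codes for this sketch

def classify_vertices(rv, z_min=3.0, margin=1.0):
    """Label each vertex with a winner ROI index, DECLINED or MERGE.

    rv: rvCTF matrix of shape (n_vertices, n_rois).
    """
    rv = np.asarray(rv, dtype=float)
    rows = np.arange(rv.shape[0])
    # Sensitivity: value reaches half of that ROI's maximum anywhere in the brain
    sensitive = rv >= 0.5 * rv.max(axis=0, keepdims=True)
    # Specificity: z-score across ROIs at each vertex
    sd = rv.std(axis=1, keepdims=True)
    z = (rv - rv.mean(axis=1, keepdims=True)) / np.where(sd > 0, sd, 1.0)
    order = np.argsort(z, axis=1)
    winner, runner_up = order[:, -1], order[:, -2]
    z_win, z_run = z[rows, winner], z[rows, runner_up]
    specific = sensitive.any(axis=1) & (z_win > z_min)
    labels = np.full(rv.shape[0], DECLINED)
    assigned = specific & sensitive[rows, winner] & (z_win - z_run >= margin)
    merge = specific & (z_win - z_run < margin)
    labels[assigned] = winner[assigned]
    labels[merge] = MERGE
    return labels, winner, runner_up

# Toy rvCTF matrix: 4 vertices, 30 ROIs
rv = np.full((4, 30), 0.01)
rv[0, 0] = 1.0               # vertex 0: one clearly dominant ROI
rv[1, 1] = rv[1, 2] = 1.0    # vertex 1: two equally dominant ROIs
rv[3] = 0.001                # vertex 3: weak everywhere ...
rv[3, 0] = 0.3               # ... and below half maximum for ROI 0
labels, winner, runner_up = classify_vertices(rv)
print(labels)
```

In the toy matrix, vertex 2 is declined because no ROI stands out at that location, and vertex 3 is declined because it fails the half-maximum sensitivity test for every ROI.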
3.3.2.3 Merging criterion
3. Based on the above condition for merge candidate vertices, a group of vertices that showed sensitivity to two particular ROIs was clustered together as a new “merged” ROI. New merged ROIs that were at least as large as the smallest original split ROI in the brain were kept in the “merged ROIs” list for further analysis; smaller ones were removed. For example, vertices that were equally sensitive to both superior temporal and middle temporal gyri were clustered as a new ROI superior-temporal_middle-temporal. The merging of vertices can result from two scenarios: first, if two original split ROIs are too finely separated and not distinguishable using EEG/MEG (e.g. like the ROIs in Fig. 1c), they will completely merge together. Second, if some ROIs partially overlap, a third region may emerge from that overlapping region.
These split, homogeneity and merging procedures yielded a modified ROI list consisting of the original “split ROIs” and the new “merged ROIs”.
3.3.2.4 Final homogeneity evaluation
4. Step 2 described above was repeated for the modified list of the split and merged ROIs and ROIs that could win at least 10 vertices in the brain were kept and the rest of ROIs were removed.
5. The RRmat was computed for the final modified ROIs and if any off-diagonal elements for a particular ROI were higher than the diagonal element, that ROI was removed.
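Steps 4 and 5 amount to two pruning rules, sketched below with hypothetical function names. Which axis of the RRmat is compared against the diagonal is an assumption here: the sketch compares each ROI's diagonal element with the leakage that ROI receives (its column), and it does not recompute the RRmat between the two steps.

```python
import numpy as np

def rois_with_enough_vertices(labels, n_rois, min_vertices=10):
    """True for ROIs that won at least `min_vertices` vertices (negative labels are ignored)."""
    counts = np.bincount(labels[labels >= 0], minlength=n_rois)
    return counts >= min_vertices

def rois_dominated_by_self(rrmat):
    """True for ROIs whose diagonal element is not exceeded by any leakage they receive."""
    off_diag = rrmat - np.diag(np.diag(rrmat))   # zero out the diagonal
    return np.diag(rrmat) >= off_diag.max(axis=0)

# Toy example with three ROIs
labels = np.array([0] * 12 + [1] * 3 + [2] * 10 + [-1] * 5)
rrmat = np.array([[0.6, 0.5, 0.1],
                  [0.3, 0.4, 0.2],
                  [0.1, 0.1, 0.7]])
keep_size = rois_with_enough_vertices(labels, n_rois=3)
keep_self = rois_dominated_by_self(rrmat)
print(keep_size, keep_self)
```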
3.3.2.5 Inter-hemispheric and Inter-subject consistency
6. To create a consistent parcellation across hemispheres and subjects for the group analysis, we applied the following criteria:
6.1 For consistency across subjects, at each of the aforementioned steps, the CTF-based ROI modification was performed in an average source space (fsaverage brain in Freesurfer) and on an average of the rvCTF maps across subjects. To obtain such average rvCTF maps, rvCTFs were computed in the individual source spaces, morphed to fsaverage and averaged over subjects.
6.2 To obtain a consistent parcellation across hemispheres, those ROIs that survived the above criteria in only one hemisphere were removed. Moreover, even though all the procedures were performed in both hemispheres, in order to obtain a symmetrical parcellation, ROIs were kept in the hemisphere that provided a larger number of vertices and mirrored to the opposite hemisphere.
3.3.3 A CTF based region growing segmentation algorithm for the parcellation
Region growing is another algorithm of image segmentation which typically starts by randomly selecting a voxel (pixel) as the first “seed” in an image. Then, based on a pre-specified similarity criterion (e.g. colour or intensity), neighbouring voxels are grouped together with the seed voxel, leading to a growing region around the seed until no more voxels can satisfy the similarity criterion to connect to the cluster (Gonzalez & Woods 2007). Thereafter, a new seed outside the existing cluster is randomly selected in the image and the same procedure will be iterated until all the voxels in the image are assigned to a cluster. In this section, we have adopted a similar idea and have used CTFs to define the similarity criterion to grow regions around the vertices to create and modify ROIs in the brain. Therefore, we started the parcellation at the single-vertex level with no prior ROIs and created ROIs using the following steps:
3.3.3.1 Finding seed vertices
1. The main purpose of the first step was to identify the “seed vertices”, i.e. vertices that show high sensitivity based on the CTFs. Therefore:
1.1. The resolution matrix of the whole brain vertices was computed (section 2.1.2) with rows representing CTFs received at each vertex.
1.2. Sensitivity and specificity steps described in section 3.3.2.2 were applied to the rows of the resolution matrix to find the sensitivity of each vertex to leakage from all other vertices. Those vertices that could “win” more than one vertex were marked as seeds (i.e. highest z-score>3 and at least one standard deviation more than the runner up; see 3.3.2.2 for details).
3.3.3.2 Growing regions surrounding the seeds
2. The second step consisted of growing regions around the seeds. For this purpose, we sorted the seeds in descending order, with the first seed being the “strongest”, and created regions in succession following this order.
2.1. Seeds were sorted based on their sensitivity to themselves; i.e. the strongest seed (seed 1) was the seed with the highest sensitivity to itself (i.e. the highest z-score; see section 3.3.2.2).
2.2. All vertices that showed sensitivity to seed 1 (i.e. produced higher cross-talks in seed 1 than half maximum of the CTF values of this seed) were clustered together as ROI1. Next, ROI2 was created from the vertices outside ROI1 with the same half maximum criterion and the same procedure was iterated for all other seeds.
2.3. To obtain an inter-hemispheric symmetry of the ROIs, the created ROIs of the hemisphere with more winner seeds were mirrored to the opposite hemisphere using MNI coordinates.
3.3.3.3 Modifying the ROIs
3. The same procedures as those described in 3.3.2 (except for the splitting step) were applied to the ROIs created by the region-growing (RG) algorithm to obtain the final RG parcellation.
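Steps 1 and 2 of this procedure can be illustrated with a minimal sketch. It is a simplified reading of the criteria rather than the implementation used in this study: the function names, the row-wise z-scoring and the handling of constant rows are assumptions made for illustration, and the full sensitivity and specificity steps of section 3.3.2.2, the hemispheric mirroring and the ROI modification step are omitted.

```python
import numpy as np

def find_seeds(R, z_min=3.0, margin=1.0):
    """Return seed vertices, strongest first (hypothetical helper).

    R is a resolution matrix whose row i holds the cross-talk that
    vertex i receives from every vertex.
    """
    A = np.abs(R)
    sd = A.std(axis=1, keepdims=True)
    # z-score each row; constant rows get z = 0 instead of NaN (assumption)
    Z = (A - A.mean(axis=1, keepdims=True)) / np.where(sd > 0, sd, 1.0)
    order = np.argsort(Z, axis=1)
    rows = np.arange(A.shape[0])
    winner = order[:, -1]               # vertex with the highest z-score in each row
    top = Z[rows, winner]
    runner_up = Z[rows, order[:, -2]]
    # a win requires z > z_min and a lead of at least `margin` SDs over the runner-up
    valid = (top > z_min) & (top - runner_up >= margin)
    wins = np.bincount(winner[valid], minlength=A.shape[0])
    seeds = np.flatnonzero(wins > 1)    # seeds "win" more than one vertex
    # strongest seed first: highest sensitivity to itself
    return seeds[np.argsort(-Z[seeds, seeds])]

def grow_regions(R, seeds):
    """Assign vertices to ROIs with the half-maximum criterion (hypothetical helper)."""
    A = np.abs(R)
    labels = np.full(A.shape[0], -1)    # -1 marks vertices outside all ROIs
    for roi, s in enumerate(seeds):
        # vertices producing more than half of the maximum cross-talk in seed s,
        # excluding vertices already claimed by a stronger seed
        members = (A[s] > 0.5 * A[s].max()) & (labels == -1)
        labels[members] = roi
    return labels
```

Because seeds are processed from strongest to weakest, a vertex that satisfies the half-maximum criterion for several seeds is assigned to the strongest one, which mirrors the successive creation of ROI1, ROI2 and so on.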
3.3.4 Parcellation performance indices
We used RRmats to evaluate the performance of different original and modified parcellations. As explained earlier, the RRmat is computed by finding the normalised CTF values produced by each ROI at the location of all other ROIs. If a parcellation consists of fully distinguishable ROIs, the RRmat should be an identity matrix. Here we introduce two indices to evaluate a parcellation’s performance:
First, the Sensitivity Index (Sind), which measures the sensitivity of ROIs to themselves by taking the mean of the diagonal elements of the RRmat:
$$S_{ind} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{RRmat}_{ii}$$
where N is the number of ROIs in the parcellation.
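Second, the Distinguishability Index (Dind), which is the correlation between the actual RRmat and the identity matrix of the same size:
$$D_{ind} = \frac{\sum_{i,j}\left(\mathrm{RRmat}_{ij}-\overline{\mathrm{RRmat}}\right)\left(I_{ij}-\bar{I}\right)}{\sqrt{\sum_{i,j}\left(\mathrm{RRmat}_{ij}-\overline{\mathrm{RRmat}}\right)^{2}\,\sum_{i,j}\left(I_{ij}-\bar{I}\right)^{2}}}$$
where the overbar denotes the average of matrix elements and I is the identity matrix.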
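As an illustration, both indices can be computed from an RRmat in a few lines. This is a minimal sketch assuming the RRmat is available as a square NumPy array; the function names are hypothetical, and the correlation is taken as the Pearson correlation over all matrix elements.

```python
import numpy as np

def sensitivity_index(rrmat):
    """Sind: mean of the diagonal elements of the RRmat."""
    return float(np.mean(np.diag(rrmat)))

def distinguishability_index(rrmat):
    """Dind: correlation between the RRmat and an identity matrix of the same size."""
    identity = np.eye(rrmat.shape[0])
    # Pearson correlation over all matrix elements (assumed definition)
    return float(np.corrcoef(rrmat.ravel(), identity.ravel())[0, 1])
```

For a parcellation with fully distinguishable ROIs (an identity RRmat), both indices equal 1; leakage between ROIs lowers the diagonal relative to the off-diagonal elements and reduces both values.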
Furthermore, we computed the rank and condition numbers of RRmats to make comparisons between the original structural and modified parcellations. As discussed in the theory section, the leakage problem arises from ill-posedness of the resolution matrix. This results in a calculated rank for the resolution matrix that is notably less than the ideal rank which is the size of the matrix. Hence the number of degrees of freedom is smaller than the number of rows/columns. Considering that RRmat is scaled between 0 and 1, we computed the rank with a tolerance of 0.05 so that if the element-wise difference between the target row and linear combination of other rows is less than 0.05, it is rounded down to 0. A high condition number is indicative of an ill-conditioned ROI resolution matrix, i.e. the estimated sources (output) can be very sensitive to small changes in the actual sources (input). A high condition number indicates that if the RRmat was to be inverted (e.g. to perform leakage correction based on the final RRmat) the results will be unreliable. Additionally, for each parcellation we computed the coverage which is the total number of vertices that are included in the parcellation.
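The rank and condition number can be obtained with standard linear algebra routines, as in the sketch below. Note the assumption: NumPy's `matrix_rank` applies the tolerance to the singular values of the matrix, which may differ in detail from the element-wise description given above. The function name is hypothetical.

```python
import numpy as np

def rank_and_condition(rrmat, tol=0.05):
    """Numerical rank (with tolerance) and 2-norm condition number of an RRmat."""
    # singular values below `tol` are treated as zero (assumed tolerance rule)
    rank = int(np.linalg.matrix_rank(rrmat, tol=tol))
    # ratio of the largest to the smallest singular value
    cond = float(np.linalg.cond(rrmat))
    return rank, cond
```

An identity RRmat has full rank and a condition number of 1, whereas two ROIs with identical rows reduce the rank by one and drive the condition number towards infinity.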
3.4 Simulation with realistic levels of noise
Here, we demonstrate the possible consequences of using different parcellations for graph-theoretical connectivity analyses on simulated data with realistic levels of noise. For this purpose, we simulated a known brain network and tested the performance of different structural and modified parcellations in accurately detecting the hubs and hub connectivity patterns. All simulations were performed in Python, and where appropriate (e.g. forward and inverse modelling), we used the MNE-Python software package.
3.4.1 Simulated signals in source space
Sinusoidal signals with signal-to-baseline ratios of 3 were simulated in three areas: Rostral Middle Frontal (RMF) cortex, Superior Temporal Sulcus (STS) and Lateral Occipital Cortex (LOC) (ROIs and corresponding CTFs in Fig. B.1 (Appendix B)). The locations of simulated sources were chosen such that they did not favour one parcellation over the others; thus, sources did not exactly coincide with ROIs in any parcellation. Therefore, depending on the parcellation, one or more ROIs might be required to cover the simulated hub location, and the expected number of hubs varies among different parcellations. For example, while the STS source might project to two ROIs in a parcellation with a coarser spatial resolution, it can overlap with three regions in a more fine-grained parcellation. It is worth noting that accurate detection of true hubs in this simulated scenario was difficult and required a high sensitivity of the parcellation ROIs, since the hub regions are adjacent and difficult to tease apart (i.e. we have a source of activity which might be partially covered by each of several ROIs). Additionally, avoiding false alarms requires a high specificity of the parcellation. Fifty epochs were simulated, each consisting of 125ms of noise baseline followed by 600ms of signal. The LOC signal was a sine wave with 1nA amplitude and 6 full cycles in 600ms (10Hz), and the RMF signal was a sine wave with 12 full cycles in 600ms (20Hz). Both signals had random phase across epochs, and thus there was no amplitude or phase coupling between LOC and RMF. We introduced non-zero-lag connectivity between STS and RMF/LOC by modelling the STS signal as the sum of the time-shifted signals of RMF and LOC at each epoch. Therefore, high connectivity is expected between STS/RMF and STS/LOC pairs, but low connectivity between the RMF/LOC pair. All other vertices in the brain were given random Gaussian noise with mean and variance equal to that of the sine signal in RMF.
Therefore, the overall signal-to-noise ratio (SNR) of the evoked responses in the simulated sensor space was 4.34±0.1, which is typical of EEG/MEG evoked responses (Gonzalez-Moreno et al. 2014; Hu et al. 2010). The aforementioned signals were simulated in the brain in the following two scenarios:
Leakage Free (LF): The true simulated sources in the brain were analysed directly, without the application of forward and inverse operators. All signals were simulated in the single subject source space and morphed to the fsaverage space in Freesurfer for further analysis.
Leakage Present (LP): Sources were simulated in the single subject source space, projected into the sensor space and projected back to the source space using the forward and inverse operators respectively (described in 3.2). These source estimates were morphed to the fsaverage space in Freesurfer for further analysis.
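The three source time courses described in section 3.4.1 can be sketched as follows. Several values are assumptions made purely for illustration because they are not specified in the text: the sampling rate (1000 Hz), the time shift of the STS signal (10 ms), an RMF amplitude equal to that of LOC, and a baseline noise standard deviation of one third of the signal amplitude. The function name is hypothetical, and the noise assigned to all other vertices as well as the forward and inverse projections are omitted.

```python
import numpy as np

def simulate_epochs(n_epochs=50, sfreq=1000.0, t_base=0.125, t_sig=0.6,
                    amp=1e-9, shift=0.010, seed=0):
    """Return an array (n_epochs, 3, n_times) with rows LOC, RMF and STS."""
    rng = np.random.default_rng(seed)
    n_base = int(round(t_base * sfreq))
    n_sig = int(round(t_sig * sfreq))
    t = np.arange(n_sig) / sfreq
    data = np.zeros((n_epochs, 3, n_base + n_sig))
    for ep in range(n_epochs):
        # random phase per epoch: no systematic coupling between LOC and RMF
        ph_loc, ph_rmf = rng.uniform(0.0, 2.0 * np.pi, size=2)
        loc = amp * np.sin(2 * np.pi * 10 * t + ph_loc)   # 6 cycles in 600 ms
        rmf = amp * np.sin(2 * np.pi * 20 * t + ph_rmf)   # 12 cycles in 600 ms
        # STS: sum of the time-shifted LOC and RMF signals (assumed 10 ms delay)
        sts = (amp * np.sin(2 * np.pi * 10 * (t - shift) + ph_loc)
               + amp * np.sin(2 * np.pi * 20 * (t - shift) + ph_rmf))
        # baseline noise level is an assumption, not taken from the text
        data[ep, :, :n_base] = rng.normal(0.0, amp / 3.0, size=(3, n_base))
        data[ep, :, n_base:] = np.vstack([loc, rmf, sts])
    return data
```

In the Leakage Free scenario such signals would be analysed directly, whereas in the Leakage Present scenario they would first pass through the forward and inverse operators.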
3.4.2 Connectivity measures
We used Magnitude-Squared Coherence (COH) and imaginary part of Coherency (imCOH) as two measures of connectivity to evaluate the performance of the parcellation methods for detecting whole brain networks in LF and LP scenarios described above. COH and imCOH are spectral measures of connectivity which can detect both amplitude and phase couplings (Greenblatt et al. 2012; Bastos & Schoffelen 2016). We used a multitaper approach with adaptive weights to compute the two measures in a band limited signal of 5-35Hz. COH is sensitive to zero-lag connections while imCOH is not (Nolte et al. 2004; Bastos et al. 2012). We used imCOH as well as COH to evaluate the consequences of the theoretical issue discussed in 2.1.4 and whether or not an EEG/MEG-adaptive parcellation is only needed when a measure susceptible to the zero-lag connectivity is used.
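The difference between the two measures can be illustrated with a minimal sketch. It deliberately swaps the multitaper estimator with adaptive weights for a simpler Hann-windowed FFT estimate averaged across epochs, so it is an assumption-laden illustration and not the analysis pipeline of this study; the function name is hypothetical.

```python
import numpy as np

def coh_and_imcoh(x, y, sfreq):
    """COH and |imCOH| spectra for two signals of shape (n_epochs, n_times)."""
    win = np.hanning(x.shape[1])
    X = np.fft.rfft(x * win, axis=1)
    Y = np.fft.rfft(y * win, axis=1)
    # cross- and auto-spectra averaged across epochs
    sxy = np.mean(X * np.conj(Y), axis=0)
    sxx = np.mean(np.abs(X) ** 2, axis=0)
    syy = np.mean(np.abs(Y) ** 2, axis=0)
    coherency = sxy / np.sqrt(sxx * syy)
    freqs = np.fft.rfftfreq(x.shape[1], d=1.0 / sfreq)
    coh = np.abs(coherency) ** 2          # magnitude-squared coherence
    imcoh = np.abs(np.imag(coherency))    # imaginary part of coherency
    return freqs, coh, imcoh
```

For two signals that share a 10Hz component at zero lag, COH at 10Hz is close to 1 while imCOH is close to 0; with a quarter-cycle lag, both are close to 1.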
3.4.3 Graph theoretical analysis
We used measures of graph theory to summarise the results of the whole brain connectivity analysis. Simulated data are expected to yield hub(s) in the STS that are connected to the RMF and LOC. We used the node degree as a measure of hubness and, in order to determine the degree, the connectivity matrix was thresholded and binarised. To obtain a choice of threshold that is generalisable, we computed the ratio of each node degree to the average node degree of the whole network for a series of thresholds, yielding a matrix that encapsulated the relative importance of a node in the network. We checked that the relative importance remained constant over a range of thresholds in order to have a generalisable threshold. We considered the coherence in the leakage free scenario as the ground truth to obtain the maximum number of connections that can detect the hubs with no misses or false alarms for all the parcellations. This resulted in a threshold defined by the top 6% of the connections that was determined based on the original Desikan-Killiany atlas and yielded accurate hub connectivity maps for all the parcellations except for the original Destrieux atlas, where an increase to 9% was required. In the binarised graph that was obtained after thresholding, ROIs (nodes) that showed node degrees (non-zero connections/edges) of 2 standard deviations above the average in an average graph across subjects were marked as hubs. The hub connectivity probability matrix (HCPmat) was then computed from these thresholded matrices to identify the k edges that are most probably linked to each hub, where k is the average hub degree. For example, if the estimated node degree for node A is 3.8, 4 or 4.2, the 4 most probable connections of the node will be kept in the HCPmat. The thresholding and hub detection procedure followed previous approaches in the literature (cf. Kaiser 2011; Achard et al. 2006; Buckner et al. 2009).
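The thresholding and hub detection steps can be sketched for a single connectivity matrix as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the connectivity matrix is assumed to be symmetric, and the averaging across subjects and the HCPmat computation are omitted.

```python
import numpy as np

def binarise_top(conn, proportion=0.06):
    """Keep the strongest `proportion` of connections of a symmetric matrix."""
    n = conn.shape[0]
    iu = np.triu_indices(n, k=1)            # each undirected edge counted once
    vals = conn[iu]
    k = max(1, int(round(proportion * vals.size)))
    cutoff = np.sort(vals)[-k]              # weakest connection that is kept
    keep = vals >= cutoff
    adj = np.zeros((n, n), dtype=int)
    adj[iu[0][keep], iu[1][keep]] = 1
    return adj + adj.T                      # symmetric binary graph

def find_hubs(adj, n_sd=2.0):
    """Nodes whose degree exceeds the mean degree by more than n_sd SDs."""
    degree = adj.sum(axis=1)
    hubs = np.flatnonzero(degree > degree.mean() + n_sd * degree.std())
    return hubs, degree
```

With `proportion=0.06` this corresponds to the top 6% threshold used for most parcellations; passing `proportion=0.09` would correspond to the value required for the original Destrieux atlas.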
4 Results
4.1 Parcellation results
4.1.1 Split-and-Merge algorithm (SaM)
We tested the split-and-merge (SaM) algorithm (section 3.3.2) on two standard structural parcellations in Freesurfer: Desikan-Killiany and Destrieux Atlases that are shown in Figure 3a, c with the corresponding ROI Resolution Matrices (RRmat: relative between-ROI leakage values, see 3.3.1) shown in Figure 3b, d, respectively.
4.1.1.1 Desikan-Killiany Atlas
The original Desikan-Killiany Atlas included 68 ROIs with sensitivity index Sind of 0.47 (i.e. the leakage value that each ROI received from itself relative to the rest of the ROIs in the brain) and distinguishability Dind of 0.50 (i.e. correlation between the RRmat and an ideal identity matrix) (Table 1). The SaM algorithm resulted in 316 ROIs at the intermediate step (Fig. B.2a, b; Appendix B), from which 74 regions survived to the final parcellation that is shown in Figure 4a together with the corresponding RRmat. Compared to the original parcellation, Sind and Dind increased by 38% and 22% and reached 0.65 and 0.61 respectively (Table 1) and provided a sparser sampling of the cortex including 4079 vertices.
4.1.1.2 Destrieux Atlas
The original Destrieux Atlas consists of 148 ROIs and is shown in Figure 3c with RRmat in Fig. 3d. In comparison to the Desikan-Killiany parcellation, the RRmat of this parcellation shows less similarity with an identity matrix, indicating a more blurred estimation of activity for each of the ROIs (Table 1). This difference suggests that the original Desikan-Killiany is a better match to the EEG/MEG spatial resolution than Destrieux. Sind and Dind of Destrieux Atlas were 0.37 and 0.38, respectively, and improved to 0.7 and 0.65 for the 74 ROIs that survived the parcellation modification, providing an 89% and 71% improvement in these indices, respectively. The parcellation covered 3084 vertices of the cortical surface. The intermediate and final parcellation/RRmat for the modified Destrieux Atlas are shown in Fig. B.2c, d and Fig. 4b respectively. Comparison to Fig. 3d, as reflected in increased Sind and Dind values above, shows a clear improvement. Note that in Fig. 4b, ROIs that showed maximum overlap with each of the modified ROIs from the Desikan-Killiany are colour-matched to Fig. 4a for visual comparison.
Although the Destrieux atlas has roughly twice as many initial ROIs as the Desikan-Killiany atlas, the SaM algorithm converged at 74 ROIs for both atlases. This can be considered an indicator of the robustness of the parcellation algorithm against the initial choice of parcellation.
4.1.2 Region Growing algorithm (RG)
The Region Growing Algorithm does not require an anatomical parcellation as a starting point, but creates a parcellation based on the resolution properties of all the vertices. The first step of the RG algorithm identified 174 seed vertices (Fig. B.2e) in the left hemisphere, and ROIs were grown surrounding each of these seeds using the criteria described in 3.3.3. The split and merge criteria were applied to these created ROIs and resulted in a 70-ROI parcellation with Sind of 0.7, Dind of 0.64 and a sparse sampling of the whole cortex, covering 3086 out of 20484 vertices in the brain (Table 1). The final parcellation showed notable similarities and differences to the parcellation modification of the structural atlases (Fig. 4c). A direct comparison of the overlaps and differences of the final parcellations is presented in section 4.2.
These results demonstrate that our algorithms improve sensitivity and specificity of the original structural parcellations. In the following, we will analyse features of our algorithms in more detail.
4.2 Effect of initial choice of parcellation
As can be seen in Fig. 4, some of the final ROIs, particularly in the occipital, temporal and frontal lobes show overlaps across the three parcellations, while other regions in the central and parietal lobes can vary notably. All final parcellations in Fig. 4 are colour-matched to the first parcellation (modified Desikan-Killiany parcellation). To obtain a more direct comparison between the ROIs, we computed the overlaps, normalised by the sizes of ROIs (Fig. 5). More specifically, we took the modified Desikan-Killiany parcellation as the reference and found the overlaps between the colour-matched ROIs in Fig. 4. Rows of the matrices in Fig. 5 illustrate the overlaps between each of the ROIs of the parcellation on the y-axis (Py) with all the ROIs of the parcellation on the x-axis (Px: always modified Desikan-Killiany), which is normalised by the size of that ROI of Py. Therefore, if there is only one yellow/white column corresponding to each row, it shows a one-to-one correspondence between the two intersecting ROIs while several red/orange columns intersecting with each row show that one ROI in Py is overlapping with several regions in Px. If one row consists of only dark colours, that ROI in Py is not overlapping with any ROI in Px. As can be seen in Fig. 5, we found that a majority of ROIs show a one-to-one correspondence between the final parcellations, with different degrees of overlaps. However, there are also several cases where an ROI in one parcellation overlaps with a few ROIs or cases where an ROI does not have any matches in another parcellation.
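The normalised overlap matrices can be computed as in the sketch below, assuming each ROI is represented by an array of vertex indices in a common source space. The function name and the input representation are hypothetical.

```python
import numpy as np

def overlap_matrix(rois_y, rois_x):
    """Overlap of each ROI in Py with each ROI in Px, normalised by the Py ROI size.

    rois_y, rois_x: sequences of 1-D integer arrays of vertex indices.
    """
    out = np.zeros((len(rois_y), len(rois_x)))
    for i, roi_y in enumerate(rois_y):
        for j, roi_x in enumerate(rois_x):
            shared = np.intersect1d(roi_y, roi_x).size
            out[i, j] = shared / len(roi_y)   # fraction of the Py ROI covered
    return out
```

A row with a single value close to 1 indicates a one-to-one correspondence, several intermediate values in one row indicate that an ROI in Py overlaps with several ROIs in Px, and a row of zeros indicates an ROI without any match.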
4.2.1 Rank and condition number of final RRmats and implications
Here we compared the rank and condition numbers of RRmats for the original and modified parcellations. The resolution matrix, as expected, was highly ill-conditioned and while the ideal rank was 20484 in our study, the calculated rank was only 118. Parcellations (structural or modified) downsampled the source space to a few hundred ROIs or fewer and thus improved the rank relative to the matrix size. We found a rank of 49 (ideal 68) and 92 (ideal 146) for the Desikan-Killiany and Destrieux atlases respectively, which, in spite of showing an improvement compared to the original source space, are still not full-rank. In contrast, the modified parcellations showed near-perfect performance, where we found ranks of 73 (ideal 74), 74 (ideal 74) and 70 (ideal 70) for the modified Desikan-Killiany, Destrieux and RG parcellations respectively. Even though a full-rank matrix guarantees independence between the ROI signals in the modified parcellations, the output might still be very sensitive to small changes in the input; hence a small condition number is desired. The condition numbers for the Desikan-Killiany and Destrieux atlases were 1.26×10³ and 1.78×10⁴, which were significantly improved to 114.38, 70.82 and 91.59 for the modified Desikan-Killiany, Destrieux and RG parcellations respectively. However, it is worth noting that condition numbers around 100 in the modified parcellations are still high and invite other complementary approaches to be used together with the EEG/MEG-adaptive parcellations. Some of these approaches will be discussed later.
4.3 Simulation results
We applied different parcellations to simulated data with known connectivity structure and used coherence (COH) and imaginary coherence (imCOH) to compute connectivity among all the ROIs in the parcellation. We simulated data with realistic levels of noise (evoked SNR ≈ 4) and a hub region in the superior temporal sulcus (STS) with connections to ROIs in rostral middle frontal (RMF) cortex and lateral occipital cortex (LOC), all in the left hemisphere. The expected numbers of hubs for each parcellation are listed in Table 2.
4.3.1 Hub detection accuracies
We used binarised graphs to detect hubs and tuned the thresholding (section 3.4.3) so that coherence in the presence of noise and in the absence of leakage (LF) could identify the hubs with no misses or false alarms (FA). Results are summarised in Table 2, where the “hubs” column shows the ground truth number of hubs corresponding to each parcellation.
Coherence: In the presence of leakage and noise scenario for original Desikan-Killiany atlas, coherence yielded 1 hit out of 2 hubs (50% miss) and showed 1 FA (Fig. 6a) while the modified version of this parcellation had no misses and 1 FA (Figure 6c) in the same scenario. For the original Destrieux Atlas, we found 3 hits out of 4 (25% miss) in the presence of leakage (Fig. 6b), and 1 FA which was improved to no misses and 1 FA after modification (Fig. 6d). RG, like other modified parcellations, showed no misses and 1 FA (Fig. 6e).
imCOH: In the absence of leakage, imCOH was less accurate than coherence. We found that even though imCOH shows no misses, FAs are likely for the original parcellations. In the presence of leakage, imCOH showed no misses and 2 FAs for the original Desikan-Killiany atlas, and no misses and 4 FAs for the modified version (i.e. no improvement in hub detection but improvement in hub connectivity patterns described below) (Fig. 6a, c). Moreover, it showed 1 miss and 4 FAs for the original Destrieux Atlas, which improved to 0 misses and 2 FAs in the modified version (Fig. 6b, d). RG showed no misses or FAs with imCOH (Fig. 6e). imCOH improved sensitivity to true hubs but increased false alarms notably for all the parcellations except for the modified RG approach (Table 2).
4.3.2 Hub connectivity patterns
The hub connectivity patterns were summarised using HCPmats as described in 3.4.3. HCPmats for the structural and modified parcellations are shown in Fig. 6. These figures show the k most probable ROIs that are connected to each hub where k is the estimated hub degree and the colour of each connection shows the probability of that connection being present in the binarised connectivity matrices across subjects (brighter colours show higher probability). We computed the 2-dimensional correlation coefficient between these HCPmats and an ideal HCPmat in no-leakage and no noise scenario (ground truth) for each parcellation. The ground truth is shown in the left-most panels of Fig. 6. The ideal parcellation should retrieve these connectivity patterns for each parcellation.
Coherence: Firstly, we found that for the coherence and in the presence of leakage, Desikan-Killiany and Destrieux parcellations showed 0.28 and 0.31 correlation to the ideal HCPmat and these values increased to 0.47, 0.49 and 0.49 for the modified structural parcellations and RG approaches, respectively.
imCOH: Secondly, we found that imCOH showed less correlation to the ideal scenario, both in the presence and in the absence of leakage. In the presence of leakage, imCOH showed 0.23 and 0.21 correlation with the ideal HCPmat for the original structural parcellations, and these values improved to 0.33, 0.43 and 0.53 for the modified parcellations (detailed summary in Table 2).
5 Discussion
We used cross-talk functions (CTFs), which describe the spatial resolution of linear or linearly constrained distributed source models, to create EEG/MEG-adaptive parcellations of the cortex as a basis for connectivity and graph theory analysis of EEG/MEG data in source space. We implemented two algorithms inspired by the image processing and clustering literature, split-and-merge (SaM) and region growing (RG), which differed with respect to the starting points of the parcellation process. For SaM, we started from two different standard anatomical parcellations with different average sizes of ROIs (Desikan-Killiany (Desikan et al. 2006) and Destrieux (Destrieux et al. 2010) Atlases) and modified the ROIs using a CTF-informed split-and-merge algorithm. For RG, we started with no prior parcellation and created a parcellation using a combination of RG and SaM algorithms. We used metrics for distinguishability and sensitivity based on ROI resolution matrices (RRmats) to quantify the performance of different parcellations, using a data set consisting of combined EEG and MEG measurements.
All three analyses yielded approximately 70 distinguishable ROIs in the brain, suggesting that this reflects the general resolution limits of the utilised measurement configuration and source estimation methods. All approaches provided a sparse sampling of the cortex, and significantly improved the parcellation performance compared to the structural parcellations with respect to sensitivity and distinguishability of ROIs, while at the same time maximising the number of distinguishable ROIs in the brain. In a simulated connectivity example, we illustrated that the choice of parcellation can have significant impact on the outcome of graph theoretical analysis of EEG/MEG data in source space.
5.1 Adaptive parcellations for the spatial limitations of EEG/MEG
EEG/MEG studies typically adopt structural or fMRI-based functional parcellations. For example, Hillebrand et al. (2012) used the Talairach Daemon Database for the parcellation of the brain, Colclough et al. (2015, 2016) used the Harvard-Oxford structural parcellation and ICA-based fMRI parcellation, while several other studies have used the Automatic Anatomical Labelling (AAL) atlas (Tewarie et al. 2014; Tewarie et al. 2016; Brookes et al. 2016). Nevertheless, as described in the theory section, structural ROIs are unlikely to be optimal for EEG/MEG analysis. Palva et al. (2010) presented the first study to use an EEG/MEG-informed parcellation for single subject connectivity analysis of the whole brain. They utilised the forward and inverse modelling of simulated noise in source space and clustered 365 (i.e. equal to the number of sensors) patches on the cortex that showed high within-patch phase synchrony. Korhonen et al. (2014) introduced sparse weights to collapse the source space, based on the forward and inverse modelling of simulated noise in the source space, so that vertex selection is optimised for a fixed set of predefined structural ROIs, which is suitable for group as well as single subject analysis. However, obtaining a parcellation that can optimise both parcellation resolution (i.e. number of ROIs in a parcellation) and vertex selection with respect to EEG/MEG spatial limitations has remained a challenge (Korhonen et al. 2014). Additionally, both previous studies have defined a parcellation based on a specific connectivity metric (i.e. phase locking) rather than a generalisable metric of spatial resolution that can be used with any connectivity measure. In this study, we addressed these problems systematically by utilising the CTFs as a direct measure of spatial leakage. We used a state-of-the-art measurement configuration containing EEG and MEG sensors, realistic individual boundary element models (Fuchs et al. 2002) and a common source estimation method (L2 minimum norm estimation) that makes minimal assumptions about the source configuration (Hämäläinen & Ilmoniemi 1994; Hauk 2004). Our novel methods are suitable for group as well as single subject analysis.
Overall, the parcellation algorithms implemented here are adaptive and can change depending on the choices of EEG/MEG measurement configuration, head model and source estimation methods. Therefore, since it has been shown previously that combining EEG and MEG provides higher spatial resolution (Fuchs et al. 1998; Molins et al. 2008; Henson et al. 2009) it can be expected that EEG or MEG on their own will result in a smaller number of surviving ROIs than for their combination. Furthermore, different source estimation methods will result in different CTFs. It is important to note that due to equation 3 all CTFs, regardless of the inverse methods used, are linear combinations of the leadfields. Thus, CTFs that are not in the space of the leadfields cannot be achieved by any method. In our study, we used L2 minimum norm estimation because it results from the minimisation of the difference between the resolution matrix and the identity matrix (Dale & Sereno 1993; Hauk 2004). The shapes of CTFs for this method are the same as for noise-normalised minimum norm estimates such as dSPM and sLORETA (Hauk et al. 2011). Therefore, we think that our results reflect the optimum of what can be achieved without more specific modelling constraints. In studies where other constraints are justified, e.g. when other families of spatial filters such as beamformers (Van Veen et al. 1997; Barnes et al. 2006) are used, different parcellations of the cortex may be obtained from the algorithms. Also, we have used a common boundary element model (BEM) in our forward computations (Hämäläinen & Sarvas 1989; Mosher et al. 1999). Using other multi-layer headmodels or Finite Element Models (FEMs) (Buchner et al. 1997) may also change the parcellations.
5.2 Different parcellation approaches: similarities and differences
Our proposed parcellation algorithms addressed the three theoretical issues of using structural ROIs with EEG/MEG that were discussed in the Theory section (Fig. 1). Firstly, we found a limited sensitivity to the signals that are produced in deeper brain areas. All three parcellations (Fig. 4) showed almost no coverage of the medial view of the cortex, indicating the relative insensitivity of our source estimation to these deeper brain areas. Secondly, the specificity of the anatomical parcellations did not match that of the EEG/MEG parcellations: On the one hand, some fine-grained neighbouring areas were not distinguishable. For example, the four areas pars-triangularis, pars-orbitalis, pars-opercularis and lateral orbitofrontal cortex from the Desikan-Killiany atlas (Fig. 3a) were merged into two areas in the anterior and posterior inferior frontal gyrus in the modified version of this atlas (Fig. 4a). On the other hand, large ROIs such as pre- and post-central gyri were split into smaller ROIs (cf. Fig. 3a and Fig. B.2a).
The two SaM and RG approaches showed highly overlapping final ROIs for all three final parcellations, which indicates the robustness of the proposed algorithms with respect to the initial choice of parcellation. This indicates that the final parcellation of the cortex is mostly influenced by the choices of measurement configuration, head model and source estimation method. However, as shown in section 4.2, we observed notable differences as well, in that not all the parcellations provide a similar sparse sampling of all brain areas. For example, as can be seen in Fig. 4, while the final RG parcellation includes several ROIs in the temporal lobe, the modified Destrieux parcellation provides a better coverage of centro-parietal cortices. More generally, the SaM approach is based on anatomically defined regions and thus provides a better solution for optimising the number of a priori selected ROIs or testing specific hypotheses. In contrast, the RG approach is most distinct from anatomical ROIs and limitations that they could impose on detection of functional networks. Therefore, it might be more desirable for data-driven whole brain connectivity analyses, e.g. for resting state networks.
5.3 Non-zero-lag connectivity does not obviate the need for EEG/MEG-adaptive parcellation
Non-zero-lag connectivity measures have been introduced to alleviate the leakage problem (Nolte et al. 2004; Stam et al. 2007). We investigated whether using non-zero-lag connectivity can resolve the need for an adaptive parcellation for whole-brain network analysis. We used magnitude-squared coherence (COH) and imaginary part of coherence (imCOH) as spectral measures of amplitude and phase coupling (Greenblatt et al. 2012; Bastos & Schoffelen 2016). While COH is sensitive to zero- as well as non-zero-lag connections, imCOH is only sensitive to the latter. We argued (section 2.1.4) that even bivariate and multivariate non-zero-lag connectivity measures are affected by leakage. In our simulations, we showed that long-range spurious connections between a seed and a target can occur due to leakage to the target. This can affect the final hub connectivity patterns obtained from imCOH when binarised graphs are used to obtain hubs and hub connections. For example, when the same threshold is applied to binarise COH and imCOH matrices (e.g. top 6% of connection in our simulation example), removing true zero-lag connections in imCOH increases the probability of keeping the false long-range non-zero-lag connections that are induced by the leakage (i.e. leakage-induced connections are large) and thus it is more likely to highlight the ROIs that receive such spurious connections as spurious hubs. The effects of combining different parcellations with different connectivity measures on the outcome of graph-theoretical analyses should be studied in more detail in the future.
5.4 Practical notes
Here we discuss two practical considerations. Firstly, the parcellation introduced in this study is defined in a standard source space where CTFs computed in the single subject space are morphed to a standard space and averaged across a group of subjects for further analysis. These standard ROIs can be morphed to the individual source spaces for the single subject analysis. In a series of trials that are not reported here, we found that ROIs defined in single subject space are highly inconsistent across subjects. This is firstly due to the fact that the sizes of ROIs can vary largely across subjects and secondly, some overlapping vertices might be assigned to different structural labels in different subjects. Therefore, we conclude that in order to obtain a consistent set across subjects and robustness to noise, ROIs can be defined in a standard canonical space and, if single subject connectivity analysis is of interest, ROIs can be morphed to the individual source space.
Secondly, there are several SaM and RG parameters that can be adjusted in order to obtain a parcellation that is most suitable for the questions of a study; here we used generalisable parameters based on the values commonly used for similar purposes in the literature. First, in order to assign a vertex to an ROI we used the half-maximum of CTF values as a measure of sensitivity. Half-maximum is commonly used in signal processing as a measure of sensitivity in order to provide a cut-off to assign a set of values to a given peak. In signal processing, it corresponds to ~3dB attenuation in the power of the signal (below which the signal is considered damped) (Oppenheim & Schafer 2010). Second, we used z-scores above 3 for sensitivity and specificity of a vertex to an ROI. A z-score above 3 for Gaussian distributions corresponds to a p-value<0.005, showing that a vertex is significantly more sensitive to a given ROI as compared to any other ROIs. Third, we required at least one standard deviation between the ROI with the highest specificity and the ROI with the next highest specificity (see Methods section for details). These values can be considered “standard” to provide a reasonable trade-off between the sensitivity, specificity and maximising the number of distinguishable ROIs. However, if there are clear requirements for sensitivity versus specificity, these values can be adjusted to adapt the parcellation accordingly. Another parameter is the minimum number of vertices that are required to form a separate ROI. We heuristically selected a minimum of 10 vertices, in order to exclude very small ROIs that might be significantly affected by slightly changing other parameters of the parcellation.
5.5 Future directions
The final ROI resolution matrices (RRmats) reveal that, even though parcellation performance is notably improved, the RRmats are still significantly different from an ideal identity matrix. Modified parcellations were most successful in increasing the sensitivity of ROIs to themselves. This was reflected in the diagonal elements of the final RRmats as well as in the simulations to detect hubs and hub connectivity patterns. However, the RRmats of the final parcellations still showed several large off-diagonal elements, which affect specificity, particularly if two active sources in the brain are neighbours. This was reflected in our simulations, where we found false alarms for the original as well as the modified parcellations. Therefore, adaptive parcellations could be used with complementary methods that further improve specificity, to yield a more accurate network reconstruction.
One such complementary method might be to combine adaptive parcellations with multivariate connectivity. In the theory section and Appendix A, we discussed how multivariate and non-zero-lag connectivity methods can be affected by leakage. Given the linear nature of CTFs, and taking multivariate covariance as an example, we argued that leakage coefficients could be taken into account in order to quantify the effects of CTFs on multivariate connectivity analysis. These leakage coefficients can be extracted from the RRmats. Therefore, we suggest that modified parcellations and RRmats might be used together with multivariate and time/phase-lagged estimates of connectivity, to get more direct and directed measures of whole-brain graphs. It is worth noting that computing RRmats for an arbitrary parcellation (e.g. anatomical parcellations) to inform the multivariate connectivity analysis might not result in an accurate reconstruction of whole-brain networks. This is because standard anatomical parcellations are likely rank-deficient (section 4.2.1), which indicates that the signals of one or more ROIs are inherently dependent on a linear combination of other ROIs in the brain and cannot be estimated accurately. In contrast, the parcellation algorithms in this study yielded full-rank RRmats within a reasonable tolerance, suggesting that one can derive N independent signals for the N ROIs yielded by the parcellation algorithms. Therefore, obtaining distinguishable CTF-based ROIs is an essential first step, and how to combine these adaptive parcellation methods with different connectivity methods will be an important question for future studies.
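The rank argument above can be illustrated with a minimal sketch that counts the singular values of an ROI resolution matrix exceeding a relative tolerance. The function name, the tolerance of 1e-6 and both toy 3-ROI matrices are assumptions chosen for illustration; they are not the values or data used in this study.

```python
import numpy as np

def rrmat_rank(rrmat, rel_tol=1e-6):
    """Numerical rank: singular values above rel_tol times the largest one."""
    s = np.linalg.svd(np.asarray(rrmat, dtype=float), compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

# Toy matrix with three distinguishable ROIs (strong diagonal, weak leakage)
distinct = np.array([[1.0, 0.2, 0.1],
                     [0.2, 1.0, 0.3],
                     [0.1, 0.3, 1.0]])

# Toy matrix in which the third row is the average of the first two,
# i.e. the third ROI carries no independent information
degenerate = np.array([[1.0, 0.2, 0.6],
                       [0.2, 1.0, 0.6],
                       [0.6, 0.6, 0.6]])

rank_distinct = rrmat_rank(distinct)      # full rank: 3 independent signals
rank_degenerate = rrmat_rank(degenerate)  # rank-deficient: only 2
```

A matrix whose numerical rank is lower than its number of ROIs indicates that at least one ROI signal is a linear combination of the others.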
7 Conflict of interest
The authors declare no conflicts of interest.
6 Acknowledgements
This work was supported by a Cambridge University international scholarship award to S.F. and UK Medical Research Council grants to R.N.H. (MC_A060_5PR10) and O.H. (MC_A060_53144). The authors would like to thank Dr. Darren Price for commenting on an earlier version of this manuscript and Dr. Karalyn Patterson, Dr. Anna Woollams and Dr. Elisa Cooper for contributing to the real datasets utilised in this study.
APPENDIX A: The effect of leakage on multivariate connectivity
We can generalise the bivariate (two-ROI) example discussed in section 2.1.4 to multivariate methods for estimating the unique (partial) covariance between pairs of ROIs in a network of connections between three or more ROIs. In Fig. 1d, consider a seed in the RMF (region Y), a target in the MTG (region Z) and a new region X within the leakage realm of MTG. Let us assume that the true source in Z co-varies with Y, but that the true connectivity between X and Y is zero. Let us further assume, for the sake of simplicity, that the whole network consists only of these three regions and that Y does not receive/send leakage from/to any other ROIs. Therefore, considering the linear and time-invariant effects of leakage, the estimated X and Z signals will be linear combinations of the true signals at these regions (X’ and Z’ respectively), while the estimated Y activity equals the true source activity Y’. This can be written as: X = α1X’ + β1Z’, Z = α2X’ + β2Z’ and Y = Y’, where α1 and β1 are the amount of leakage that X receives from itself and from the true Z’ source respectively, and α2 and β2 are the amount of leakage that Z receives from the true X’ source and from itself respectively. Therefore, in the scenario outlined above, COVX’Y’=0 and, in order for the partialling of covariance to overcome leakage, it should yield COVXY|Z=0.
Therefore, COVXY|Z≠0. The only exceptional case is when the true source X’=μX’ (i.e. inactive), β1=β2=1 (i.e. Z and X are equally influenced by the leakage from Z), Z’ has unit variance and, thus, COVXY = 0. Even though the second condition (β1=β2=1) might be obviated using normalised measures of co-variation, the first and third conditions are unlikely to be true for whole-brain network analysis. As for the bivariate methods, this argument might be generalised to time-lagged connectivity measures (e.g. multivariate autoregressive modelling).
Even though the above examples argue that leakage cannot be resolved using non-zero-lag or multivariate connectivity measures, Equations A2-A4 show that quantifying leakage between ROIs (i.e. the coefficients α1, α2, β1 and β2) and combining them with multivariate connectivity measures might provide a more accurate reconstruction of whole-brain networks using source-reconstructed EEG/MEG data. In this study we concentrated on the former.
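The three-region scenario can also be checked numerically. The sketch below is a minimal illustration that assumes the standard definition of partial covariance, COV(X,Y|Z) = COV(X,Y) - COV(X,Z)·COV(Z,Y)/VAR(Z), which may differ in form from the expressions in Equations A2-A4. The true covariance values and the leakage weights are hypothetical and chosen only for illustration.

```python
import numpy as np

def estimated_cov(true_cov, a1, b1, a2, b2):
    """Covariance of estimated (X, Y, Z) from true (X', Y', Z') under leakage."""
    mix = np.array([[a1, 0.0, b1],    # X = a1*X' + b1*Z'
                    [0.0, 1.0, 0.0],  # Y = Y' (no leakage to or from Y)
                    [a2, 0.0, b2]])   # Z = a2*X' + b2*Z'
    return mix @ true_cov @ mix.T

def partial_cov_xy_given_z(cov):
    """COV(X,Y|Z) for a 3x3 covariance matrix ordered as (X, Y, Z)."""
    return cov[0, 1] - cov[0, 2] * cov[2, 1] / cov[2, 2]

# Hypothetical true covariance: unit variances, Z' co-varies with Y' (0.8),
# and X' is uncorrelated with both Y' and Z'
true_cov = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.8],
                     [0.0, 0.8, 1.0]])

# Without leakage the partial covariance between X and Y is zero
no_leak = partial_cov_xy_given_z(estimated_cov(true_cov, 1.0, 0.0, 0.0, 1.0))

# With mutual leakage between X and Z (hypothetical weights) it is not
leak = partial_cov_xy_given_z(estimated_cov(true_cov, 1.0, 0.5, 0.3, 1.0))
```

With these illustrative weights the partial covariance between the estimated X and Y is about -0.19 rather than zero, even though the true X’ and Y’ are uncorrelated.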
APPENDIX B: Simulated ROIs
The ROIs in Rostral Middle Frontal (RMF), Superior Temporal (STS) and Lateral Occipital (LOC) cortices used for simulations (section 3.4.1) and corresponding CTFs are shown in Fig. B.1.
Initial results of the parcellation algorithms
Fig. B.2 shows the initial split and merged ROIs, which were input to the final parcellation procedure.