ABSTRACT
Epilepsy is a global epidemic and 30% of the 60 million patients do not respond to medication treatment. The only treatment options for patients with medically refractory epilepsy are surgical removal or electrical stimulation of the epileptogenic zone (EZ), i.e. the source of their seizures. Despite extensive evaluations with neuroimaging, visual EEG analysis and clinical testing, surgical success rates vary between 30-70%. Currently, no computational methods have been translated into the clinic to assist in localizing the EZ. Here, we applied a dynamical network model that quantifies the fragility of nodes within a patient’s intracranial EEG (iEEG) brain network. Fragility is quantified as the minimal amount of perturbation that must be applied to a node’s influence on a “balanced” network to cause imbalance. Here, a balanced network is one in which the connectivity between excitatory and inhibitory nodes renders a stable system, and an imbalanced network is unstable and hence can generate seizures. Using iEEG data from 91 patients treated across 5 epilepsy centers (44 successes, 47 failures), we demonstrated that nodal fragility is greater in electrodes within the EZ. In addition, we compared fragility of iEEG nodes to 7 frequency-based and 14 graph theoretic features of the EZ in both seizure (n=91) and non-seizure data (n=54). We calculated a confidence statistic, defined as the ratio of the value of a given feature averaged across electrodes in the clinically annotated seizure onset zone to its average across all other electrodes. Fragility has a significantly greater effect size difference between surgical outcomes when compared to other features. This novel feature outperformed the most popular iEEG features when comparing across surgical outcomes, possibly defining a superior network-based EEG fingerprint for the EZ.
1 Introduction
Over 15 million epilepsy patients worldwide and 1 million in the US suffer from medically refractory epilepsy (MRE) [1, 2, 3, 4, 5]. MRE is defined by the International League Against Epilepsy as continued seizures despite adequate trials of two tolerated, appropriately chosen and administered anti-epileptic drugs. MRE patients have an increased risk of sudden death, are frequently hospitalized and burdened by epilepsy-related disabilities, and are a substantial contributor to the $16 billion spent annually in the US treating epilepsy patients [6, 7]. Approximately 50% of MRE patients have focal MRE, where a specific region or set of regions in the brain, termed the epileptogenic zone (EZ), is the source of the abnormal electrical activity resulting in seizures [8, 9, 10, 11]. Although there exist neurostimulation treatments for focal MRE [12, 13], the mainstay of treatment has been surgical resection of the EZ ever since randomized controlled trials demonstrated superior outcomes compared to prolonged medical therapy [14]. When successful, these treatments stop seizures or allow them to be controlled with medications. Outcomes for both treatments depend critically on the clinician’s ability to accurately identify the EZ. However, no clear biomarkers have been identified for the EZ.
When non-invasive evaluations with electroencephalography and neuroimaging are inconclusive for localizing the EZ, clinicians transition to an invasive monitoring phase, termed a phase 2 evaluation, as shown in Figure 1. During this phase, electrodes are either placed on the brain surface via strips or grids (i.e. ECoG) or implanted into the brain via stereo-EEG (SEEG) to record intracranial EEG (iEEG) data. Clinicians visually inspect iEEG recordings captured over the course of multiple days to weeks, analyzing multiple seizure (i.e. ictal) events and looking for various epileptic signatures, such as spikes and high frequency bursts [15, 16, 17, 10, 11]. They then form a hypothesis from such iEEG features and from non-invasive imaging data that results in annotation of the seizure onset zone (SOZ) electrodes, the region they visually see as participating in early onset of seizures; the SOZ is the best clinical estimate of the true underlying EZ. This SOZ is generally included in the resected zone, except in the case of surgical limitations such as proximity to critical functional regions. Despite the wealth of data collected, localization is performed through visual inspection of data with limited use of computational tools [18, 19, 20, 21, 22]. Furthermore, adequate localization is significantly affected by the pre-implantation hypothesis generated from non-invasive testing (which guides brain areas to be explored invasively) and the results of visual inspection of the iEEG data.
There is a great need for computational approaches to identify robust iEEG fingerprints of the EZ. Since there does not exist ground truth for the EZ, approaches must aim to identify robust iEEG features of the SOZ from large retrospective samples of patients with successful and failed surgeries. Recent studies have focused on the hypothesis that iEEG nodes exhibiting high-frequency oscillations (HFOs) during the segments between seizure periods (i.e. interictal) are correlated to the EZ [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]. However, HFO studies have conflicting results due to varying algorithms, interpretations of HFOs and inter-rater reliability [34, 37]. Other promising approaches include graph-based analysis of iEEG [38, 39, 40, 41, 42, 43, 44, 45, 46], power-spectrum analysis of iEEG [47, 48, 49, 50, 51, 52] and predictions from computational modeling of the iEEG network [53, 54, 55, 56, 57, 58]. Although many features have been proposed in the literature, none have yet translated into the clinical workflow for a number of reasons: 1) lack of sufficient clinical validation on large and diverse patient populations, 2) attempts to predict the EZ with no ground truth labels, and 3) lack of sufficient validation against other proposed methods.
In this study, we explored a novel network-based iEEG feature, fragility, which is based on dynamical systems theory [59, 60], for the purposes of SOZ localization. We systematically evaluated 21 iEEG features that have been introduced in the research community and compared them to fragility on a diverse sample of patients from five centers including both ictal (n=91) and interictal (n=54) data. Identifying strong features is challenging in this setting because we do not have a ground truth. We claim that a robust iEEG feature must satisfy the following criteria: i) it should take on extreme values inside the clinically labeled SOZ nodes for successful outcome cases, and ii) it should be ambiguous or take on extreme values in contacts outside the clinically labeled SOZ for failed outcomes. To quantify the robustness of a candidate iEEG feature, we computed a confidence statistic (CS), defined as the ratio of the feature values averaged across the SOZ electrodes to its average across all electrodes. In this framework, lower confidence scores suggest that patients will have failed outcomes, while higher confidence scores suggest successful outcomes. We found that electrodes with higher fragility values were more likely to be in the SOZ in successful outcomes, but outside the SOZ in failed outcomes.
2 Methods
2.1 Dataset collection
EEG data from 91 epilepsy patients who underwent intracranial EEG monitoring – either electrocorticography (ECoG), or depth electrodes with stereotactic EEG (SEEG) – were selected from University of Maryland Medical Center (UMMC), University of Miami Hospital (UMH), National Institutes of Health (NIH), Johns Hopkins Hospital (JHH), and the Cleveland Clinic (CClinic). Patients meeting any of the following criteria were excluded: patients with no seizures recorded, pregnant patients, patients less than 5 years of age, patients with an EEG sampling rate less than 250 Hz, patients with previous surgeries before the current implantation, and patients in whom no surgery was performed. All 91 remaining patients had a surgical resection or laser ablation performed. Of these patients, 44 experienced successful outcomes and 47 had failed outcomes (average age at surgery = 31.52 ± 12.32 years), with a total of 462 seizures (average ictal length = 97.82 ± 91.32 seconds) and a total of 14703 recording electrodes (average number implanted = 159.82 ± 45.42) [61, 62, 16]. For each patient, we aggregated data from multiple ictal snapshots and interictal portions of data (if available). In the 54 patients with interictal data, we collected 1-2 snapshots per patient, totaling 75 snapshots that were 3-24 hours away from an ictal event (average age at surgery = 31.88 ± 12.24 years; average snapshot length = 270.92 ± 194.19 seconds; 6932 recording electrodes in total). We were unable to collect interictal data for all patients because the data were not stored long-term in many cases.
For each patient, we combined surgical notes and postoperative follow-up information regarding how the resection or ablation affected the patient’s seizures. We categorized patients by surgical outcome (success = seizure free 6-12 months post-op and failure = seizure recurrence), and by Engel score as determined by clinicians [63]. In addition, we categorized patients by their clinical complexity (CC) as follows: (1) lesional, (2) focal temporal, (3) focal non-temporal, and (4) multi-focal (Figure 1) [16, 17]. These categories were ordered based on previous outcome studies that support this increasing level of localization difficulty. Lesional patients experience the highest rate of surgical success because the lesions identified through MRI are likely to be part of the EZ [64, 65, 21, 66, 20, 67]. Localization and surgical success in seizure control are even more challenging in patients with non-lesional MRI. These patients can be further categorized into focal temporal, focal non-temporal, and multi-focal epilepsy, each with its own average surgical success rate [68, 69, 70, 71]. Patients that fit into multiple categories were placed into the more complex category. Next, electrodes that were clinically identified as part of the SOZ were hypothesized to be part of the EZ. In general, this was a subset of the resected region for all patients, unless otherwise noted. The epileptologists define the clinically annotated SOZ as the earliest electrodes that participated in seizures. The corresponding SOZ complement, or SOZC, comprises the electrodes that are not part of the SOZ. Every patient’s clinical SOZ was labeled by 1-3 trained epileptologists. The electrodes within the resected region were also estimated from surgical notes. Obtaining rigorous labels for resected regions would require postoperative T1 MRI and CT scans, which were not readily available for all patients.
After the proposed surgery based on SOZ annotation, patients were categorized into either a successful (i.e. seizure free), or failed (i.e. seizure recurrence) outcome at 6-12 months post-op. For more detailed information regarding the patient population, see Supplemental Figure S1 and Supplemental clinical data summary Excel file.
At all centers, data were recorded using either a Nihon Kohden (Tokyo, Japan) or Natus (Pleasanton, USA) acquisition system with a typical sampling rate of 1000 or 2000 Hz (for details regarding sampling rate per patient, see Supplementary file table). Signals were referenced to a common electrode placed subcutaneously on the scalp, on the mastoid process, or on the subdural grid. At all centers, as part of routine clinical care, up to three board-certified epileptologists marked the EEG onset and the termination of each seizure by consensus. The time of seizure onset was indicated by a variety of stereotypical electrographic features which included, but were not limited to, the onset of fast rhythmic activity, an isolated spike or spike-and-wave complex followed by rhythmic activity, or an electrodecremental response. The clinicians then clipped sets of data and passed them through a secure transfer for analysis as European Data Format (EDF) files [72]. Each ictal snapshot available for a patient was clipped at 30 seconds before and after the ictal event, and each interictal snapshot was a period of 30-90 seconds of data 3-24 hours away from a seizure event. We discarded electrodes from further analysis if they were deemed excessively noisy by clinicians, were located in white matter, or were not EEG related (for example: reference, EKG, or not attached to the brain), which resulted in 97.23 ± 34.87 electrodes used per patient in our analysis. We stored data in the BIDS-iEEG format and performed processing using Python 3.6 and MNE [73, 74, 75, 76].
Decisions regarding the need for invasive monitoring and the placement of electrode arrays were made independently of this work and solely based on clinical necessity. All data were acquired with approval of local Institutional Review Board (IRB) at each clinical institution. The acquisition of data for research purposes was completed with no impact on the clinical objectives of the patient stay. Digitized data were stored in an IRB-approved database compliant with Health Insurance Portability and Accountability Act (HIPAA) regulations.
2.2 Preprocessing of data
In our analysis of the iEEG data, we performed the same preprocessing on all datasets. Each dataset was notch filtered at 60 Hz and bandpass filtered between 0.5 Hz and the Nyquist frequency with a fourth order Butterworth filter. A common average reference was applied to remove any correlated noise [77]. All filtering steps were applied to both ictal and interictal snapshots in the same way. EEG sequences were broken down into sequential windows and the features were computed within each window (see Methods 2.3, 2.4 and 2.5). Each proposed feature produces a value for each electrode for each separate window, and results in a full spatiotemporal feature heatmap when computed over sequential windows. In total, we computed 22 different feature representations from the iEEG data: 6 frequency power in band (PIB) features, 7 eigenvector centralities (one for each frequency band coherence connectivity matrix and one for a correlation connectivity matrix), 7 in-degrees (one for each frequency band coherence connectivity matrix and one for a correlation connectivity matrix), 1 HFO feature and our proposed fragility feature. Values at each window of time were normalized across electrodes to values between 0 and 1, to allow for comparison of relative feature value differences across electrodes over time; the higher a normalized feature, the more we hypothesized that electrode was part of the EZ [60]. For HFOs, the rates were computed first and then normalized, since HFO detection does not produce a spatiotemporal map like the other features.
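The referencing and normalization steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the study's pipeline: the function names are hypothetical, the data are random, filtering is omitted, and min-max scaling is an assumption about how values were mapped to the range 0 to 1.

```python
import numpy as np

def common_average_reference(data):
    """Subtract the across-electrode mean at every time sample.

    data: array of shape (n_electrodes, n_samples).
    """
    return data - data.mean(axis=0, keepdims=True)

def normalize_across_electrodes(heatmap):
    """Min-max scale each time window (column) to [0, 1] across electrodes.

    heatmap: array of shape (n_electrodes, n_windows).
    Min-max scaling is an assumption; the text only states the target range.
    """
    lo = heatmap.min(axis=0, keepdims=True)
    hi = heatmap.max(axis=0, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against flat columns
    return (heatmap - lo) / span

rng = np.random.default_rng(0)
raw = rng.standard_normal((8, 1000)) + 5.0  # hypothetical 8-channel signal with a shared offset
referenced = common_average_reference(raw)

feature_map = rng.random((8, 20)) * 3.0     # hypothetical feature heatmap
normalized = normalize_across_electrodes(feature_map)
```

After referencing, the mean across electrodes is zero at every sample, so any component shared by all channels is removed. After normalization, each window has its smallest electrode value at 0 and its largest at 1, which makes windows comparable over time.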
2.3 Fragility of a network
The notion of fragility is derived from the concept that an epileptic network is inherently imbalanced with respect to connectivity between inhibitory and excitatory populations (nodes). That is, if a specific node or set of nodes is perturbed, overexcitation may occur, manifesting as a seizure. From a dynamical systems point of view, such imbalance arises from a few fragile nodes causing instability in the network.
We introduced the fragility of a node in [59], and defined it as the minimum perturbation applied to the node’s connectivity to its neighbors before rendering the network unstable. In system theory, stable systems return to a baseline condition when a node is perturbed. In contrast, unstable systems can oscillate and grow when a node is perturbed. A fragile node is one that requires a smaller perturbation to lead to ictal activity. We showed how to compute fragility from a stable dynamical network model in [59]. We then described how to estimate such a model from iEEG recordings for two patients in [60].
To demonstrate how fragility is computed from a model, we consider a 2-node network as shown in Fig. 2. In Fig. 2A, a stable network is shown where excitation and inhibition are balanced. The network model is provided in the top row and takes the linear form x(t + 1) = Ax(t). When the inhibitory node is stimulated by an impulse, the EEG responses from each node transiently respond and return to baseline (bottom row). In Fig. 2B, the inhibitory node’s connections are slightly perturbed in a direction that makes the inhibitory node less inhibitory (see red changes to its connectivity to the excitatory node). These changes are reflected in the model and diagram. Now, when the inhibitory node is stimulated by an impulse, the EEG responses from each node have a larger transient response but still return to baseline. Finally, in Fig. 2C, the inhibitory node’s connections are further perturbed in a direction that makes the inhibitory node less inhibitory. Now, when the inhibitory node is stimulated by an impulse, the EEG responses oscillate, demonstrating that the network has gone unstable. The fragility of the inhibitory node is thus quantified as the norm of the perturbation vector applied to the first column in the network model.
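The stability argument can be illustrated numerically. The sketch below uses a hypothetical 2-node matrix (not the values in Fig. 2) in which the first column holds the inhibitory node's influence; a discrete-time linear system x(t + 1) = Ax(t) is stable when every eigenvalue of A has magnitude below 1. The perturbation sizes are chosen by hand and are not the minimum-norm perturbation.

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue magnitude; below 1 means x(t+1) = A x(t) is stable."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def impulse_response(A, node, n_steps=200):
    """Simulate x(t+1) = A x(t) after a unit impulse at one node."""
    x = np.zeros(A.shape[0])
    x[node] = 1.0
    out = np.empty((n_steps, A.shape[0]))
    for t in range(n_steps):
        out[t] = x
        x = A @ x
    return out

# Hypothetical network: node 0 is inhibitory (negative influence on node 1),
# node 1 is excitatory (positive influence on node 0).
A_stable = np.array([[0.5, 0.5],
                     [-0.5, 0.5]])

def perturb_first_column(A, delta):
    """Add a perturbation vector to column 0 (the inhibitory node's influence)."""
    A_new = A.copy()
    A_new[:, 0] += delta
    return A_new

# Make the inhibitory node progressively less inhibitory.
A_small = perturb_first_column(A_stable, np.array([0.0, 0.8]))
A_large = perturb_first_column(A_stable, np.array([0.0, 1.2]))

radii = [spectral_radius(A) for A in (A_stable, A_small, A_large)]
resp_stable = impulse_response(A_stable, node=0)
resp_unstable = impulse_response(A_large, node=0)
```

In this toy case the first two matrices keep all eigenvalues inside the unit circle, so the impulse response decays, while the third pushes an eigenvalue outside and the response grows. Fragility asks for the smallest such perturbation rather than a hand-picked one.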
To compute fragility heatmaps from iEEG recordings, we constructed simple linear models as described above, one for each 500 ms iEEG window. We used sparse least squares with a 1e-5 l2-norm regularization to ensure that the identified model was stable, as in [78, 60]. Then, we slid the window to capture the next 500 ms of iEEG data and repeated the process, generating a sequence of linear network models in time as in Figure 3b. We systematically computed the minimum perturbation required for each electrode’s connections (see Figure 3b) to produce instability for the entire network as described in [60]. The electrodes that were the most fragile were hypothesized to be related to the EZ in these epilepsy networks (seen as the bright yellow color in Figure 4).
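The windowed procedure can be sketched as follows, under several stated assumptions. The model fit is plain ridge-regularized least squares with the 1e-5 penalty; the perturbation is restricted to one column of A (following the 2-node example) and is required to place an eigenvalue at a point on the unit circle; the minimum is taken over a coarse grid of such points. The function names, the grid, and the simulated data are hypothetical, and the exact procedure is the one described in [59, 60]. Here a smaller perturbation norm means a more fragile node.

```python
import numpy as np

def fit_linear_model(window, l2=1e-5):
    """Ridge least-squares fit of x(t+1) = A x(t); window is (n_electrodes, n_samples)."""
    X, Y = window[:, :-1], window[:, 1:]
    gram = X @ X.T + l2 * np.eye(X.shape[0])
    # Solves A = Y X^T (X X^T + l2 I)^{-1} without forming the inverse.
    return np.linalg.solve(gram, X @ Y.T).T

def min_column_perturbation(A, k, lam):
    """Smallest real vector gamma such that adding it to column k of A
    places an eigenvalue of the perturbed matrix at the complex number lam."""
    n = A.shape[0]
    unit = np.zeros(n)
    unit[k] = 1.0
    # b holds row k of (A - lam I)^{-1}; the eigenvalue condition is b^T gamma = -1.
    b = np.linalg.solve((A - lam * np.eye(n)).T, unit)
    B = np.vstack([b.real, b.imag])
    # The pseudoinverse gives the minimum-norm real solution of B gamma = [-1, 0].
    return np.linalg.pinv(B) @ np.array([-1.0, 0.0])

def min_perturbation_norm(A, k, n_angles=64):
    """Minimum column-k perturbation norm over sampled unit-circle targets."""
    angles = np.linspace(0.0, np.pi, n_angles)
    return min(np.linalg.norm(min_column_perturbation(A, k, np.exp(1j * a)))
               for a in angles)

def perturbation_heatmap(data, sfreq, win_sec=0.5):
    """Perturbation norm per electrode for consecutive non-overlapping windows."""
    win = int(round(win_sec * sfreq))
    n_ch, n_samp = data.shape
    starts = range(0, n_samp - win + 1, win)
    heat = np.empty((n_ch, len(starts)))
    for w, s in enumerate(starts):
        A = fit_linear_model(data[:, s:s + win])
        heat[:, w] = [min_perturbation_norm(A, k) for k in range(n_ch)]
    return heat

# Simulated 4-channel data from a hypothetical stable system driven by noise.
rng = np.random.default_rng(0)
A_true = 0.5 * np.eye(4) + 0.1 * rng.standard_normal((4, 4))
x = np.zeros((4, 1000))
for t in range(999):
    x[:, t + 1] = A_true @ x[:, t] + rng.standard_normal(4)

heat = perturbation_heatmap(x, sfreq=1000.0)
```

The column restriction makes the problem linear: by the matrix determinant lemma, adding gamma to column k moves an eigenvalue to lam exactly when row k of (A - lam I)^{-1} times gamma equals -1, so the smallest gamma has a closed form.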
2.4 Baseline features – HFOs and spectral features
In order to compare fragility with HFO analysis, we applied an open-source algorithm for detecting HFOs [30, 79] on our interictal data set. We note that our data sets were not sampled under optimal conditions for HFO detection, which require that the EEG sampling rate be greater than 2000 Hz and that the data be from periods of non-REM sleep [30, 34, 80]. However, some studies also analyzed HFOs in other time periods, such as wakefulness [81, 82]. Since our interictal data are sampled at 1000 Hz, Nyquist’s theorem suggests we should be able to detect frequencies up to 500 Hz. Fast ripples (FR) and ripples (R) were extracted and we considered HFOs to be the union of the FR and R detected by the algorithm. For details on the algorithm, see [30]. In addition to HFOs, we also constructed spectral-based features. We applied a multi-taper Fourier transform over sliding windows of data with a window/step size of 2.5/0.5 seconds [83, 42]. Each EEG time series was first transformed into a 3-dimensional array (electrodes × frequency × time), and then averaged within each frequency band to form six different spectral feature representations of the data. We break down frequency bands as follows:
Delta Frequency Band [0.5 – 4 Hz]
Theta Frequency Band [4 – 8 Hz]
Alpha Frequency Band [8 – 13 Hz]
Beta Frequency Band [13 – 30 Hz]
Gamma Frequency Band [30 – 90 Hz]
High-Gamma Frequency Band [90 – 300 Hz]
HFO = R & FR [80-250 Hz & 250-500 Hz]
This resulted in a spatiotemporal heat map for each frequency band of each electrode’s spectral power over time. For HFOs, the output was a binary raster plot, representing whether HFOs are present over time or not, which was then converted into an HFO rate per electrode.
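A band-power heatmap of this kind can be sketched with NumPy alone. Note the substitution: the sketch uses a single Hann-tapered periodogram per window rather than the multi-taper transform used in the study, so it illustrates the windowing and band averaging only. The function name and the synthetic signal are hypothetical.

```python
import numpy as np

# Band edges in Hz, taken from the list above (HFOs are handled separately).
BANDS = {
    "delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
    "beta": (13, 30), "gamma": (30, 90), "high_gamma": (90, 300),
}

def band_power_heatmaps(data, sfreq, win_sec=2.5, step_sec=0.5):
    """Return {band: array (n_electrodes, n_windows)} of mean spectral power.

    data: array of shape (n_electrodes, n_samples).
    """
    win = int(round(win_sec * sfreq))
    step = int(round(step_sec * sfreq))
    starts = np.arange(0, data.shape[1] - win + 1, step)
    freqs = np.fft.rfftfreq(win, d=1.0 / sfreq)
    taper = np.hanning(win)  # single taper; the study used multiple tapers
    out = {name: np.empty((data.shape[0], len(starts))) for name in BANDS}
    for w, s in enumerate(starts):
        segment = data[:, s:s + win] * taper
        power = np.abs(np.fft.rfft(segment, axis=1)) ** 2
        for name, (lo, hi) in BANDS.items():
            mask = (freqs >= lo) & (freqs < hi)
            out[name][:, w] = power[:, mask].mean(axis=1)
    return out

# Synthetic 3-channel, 5-second recording; channel 0 carries a 10 Hz rhythm.
sfreq = 1000.0
t = np.arange(0, 5.0, 1.0 / sfreq)
rng = np.random.default_rng(0)
signal = rng.standard_normal((3, t.size))
signal[0] += 5.0 * np.sin(2 * np.pi * 10.0 * t)

maps = band_power_heatmaps(signal, sfreq)
```

With a 2.5 second window and 0.5 second step, a 5 second recording yields six windows per band. The channel carrying the 10 Hz rhythm shows elevated alpha-band power in every window relative to the noise-only channels.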
2.5 Baseline features – graph analysis of networks
There are many ways to measure connectivity in iEEG data through the use of graph analysis. Specifically, we computed a time domain model using Pearson correlation (equation 1) and a frequency domain model using coherence (equation 2). In the equations, (i, j) are the electrode indices, Cov is the covariance, σ is the standard deviation, f is the frequency band, and G is the cross-spectral density. Note that these A’s are not the same as one would compute in a dynamical system model. For each network-based feature, a sliding window/step size of 2.5/0.5 seconds was used, producing a sequence of network matrices over time in the form of 3-dimensional arrays (electrodes × electrodes × time).
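For reference, the standard textbook forms of the two connectivity measures, written with the symbols defined in this section, are given below. This is an assumed reconstruction: in particular, coherence is shown in its magnitude-squared form, and the study may have used a different normalization or the unsquared magnitude.

```latex
% (1) Pearson correlation between electrodes i and j (time domain)
A_{ij} = \frac{\mathrm{Cov}(x_i, x_j)}{\sigma_{x_i}\,\sigma_{x_j}}

% (2) Magnitude-squared coherence between electrodes i and j in band f
A_{ij}(f) = \frac{\lvert G_{ij}(f) \rvert^{2}}{G_{ii}(f)\, G_{jj}(f)}
```

Both quantities are symmetric in (i, j), and the coherence in this form lies between 0 and 1.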
From each network matrix, we computed the eigenvector centrality [42, 41], and the in-degree [40] features of the network for each electrode across time. Centrality describes how influential a node is within a graph network. In-degree is the weighted sum of the connections that connect to a specific node. Both features are potential measures that attempt to capture the importance of a specific electrode within an iEEG network. We produced a spatiotemporal heat map of electrodes over time of the eigenvector centrality and the in-degree for all datasets.
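The two graph features can be sketched for a single correlation network as follows. This is an illustrative sketch with hypothetical function names and synthetic data; using absolute correlation values as edge weights and zeroing the diagonal are assumptions, not details given in the text.

```python
import numpy as np

def correlation_network(window):
    """Pearson correlation matrix for one window of shape (n_electrodes, n_samples)."""
    return np.corrcoef(window)

def edge_weights(adj):
    """Non-negative weights with self-connections removed (an assumption)."""
    w = np.abs(adj)
    np.fill_diagonal(w, 0.0)
    return w

def eigenvector_centrality(adj):
    """Leading eigenvector of the symmetric weight matrix, scaled to sum to 1."""
    w = edge_weights(adj)
    _, vecs = np.linalg.eigh(w)  # eigenvalues ascending, so the last column is leading
    v = np.abs(vecs[:, -1])      # eigenvectors are defined up to sign
    return v / v.sum()

def in_degree(adj):
    """Weighted sum of the connections that point into each node."""
    return edge_weights(adj).sum(axis=0)

# Synthetic window: channels 0-2 share a common source, channels 3-4 are independent.
rng = np.random.default_rng(0)
common = rng.standard_normal(2500)
window = rng.standard_normal((5, 2500)) * 0.5
window[:3] += common
window[3:] *= 2.0

net = correlation_network(window)
centrality = eigenvector_centrality(net)
degree = in_degree(net)
```

Channels that share the common source are strongly correlated with each other, so they receive higher centrality and in-degree than the independent channels. Repeating this over sliding windows gives one heatmap per feature.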
2.6 Experimental design
Specifically, we tested whether fragility localized the clinically annotated SOZ better in successful surgeries, and worse in failed surgeries, compared to other proposed features. For fragility and all baseline features, electrodes with extreme activity deviating from the average were hypothesized to be part of the EZ. After looking at the spatiotemporal fragility heatmaps of many patients, we determined whether fragility could be quantified in a way that could highlight the differences between clinical outcomes, such as surgery success, CC, and Engel score. For every patient, the feature signals of the SOZ and SOZC (SOZ complement) were computed for every iEEG snapshot, averaged over electrodes. To compare spatiotemporal heatmaps across features, we computed a confidence statistic (CS) that, for “good” features, should be high for successful outcomes and low for failures. We computed a CS for each patient, and then stratified and compared population CS distributions across surgical outcomes (S vs F), CC (1-4), Engel score (1-4), gender (M vs F), handedness (R vs L), epilepsy onset age (years), and age during surgery (years). We expected that fragility could capture a trend of decreasing confidence as CC and Engel score go from 1 to 4. For each clinical covariate group, we measured the effect size difference via bootstrapped sampling, and the statistical p-value between the CS distributions. We hypothesized that: i) fragility would have an effect size difference significantly different from zero when comparing successful vs failed outcomes, ii) this effect size would correlate with meaningful clinical covariates, such as CC and Engel score, and iii) both the effect size and p-value would be better than those of the proposed baseline features.
2.7 Feature evaluation using a confidence statistic
Fragility and all proposed baseline features generated a spatiotemporal heatmap from EEG snapshots of either ictal or interictal data (outlined in Figure 3). In order to evaluate each feature, each spatiotemporal feature heatmap was converted into a CS, which is a metric determining the degree of confidence in the proposed surgical plan. The higher the CS (closer to 1), the more likely the feature indicated a successful surgery, and the lower it was (closer to 0), the more likely the feature indicated a failed surgery. We tested the hypotheses stated above by computing a CS from each feature heatmap for each patient, and estimated the distribution differences of the CS between various clinical covariates. The CS of each feature heatmap was computed by first normalizing the ictal period to 100 samples for all ictal snapshots of data, because the length of the ictal period can vary across different events. Non-ictal data were kept the same length across patients. If there were multiple ictal or interictal snapshots, they were combined by taking the median of the normalized heatmaps. Next, we partitioned the heatmap into a SOZ and SOZC as seen in Figure 3. This formed the two sets of signals that represent the spatiotemporal feature values of the SOZ set vs the SOZC set of electrodes. Finally, the ratio of the average feature value in the SOZ to the sum of the average feature values in the SOZ and SOZC was computed to form the CS (as shown in Figure 3). Because HFOs are viewed as rates, we first determined the overall rate of HFOs in each electrode over the interictal data. We then normalized the rate across all electrodes and computed the average rate in the SOZ and divided by the average rate in the SOZ plus SOZC. Because not all electrodes would have HFOs detected, it was possible to get an undefined CS. In this case, we could either assign those patients a confidence of 0 or remove them from analysis. We used the analysis that gave HFOs the better result in terms of CS effect size difference between success and failure in our framework. As a result, we assigned those undefined patients a confidence of 0.
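The CS computation can be sketched as follows. The function names and example heatmaps are hypothetical, linear interpolation is an assumed way of resampling each snapshot to 100 samples, and the ratio is implemented as mean(SOZ) / (mean(SOZ) + mean(SOZC)), with an undefined ratio mapped to 0 as described for HFOs.

```python
import numpy as np

def resample_time(heatmap, n_out=100):
    """Linearly interpolate each electrode's signal to n_out time samples."""
    n_in = heatmap.shape[1]
    old = np.linspace(0.0, 1.0, n_in)
    new = np.linspace(0.0, 1.0, n_out)
    return np.vstack([np.interp(new, old, row) for row in heatmap])

def confidence_statistic(heatmaps, soz_idx):
    """CS for one patient.

    heatmaps: list of arrays (n_electrodes, n_windows), one per snapshot,
              with values already normalized to [0, 1].
    soz_idx:  indices of the clinically annotated SOZ electrodes.
    """
    stacked = np.stack([resample_time(h) for h in heatmaps])
    combined = np.median(stacked, axis=0)  # combine snapshots by the median
    mask = np.zeros(combined.shape[0], dtype=bool)
    mask[soz_idx] = True
    soz_mean = combined[mask].mean()
    sozc_mean = combined[~mask].mean()
    denom = soz_mean + sozc_mean
    return soz_mean / denom if denom > 0 else 0.0  # undefined ratio mapped to 0

# Hypothetical patient: 6 electrodes, SOZ = electrodes 0 and 1, two seizures
# of different lengths in which the SOZ electrodes carry higher values.
rng = np.random.default_rng(0)
snap_a = rng.random((6, 140)) * 0.4
snap_b = rng.random((6, 230)) * 0.4
snap_a[:2] += 0.5
snap_b[:2] += 0.5

cs = confidence_statistic([snap_a, snap_b], soz_idx=[0, 1])
cs_uniform = confidence_statistic([np.ones((6, 50))], soz_idx=[0, 1])
cs_empty = confidence_statistic([np.zeros((6, 50))], soz_idx=[0, 1])
```

With this definition, a feature that is equally large inside and outside the SOZ gives a CS of 0.5, values above 0.5 indicate that the feature is concentrated in the annotated SOZ, and values below 0.5 indicate that it is concentrated elsewhere.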
2.8 Statistical analysis
A single CS point was computed for each patient and proposed feature. We compared this CS stratified by different clinical factors: surgical outcome, Engel score, CC, handedness, gender, onset age, and surgery age. Rather than using box plots, which can look identical for different underlying distributions, we showed swarm plots with standard deviation bars to visualize distributions [84]. We then estimated the effect size differences between distributions in the form of Cohen’s d statistic [85]. Cohen’s d was estimated using a non-parametric permutation test on the observed data with 5000 resamples used to construct a 95% confidence interval (as seen in Figure 5) [84]. The null hypothesis of our experimental setup was that the CS values derived from the spatiotemporal feature heatmaps came from the same population. The alternative hypothesis was that the populations were different (i.e. a feature could distinguish successful from failed outcomes). Mann-Whitney U tests were used to determine the p-values of the effect size differences [86]. All p-values and effect sizes were compared across candidate features.
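The effect size estimate can be sketched as below. The sketch assumes a pooled-standard-deviation Cohen's d and a percentile bootstrap with 5000 resamples (following the bootstrapped sampling mentioned in Section 2.6); the exact resampling scheme of [84] may differ. The CS values are synthetic placeholders drawn at random, not results from the study, and the Mann-Whitney U test is not shown.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def bootstrap_ci(a, b, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for Cohen's d."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    ds = np.empty(n_resamples)
    for i in range(n_resamples):
        # Resample each group with replacement, keeping group sizes fixed.
        ra = rng.choice(a, size=len(a), replace=True)
        rb = rng.choice(b, size=len(b), replace=True)
        ds[i] = cohens_d(ra, rb)
    lo, hi = np.percentile(ds, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Synthetic placeholder CS values (NOT study data): 44 successes, 47 failures.
rng = np.random.default_rng(1)
cs_success = rng.normal(0.58, 0.05, size=44)
cs_failure = rng.normal(0.50, 0.05, size=47)

d = cohens_d(cs_success, cs_failure)
ci_low, ci_high = bootstrap_ci(cs_success, cs_failure)
```

A confidence interval that excludes zero indicates that the feature's CS separates the two outcome groups; comparing d and its interval across features then ranks the features by how well they distinguish successful from failed surgeries.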
2.9 Code and data availability
All code used to generate the figures is at https://github.com/adam2392/fragility_in_ieeg (will be made public once published). We included Jupyter notebooks written in Python. We also released the raw iEEG data and the computed features for each patient, which are open-sourced and available at the INDI Retrospective Data Sharing repository in the form of BIDS-iEEG.
3 Results
We analyzed every patient’s EEG using fragility and the other baseline features, resulting in spatiotemporal heatmaps per patient for every feature. The baseline features attempted to capture activity from specific frequency bands (e.g. delta band and HFOs) and specific graph measures (e.g. eigenvector centrality and degree), which have been previously reported in the literature to correlate to the EZ [41, 40, 42, 43, 44, 39]. We considered all of these as potential EEG features of the underlying EZ in comparison to fragility.
3.1 Fragility heatmap highlights the SOZ in successful patients
Fragility attempted to capture the susceptibility for specific electrodes (i.e. nodes within the EEG network) to cause network instability. To qualitatively assess the usefulness of fragility in localizing electrodes of interest, we first present specific examples of patients analyzed with fragility, and demonstrate how it may provide additional information in EZ localization. In Figure 4, we show three different patients with differing surgical treatments, outcomes, Engel score and CC along with their fragility heatmaps and corresponding raw iEEG data (for their full clinical description; see supplementary Excel table):
Patient_01 from NIH (primarily ECoG, successful resection, Engel score 1, clinical complexity 1),
Patient_26 from JHH (primarily ECoG, failed resection, Engel score 4, clinical complexity 3),
Patient_40 from CClinic (SEEG, failed laser ablation, Engel score 3, clinical complexity 4).
In Figure 4a, the red electrode labels on the y-axis correspond to the clinically hypothesized SOZ electrodes; note that the red regions were part of the resected set unless otherwise noted. This figure shows the period 10 seconds before and after seizure onset. We visualized an entire ictal event for each of these patients and show their corresponding fragility heatmaps in Supplementary Figure S8. In Patient_01, the red electrodes (i.e. SOZ) showed a high degree of fragility, even before seizure onset, which is not visibly clear in the raw EEG. This patient had a successful surgery and was seizure free, and so we assume the epileptogenic tissue lay within the resected region, and that the clinicians likely localized the EZ correctly. When looking at the raw EEG data, Patient_01 has visual features that are readily visible around seizure EEG onset (10 seconds, halfway through the snapshot). We see onset activity that occurred in electrodes that clinicians annotated as SOZ, which corresponded to the most fragile electrodes at ictal onset. In addition, the fragility heatmap captured the onset in the ATT and AD electrodes and early spread of the seizure into the PD electrodes. ATT1 (anterior temporal lobe area) showed high fragility in the entire period before seizure onset (see Figure 4), and even in interictal periods many hours away from seizure events (see Supplementary Figure S9). This area was not identified with scalp EEG or non-invasive neuroimaging. In this patient, both visual EEG analysis and the fragility heatmap identified a sufficient SOZ, which was included in the surgery and led to the patient becoming seizure free.
3.2 Fragility heatmap highlights the SOZC in failed patients
Patient_26 and Patient_40 both showed distinct regions with high fragility that were not in the clinically annotated SOZ (or the resected region), and were both failed surgeries. Specifically in Patient_26, the ABT (anterior basal temporal lobe), PBT (posterior basal temporal lobe) and RTG29-39 (mesial temporal lobe) electrodes were highly fragile, but not annotated as SOZ.
Patient_40 had laser ablation performed on the electrode region associated with Q2, which fragility analysis showed to be relatively non-fragile. From seizure onset, many electrodes exhibit the EEG signatures that are clinically relevant (e.g. spiking, fast-wave activity, etc.) [16]. In this patient, the X’ (posterior-cingulate), U’ (posterior-insula) and N’/M’/F’ (superior frontal gyrus) electrodes were all fragile compared to the Q2 (lesion in the right periventricular nodule) electrode. Patient_26 had a resection performed in the right anterior temporal lobe region. Clinicians identified the RAD, RHD and RTG40/48 electrodes as the SOZ. In the raw EEG data, one can see synchronized spikes and spike-waves in these electrodes, but the patient’s seizures continued despite resection. In the corresponding fragility heatmap, ABT and the RTG29-32 electrodes were highly fragile compared to the clinically annotated SOZ region. In the raw EEG data it is not visibly clear that these electrodes would be part of the SOZ.
Although visual analysis of the EEG was able to identify the SOZ in Patient_01 and Patient_34, it was insufficient for Patient_26 and Patient_40, which led to failed surgical outcomes. In the context of the fragility theory of a network, seizure recurrence could be due to the presence of unstable (i.e. high fragility) regions across the epileptic network. Fragility of an electrode within a certain window does not correlate directly with gamma or high-gamma power, which are traditional frequency bands of interest in localizing the SOZ [42, 21, 51, 87, 88, 89, 31, 41] (see Supplementary Figure S10). Based on the fragility heatmaps, these fragile regions would be hypothesized to be part of the SOZ, and possibly candidates for resection. The fragility maps of the interictal periods in Figure S9 also show different electrodes being fragile compared to the SOZ.
3.3 Fragility derived confidence statistic separates surgical outcomes
From the spatiotemporal heat maps for each feature, we averaged the values over electrodes in the SOZ and over electrodes in the SOZC at each time point. These signals represented the average nodal fragility value over time of the SOZ electrodes and the SOZC electrodes. A good representation of the underlying SOZ would be one that separated well and consistently over time in successful outcomes, but inconsistently in failed outcomes. The corresponding SOZ vs SOZC signals for patients 01, 26 and 40 are shown in Supplementary Figure S3. The SOZ vs SOZC signals for each patient were different depending on the surgical outcome of the patient. Patient_01 had a higher CS (0.572) than Patient_26 and Patient_40 (0.492 and 0.467 respectively). Visually, the red SOZ signal is higher than the black SOZC signal in the period right after seizure onset for the successful patient. For the failed patients, on the other hand, the SOZC signal is either higher than or mixed with the SOZ signal, indicating that there are highly fragile electrodes not included in the clinically annotated SOZ.
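The averaging step described above can be sketched as follows. This is a minimal illustration rather than the study's pipeline: the function name `soz_vs_sozc_signals`, the dictionary layout of the heatmap, and the electrode names and values are all hypothetical.

```python
from statistics import mean

def soz_vs_sozc_signals(heatmap, soz_electrodes):
    """Average a feature heatmap over SOZ and non-SOZ (SOZC) electrodes.

    heatmap: dict mapping electrode name -> list of feature values, one per time window
    soz_electrodes: set of electrode names annotated as SOZ
    Returns (soz_signal, sozc_signal), each a list with one value per time window.
    """
    soz_rows = [v for name, v in heatmap.items() if name in soz_electrodes]
    sozc_rows = [v for name, v in heatmap.items() if name not in soz_electrodes]
    if not soz_rows or not sozc_rows:
        raise ValueError("need at least one SOZ and one SOZC electrode")
    # zip(*rows) walks the time windows; mean() averages across electrodes.
    soz_signal = [mean(col) for col in zip(*soz_rows)]
    sozc_signal = [mean(col) for col in zip(*sozc_rows)]
    return soz_signal, sozc_signal

# Toy values, made up for illustration (not patient data).
toy_heatmap = {
    "E1": [0.2, 0.8, 0.9],
    "E2": [0.4, 0.6, 0.7],
    "E3": [0.1, 0.2, 0.3],
    "E4": [0.3, 0.2, 0.1],
}
soz_signal, sozc_signal = soz_vs_sozc_signals(toy_heatmap, {"E1", "E2"})
```

In this toy case the SOZ signal rises above the SOZC signal after the first window, which is the pattern described for the successful patient.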
We also computed the same SOZ vs SOZC fragility values across patients within a clinical center (i.e. UMH, UMMC, NIH, JHH, and CClinic) split by successful and failed outcomes. The average and standard error were computed over all the patients within each outcome group. In Figure S4, we found that on average, patients with successful outcomes had higher fragility values from 30 seconds before ictal onset through the early ictal period. In contrast, failed outcomes on average had either lower overall fragility values compared to their successful counterparts, or very high variability, suggesting the SOZ captured both stable and fragile regions. In all centers, we saw qualitatively that in successful outcomes the SOZ electrodes’ fragility was higher than that of their respective SOZC electrodes over a window surrounding seizures. In the failed outcomes, there was not as distinct a separation between the fragility of the SOZ and the SOZC. Since these are pooled patients for each center, it is simply a visualization of the average spatiotemporal fragility signals separated by clinically-hypothesized SOZ over an entire center. The more separated the SOZ and the SOZC feature signals, the more confident fragility was in the annotated SOZ. This motivated us to define a simple and interpretable confidence statistic that would determine how well fragility agrees with clinicians’ SOZ in different patient situations.
In order to evaluate each feature representation, we next computed a CS, which was the ratio of the SOZ and SOZC feature signals. After each patient’s CS was computed, we visualized the entire distribution in Figure 5 stratified by surgical outcome. We computed effect sizes using a non-parametric permutation test, and p-values under the statistical paradigm described in Methods 2.8. In Figure 5a, we visualized the Cohen’s d effect size difference in the CS distributions between successful and failed surgical outcome patients. A high CS would mean a high degree of confidence in the clinician’s proposed localization, while a lower CS would suggest the opposite. There was a statistically significant difference (p-value=0.02) between the successful and failed CS distributions, and an average effect size difference between the two groups of 0.627. On average, fragility had a 0.627 higher standardized confidence in the clinically annotated SOZ in successful outcomes than in failed outcomes.
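To make the quantities in this paragraph concrete, the sketch below computes a ratio-style CS, a Cohen’s d between two groups of CS values, and a resampled interval for d. Several pieces are assumptions: the CS is taken as the plain ratio of mean SOZ to mean SOZC values (the exact normalization is defined in the Methods and is not reproduced here), the resampling is an ordinary bootstrap swapped in for illustration rather than the procedure of Methods 2.8, and all function names and CS values are hypothetical.

```python
import random
from statistics import mean, variance

def confidence_statistic(soz_signal, sozc_signal):
    """Assumed CS: mean SOZ feature value divided by mean SOZC feature value."""
    return mean(soz_signal) / mean(sozc_signal)

def cohens_d(a, b):
    """Cohen's d with a pooled variance; returns None if the pooled variance is zero."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled_var == 0:
        return None
    return (mean(a) - mean(b)) / pooled_var ** 0.5

def bootstrap_d(a, b, n_boot=1000, seed=0):
    """Mean and 95% percentile interval of Cohen's d over bootstrap resamples."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        d = cohens_d(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        if d is not None:  # skip degenerate resamples with zero spread
            draws.append(d)
    draws.sort()
    lo = draws[int(0.025 * len(draws))]
    hi = draws[int(0.975 * len(draws)) - 1]
    return mean(draws), lo, hi

# Toy CS values, made up for illustration (not the study's data).
success_cs = [0.60, 0.55, 0.65, 0.58, 0.62, 0.57]
failure_cs = [0.48, 0.52, 0.45, 0.50, 0.55, 0.47]
d_point = cohens_d(success_cs, failure_cs)
d_mean, d_lo, d_hi = bootstrap_d(success_cs, failure_cs)
```

A positive `d_point` corresponds to the situation reported here, where the CS is higher on average in the successful group than in the failed group.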
3.4 Fragility confidence statistic correlates with clinical complexity and Engel score
We next analyzed CS with respect to the CC of the patient, which is a more objective measure of the patient’s seizure origin. CC is determined by what type of seizures the patient exhibits rather than the severity of the seizures, which can be subjective [90, 91]. CC was a factor determined outside of surgical outcome and one we expected would correlate with failure rate; higher CC means more difficult localizations and hence more seizure recurrences after surgery. In Figure 5b, we show the effect size differences between the fragility CS distributions of different CC groups, with CC1 as the reference distribution. CC 1 and 2, corresponding to lesional and focal temporal lobe patients respectively, were not significantly different in terms of p-value (0.798), nor did they show an effect size difference (average effect size of 0.042). This similarity between CC1 and CC2 has been seen in other studies, where lesional and temporal lobe epilepsy patients experience the greatest success rates from surgery [64, 92, 17]. However, as the CC increased, there was a greater chance that the surgery would result in failure in general, with existing studies demonstrating 30-70% ictal freedom rates after surgery depending on the presence of a lesion (i.e. CC 1) [10, 21]. We saw this trend in the fragility CS distribution differences with an average effect size of 0.434 for CC3 and 0.454 for CC4. The respective p-values for CC3 and CC4 compared to CC1 were 0.239 and 0.367. We also stratified within each CC group and conducted a similar analysis. In supplemental Figure S6, we compared surgical outcomes within each CC group. Although sample sizes were considerably lower due to stratification, we still saw the effect size difference between successful and failed outcomes for 3 out of 4 of the CC groups. The effect size differences between success and failure within CC1, CC2, CC3 and CC4 were: 1.07, 0.487, −0.270, and 0.676 respectively. The corresponding p-values comparing outcomes within each group were: 0.0488, 0.487, 0.707, and 0.106.
Next, we compared the CS with respect to Engel score, which acted as a further stratification of the surgical outcomes. In Figure 5, we compared CS distributions across Engel scores, and found that on average as Engel score increases, fragility confidence decreases. The effect size differences when comparing against Engel score 1 were 0.439 for Engel score 2, 0.088 for Engel score 3, and 1.189 for Engel score 4. Their corresponding p-values were 0.195, 0.849 and 8e-4. In the supplemental Figure S5, we also examined the CS across gender, handedness, onset age and surgery age to show that there were no relevant differences, as determined by effect size and statistical analysis.
3.5 Fragile electrodes during ictal onset are seen in interictal data
Next, we repeated all of our analyses for patients with interictal data (n=54). We show the corresponding interictal period fragility heatmaps in Figure S9 (the same patients are shown in Figure 4 for ictal data). We not only saw similar electrodes show up as fragile in both time periods of data, but also that the fragility spatiotemporal heatmap could highlight the most likely SOZ electrodes from interictal data that is 3-24 hours away from an ictal event. Specifically electrodes ATT1 of Patient_01 were most fragile compared to other electrodes. If we thresholded the map to 0.8, then they would be the ones with highest fragility as well. Clinicians agreed that out of the SOZ electrodes, this was the one that was the earliest onset relative to a seizure event, and most likely epileptogenic in their analysis. In addition, these electrodes were recording from a lesional region, which has a high likelihood of being epileptogenic [10, 21, 93, 17].
We also computed the CS for the interictal data. In Figure 5, we saw that the CS distribution effect size is still maintained with a value of 0.644, and a p-value of 0.03. In Figure 5b, we also saw the same correlated increase in effect size differences as CC increases. We further analyzed the outcome differences stratified by CC in supplementary Figure S7. Effect size differences between outcomes were 1.14 and 1.21 for the CC1 and CC4 groups respectively. The differences between surgical outcomes were smaller in CC2 and CC3, with 0.47 and 0.38 respectively. Although the sample sizes were significantly smaller as a result of stratification, lesional and multi-focal patient outcomes had the largest differences with respect to the CS. When comparing across Engel scores, Engel score 1 had an effect size difference compared to Engel scores 2, 3 and 4 of 0.84, 1.15 and 0.66 respectively.
3.6 Fragility is the best candidate feature in terms of effect size and statistical difference
Next, we summarized the results across all 21 baseline features proposed in the Methods section. In interictal data, we included an additional analysis using HFOs, which have been considered a potential biomarker for the EZ. We compared features based on their effect size differences between surgical outcomes, and the corresponding p-value statistic, determined by non-parametric permutation and Mann-Whitney U testing respectively. In Figure 6 we showed a) the effect size differences and b) the p-values for each feature using either ictal or interictal data. A negative effect size difference meant that a feature value was lower in the SOZ compared to the SOZC. An effect size of 0 implies that the value of the feature was uniformly distributed across both the SOZ and SOZC electrodes, and that there were no differences between the two.
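For readers unfamiliar with the Mann-Whitney U test named above, the following is a minimal standard-library sketch. It uses the large-sample normal approximation without tie or continuity corrections, so it is an approximation and not necessarily the exact variant used in this study; the function name `mann_whitney_u` and the sample values are hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    No tie or continuity correction is applied, so the p-value is only
    approximate for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    # U counts the pairs in which x exceeds y; tied pairs count as one half.
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p

# Toy CS values, made up for illustration (not the study's data).
u_stat, p_value = mann_whitney_u([0.60, 0.55, 0.65, 0.58], [0.48, 0.52, 0.45, 0.50])
```

Because the test only uses the ordering of the pooled values, it does not assume that the CS values are normally distributed.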
A p-value tells us how likely data at least as extreme as ours would be if the null hypothesis were true; here, the null hypothesis is that the CS values of both groups came from the same population (i.e. that the feature CS cannot distinguish between clinical covariates). In Figure 6, we saw that fragility had the most statistically significant results based on p-value analysis. In terms of statistical hypothesis testing, only fragility’s CS had a p-value (0.01) below our α level of 0.05. This told us that if fragility maps were computed from a uniform clinical population, we would have observed CS distributions this different only 1% of the time. With an α level of 0.05, we rejected the null hypothesis and determined that the fragility CS of surgical outcomes came from different populations. This implied that the fragility of the iEEG electrodes agreed with clinicians’ SOZ annotations when surgery was a success, but highlighted different electrodes when surgery was a failure. However, it was not sufficient to only look at p-values as they could be misinterpreted to imply causal effects [84, 94]. So our next step compared effect size differences between features. Figure 6 showed that the effect size difference between the successful and failed CS distributions from fragility had a relatively large value of 0.649, compared to the rest of the baseline features, which ranged from 0.007 to 0.502. Note that HFOs were not computed on ictal period data. For ictal data, the gamma and high-gamma frequency band power had the second most discriminating CS in terms of absolute effect size difference, albeit with a p-value of 0.90. This effect size observation coincided with other studies that have observed increased gamma power during seizures [29, 49, 88, 87].
We then analyzed differences across features on only the interictal data. When we included a comparison of HFOs, fragility still had the largest effect size and was the only one with a p-value below an α level of 0.05. When using HFOs, 39 of the 54 patients had an undefined CS because no HFOs were detected. To proceed, we could have either assigned all 39 patients a CS of 0 (i.e. 0 confidence in the clinical localizations), or removed them from further analysis. When we included all patients with 0 CS, HFOs actually performed better with an even larger effect size. We therefore proceeded by assigning a CS of 0 to patients without a defined CS. This implies that for those 39 patients, HFO’s CS would recommend no surgery because it had zero confidence in the clinician’s SOZ localization. In addition to having a lower effect size compared to fragility, HFOs had a larger standard error of the mean in Figure 6a. This was a result of having combined multiple snapshots of interictal data from the same patient, and the variability of the CS across different patients. HFO variability across different snapshots of data has been seen in other analyses of HFOs [34]. In supplementary Figure 5, we also saw that HFOs not only had a lower effect size compared to fragility, but also did not detect any occurrences within many of the clinical complexity 1 patients (i.e. CC1). These are lesional epilepsy patients, where many have successful outcomes, so we would expect a robust feature of the EZ to align highly with what clinicians annotated. When we analyzed the p-values across all the features proposed on interictal data, we found that fragility had a p-value of 0.03. The effect size of fragility for interictal data was 0.644, compared to the other candidate features, which ranged from 0.0029 to 0.526.
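The two ways of handling an undefined CS discussed above (assigning 0 versus dropping the patient) can be sketched as follows. The helper `resolve_undefined_cs` and the values are hypothetical, with `None` standing in for a patient in whom no HFOs were detected.

```python
from statistics import mean

def resolve_undefined_cs(cs_values, strategy="zero"):
    """Handle undefined CS values (None) by zero-filling or dropping them."""
    if strategy == "zero":
        return [0.0 if v is None else v for v in cs_values]
    if strategy == "drop":
        return [v for v in cs_values if v is not None]
    raise ValueError("strategy must be 'zero' or 'drop'")

# Toy values, made up for illustration (not the study's data).
toy_cs = [0.6, None, 0.4, None]
zero_filled = resolve_undefined_cs(toy_cs, "zero")  # keeps all four patients
dropped = resolve_undefined_cs(toy_cs, "drop")      # keeps only two patients
group_mean_zero = mean(zero_filled)
group_mean_drop = mean(dropped)
```

Zero-filling keeps the sample size intact but pulls the group average toward 0, which is one reason the choice between the two strategies can change the resulting effect size.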
When we analyzed the difference in results between ictal and interictal data in Figure 6, we saw that only fragility had a consistent effect size and p-value. In many baseline features (e.g. EVC-COH-beta, gamma, high-gamma, beta, alpha, etc), their effect size directions switched between interictal and ictal data. This implies that the spatiotemporal value of the feature does not directly correlate with a SOZ; in seizures the value may go up, but in interictal data the value may go down. In others (e.g. ID-CORR, theta, etc.), the effect sizes of interictal were significantly smaller when compared to ictal data. Although HFOs have the second largest effect size difference using interictal data, they were not able to be compared to fragility in ictal data.
4 Discussion
In summary, we presented a networked-dynamical systems based model to compute the nodal fragility of each iEEG electrode for both ictal and interictal data to attempt to localize the SOZ. We performed analyses on ictal (n=91) and interictal (n=54) data for patients gathered from five centers and concluded that fragility is the best candidate feature of the SOZ.
Visualization of fragility heatmaps demonstrated how they could be used qualitatively to assess which electrodes are most fragile within an iEEG network. A notion of network fragility is commonly seen in analysis of structural [95], economic [96] and even social networks [97]. Although we were not directly analyzing the structural nature of neuronal network, there are studies that have characterized epilepsy in terms of structural fragility. Specifically, in cellular studies [98, 99], epilepsy was caused by changes, or “perturbations” in the structural network (i.e. chandelier cell loss, or abnormal axonal sprouting from layer V pyramidal cells), which caused loss of inhibition or excessive excitation respectively; these biological changes caused downstream aberrant electrical firing. These changes in the structure can be modeled using networked-dynamical systems models [59]. Instead of analyzing a structural network’s fragility, here we analyzed a functional network, characterized by a dynamical system. Each electrode’s effect on the rest of the network was captured by a time-varying linear model that we proposed in [78]. Each node is an electrode, which is recording aggregate neuronal activity within a region surrounding the recording electrode. By quantifying the fragility of each node, we then determined how much of a change in that region’s functional connections was necessary to cause ictal-like phenomena (e.g. instability). As a result, high fragility should coincide with a region that is sensitive to minute perturbations, causing unstable phenomena in the entire network (i.e. a seizure).
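One way to write down the notion of nodal fragility described above is sketched below. The discrete-time form, the restriction of the perturbation to a single column of the connectivity matrix, and the use of the 2-norm are assumptions made for illustration; the exact formulation is the one given in [78]. Here $A$ is the $N \times N$ linear model estimated for one time window, $\rho(\cdot)$ is the spectral radius, $e_k$ is the $k$-th standard basis vector, and $\Gamma_k$ is a hypothetical symbol for the change in node $k$'s outgoing connections.

```latex
\begin{aligned}
x(t+1) &= A\,x(t), \qquad \rho(A) < 1
  \quad \text{(stable, ``balanced'' network)} \\
\Delta_k &= \Gamma_k e_k^{\top}, \qquad \Gamma_k \in \mathbb{R}^{N}
  \quad \text{(perturbation confined to column } k\text{)} \\
\Gamma_k^{\star} &= \arg\min_{\Gamma_k} \ \lVert \Gamma_k \rVert_2
  \quad \text{subject to} \quad \rho\!\left(A + \Gamma_k e_k^{\top}\right) \ge 1
\end{aligned}
```

Under this sketch, a node that needs only a small $\lVert \Gamma_k^{\star} \rVert_2$ to push the network to instability is a highly fragile node. How that norm is mapped onto the heatmap scale is not reproduced here.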
The EZ in general cannot be explicitly labeled, but is presumed to lie within the resected region if the patient becomes seizure free [10, 11, 9, 8]. The current method for identifying candidate EZ regions requires capturing multiple brain images (MRI, SPECT, PET scan) and recording EEG signals from the scalp during ictal events. In some cases, the presence of a lesion in a patient’s MRI suggests a focal EZ that can be corroborated with scalp EEG and iEEG recordings and then surgically removed. Such lesional patients experience approximately 70% ictal freedom rates [17]. Even in these cases with a high likelihood of localizing the EZ, localization is not perfect. Even if localization is perfect, chronic effects of epilepsy such as kindling can cause neighboring tissue to become abnormal and epileptogenic. Current limitations for evaluating computational approaches can be largely attributed to the lack of ground-truth labels for the EZ. Different studies have attempted to predict the SOZ, or the resected zone (both approximations of the true EZ) directly [41, 40, 42, 35, 36]. However, due to the sparse recordings obtained from iEEG and the complicated nature of epilepsy, electrodes within the clinically labeled SOZ may not be part of the true EZ, especially in failed surgeries. In addition, electrodes within the resected zone may not be a part of the true EZ. At best, we can assume that in successful surgical outcomes, the EZ is an unknown subset of the SOZ and resected zone. In our study, we had no notion of ground truth for the EZ. Despite this lack of a rigorously defined EZ, we knew that successful EZ localization relied on a good representation of the iEEG data that separated EZ regions from non-EZ regions. 
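If an ideal spatiotemporal biomarker was known for the EZ, then clinicians would have 100% success rates with surgery, and be able to resect the minimal amount of tissue necessary to cure the patient, assuming the region was surgically resectable without causing significant deficit. We would be able to compute a CS that can completely separate successful and failed outcomes. In order to compute an explicit CS, we substituted the clinically annotated SOZ for the EZ. We chose the SOZ over the resected zone as our label of interest because i) the resected zone is difficult to precisely label without postoperative T1 MRI, and ii) the resected zone generally includes many other non-SOZ electrodes and even regions without recordings due to the sparsity of electrodes on strips, grids and depths (e.g. there can be as much as 5-10mm between electrodes). Developing algorithms to predict the resected zone or SOZ labels will at best replicate what clinicians currently do, and achieve a surgical success rate of 30-70%. Because of this, we opted to compare features that can be computed from the raw EEG data itself, and then use our confidence statistic framework to compare features in terms of effect sizes and statistical p-values.
In order to analyze fragility in a retrospective study that accounted for all these factors, we set up a simple statistical framework to test its utility as a feature of the SOZ. In addition, we provided an analysis for both ictal and interictal data to determine if a feature might only be a good candidate for certain data. We hypothesized that good feature representations of the epileptic network should be able to separate surgical outcomes (i.e. have a high CS in the clinically annotated SOZ for successful outcomes and a low CS for failed outcomes). In successful patients, we expected that the EZ is a subset of the SOZ electrodes (and hence the resected zone) because surgery was performed and the patient was seizure free. In failure patients, the EZ was either incompletely removed, or it was altogether not removed due to a variety of difficult scenarios (e.g. see Figure 1). We therefore expected that the EZ would at best be partially present within the SOZ electrodes, with the worst case being that none of the EZ is within the SOZ set. A candidate feature that modulates with respect to the true SOZ should perform well in separating surgical outcome in this framework. The ability of the CS to separate surgical outcomes depends on two factors: i) the value of the feature in localizing the true EZ and ii) the accuracy of the clinically annotated SOZ. The accuracy of the clinical labeling was determined by the surgical outcomes of patients. Using these criteria, the value of the tested features was compared to the only known ground-truth label: patient ictal freedom outcomes. We found that fragility had relatively high CS in successful patients, and low CS in failed patients. This coincided with our hypothesis. The CS was then stratified by a variety of clinical factors, such as success/failure, clinical complexity, Engel score, gender, handedness, onset age, and age during surgery.
As expected, we did not see any variability in the fragility CS due to the handedness, gender, onset age, or age at surgery of the patients. However, we saw decreasing CS as CC increased. This would be expected, since the accuracy of the clinically annotated SOZ is expected to decrease as the CC of the patient increases. CC1 and CC2 were comparable, which agrees with current data suggesting that lesional and temporal lobe epilepsy have the highest rates of surgical success [68, 69, 70, 71]. CC3 and CC4, though, were increasingly different compared to CC1, which is also expected because non-temporal and multi-focal epilepsy are traditionally harder cases to treat. Furthermore, in CC4, there was only one successful outcome because these candidates were generally very difficult to localize; the labeled SOZ electrodes could have been either mislocalized, or insufficient.
We also saw a decreasing CS as Engel score increased. This also aligned with what we would expect of a robust feature of the SOZ; the severity of the patient’s epilepsy recurrence after surgery should correlate with the feature’s confidence in the clinician’s annotations. Engel scores 2 and 3 were not as different from each other as Engel score 4 was. The Engel score is known to have issues because the rating scale can be subjective and differently interpreted from clinician to clinician [100, 101, 102]. Scores 2 and 3 are the most subjectively mixed, while 1 and 4 are clearly successful or failed surgical outcomes. The ILAE score is another option that has been shown to be slightly less subjective, but not all centers have adopted this scoring method. The fragility CS distributions agree with these existing problems of the Engel score. Besides CC and Engel score, we verified that the distributions did not vary with respect to unrelated patient covariates, such as gender, handedness and age. By showing that fragility varies as expected with respect to CC and Engel score, we demonstrated that fragility is a potential feature that can capture the underlying pathology that varies with epilepsy severity.
When we looked at the fragility heatmaps of the three example patients presented in Figure 4, the maps qualitatively agreed with our quantitative results. In the successful outcome, Patient_01 (CC1 and Engel score 1), the SOZ electrodes were the most fragile compared to the rest of the network. The electrodes identified as highly fragile, specifically ATT1, were also seen in the interictal fragility heatmap, suggesting that fragility captured the most likely SOZ region regardless of whether interictal or ictal data was used. However, in Patient_26 (CC3 and Engel score 4) and Patient_40 (CC4 and Engel score 3), there were electrodes that were highly fragile relative to the rest of the network that were not included in the SOZ set (and hence not resected). The more fragile electrodes that were left out of this set, the more likely seizures potentially are to recur. When looking at the fragility heatmaps computed from interictal data, we also saw similar electrodes that were not in the SOZ appear fragile (see supplementary Figure S9), suggesting that fragility is an invariant metric across interictal and ictal data.
Lastly, we compared 21 different features that have been investigated in the context of SOZ localization to fragility in the same framework. In the analysis of neural data ranging from decision making [103], to motor control [104] to epilepsy [41, 42, 29, 87, 16, 105], spectral decomposition has been used to represent the neural data in terms of its frequencies. Graph metrics like eigenvector centrality and in-degree computed from Pearson Correlation and Coherence connectivity models have been proposed before in the context of fMRI and EEG analysis of epilepsy [41, 40, 42, 43, 44, 39, 106, 107, 108]. We note that some studies suggest only computing HFOs on datasets from 1AM-3AM (for non-REM sleep) and with a sampling rate greater than 2000 Hz [109, 34, 80]. However, we wanted to process our data anyway because there is a large clinical trial underway to test the utility of HFOs in localizing the EZ [110]. In addition, we used an open-sourced algorithm to promote reproducibility of our results [30]. We knew that the community would want to see a comparison, and so we proceeded in a manner that would be as fair as possible, outlining the distinct differences in processing steps we took for spatiotemporal feature heatmaps versus HFO rates (see Figure 3 and S2). When compared to these other proposed baseline candidate features, fragility CS had the largest effect size difference between surgical outcomes as well as a p-value less than our set α level for both ictal and interictal data. As a result, we found that nodal fragility outperformed all 21 other candidate EZ features.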
In order to analyze fragility in a retrospective study that accounted for all these factors, we setup a simple statistical framework to test its utility as a feature of the SOZ. In addition, we provided an analysis for both ictal and interictal data to determine if a feature might only be a good candidate for certain data. We hypothesized that good feature representations of the epileptic network should be able to separate surgical outcomes (i.e. have a high CS in clinically annotated SOZ for success outcomes and low CS for failed outcomes). In successful patients, we expected that the EZ is a subset of the SOZ electrodes (and hence the resected zone) because surgery was performed and the patient was seizure free. In failure patients, the EZ was either incompletely removed, or it was altogether not removed due to a variety of difficult scenarios (e.g. see Figure 1). We therefore expected that the EZ would at best be partially present within the SOZ electrodes, with the worst case being that none of the EZ is within the SOZ set. A candidate feature that modulates with respect to the true SOZ should perform well in separating surgical outcome in this framework. The ability of the CS to be able to separate surgical outcome depends on two factors: i) the value of the feature in localizing the true EZ and ii) the accuracy of the clinically annotated SOZ. The accuracy of the clinical labeling were determined by the surgical outcomes of patients. Using these criteria, the value of the tested features was compared to the only known ground-truth label: patient ictal freedom outcomes. We found that fragility had relatively high CS in success patients, and low CS in failed patients. This coincided with our hypothesis. The CS was then stratified by a variety of clinical factors, such as success/failure, clinical complexity, Engel score, gender, handedness, onset age, and age during surgery. 
As expected, we did not see any variability in the fragility CS due to the handedness, gender, onset age, or age at surgery of the patients. However, we saw decreasing CS as CC increased. This would be expected, since the accuracy of the clinically annotated SOZ is expected to decrease as CC of the patient increased. CC1 and CC2 were comparable, which agrees with current data suggesting that lesional and temporal lobe epilepsy having the highest rates of surgical success [68, 69, 70, 71]. CC3 and CC4 though were increasingly more different compared to CC1, which is also expected because non-temporal and multi-focal epilepsy are traditionally harder cases to treat. Furthermore, in CC4, there was only one successful outcome because these candidates were generally very difficult to localize; SOZ electrodes labeled could have been either mislocalized, or insufficient.
We also saw a decreasing CS as Engel score increased. This also aligned with what we would expect of a robust feature of the SOZ; the severity of the patient’s epilepsy recurrence after surgery should correlate with the feature’s confidence in the clinician’s annotations. Engel score 2 and 3 were not as different from each other as Engel score 4 was. The Engel score is known to have issues because the rating scale can be subjective and differently interpreted from clinician to clinician [100, 101, 102]. Scores 2 and 3 are the most subjectively mixed, while 1 and 4 are clearly successful, or failed surgical outcomes. The ILAE score is another option that has been shown to be slightly less subjective, but not all centers have adopted this scoring method. The fragility CS distributions agree with these existing problems of the Engel score. Besides CS and Engel score, we verified that the distributions did not vary with respect to unrelated patient covariates, such as gender, handedness and age. By showing that fragility varies expectantly with respect to CC and Engel score, we demonstrated that fragility is a potential feature that can capture the underlying pathology that vary with epilepsy severity.
When we looked at the fragility heatmaps of the three example patients presented in Figure 4, the maps qualitatively agree with our quantitative results. In the successful outcome, Patient_01 (CC1 and Engel score 1), the SOZ electrodes were most fragile, compared to the rest of the network. The electrodes identified as highly fragile, specifically ATT1, was also seen in the interictal fragility heatmap, suggesting that fragility captured the most likely SOZ region regardless of whether interictal, or ictal data was used. However, in Patient_26 (CC3 and Engel score 4) and Patient_40 (CC4 and Engel score 3), there were electrodes that were highly fragile relative to the rest of the network that were not included in the SOZ set (and hence not resected). The more electrodes that were not included in this set, then potentially the more likely seizures are to reoccur. When looking at the fragility heatmaps computed from interictal data, we also see similar electrodes that were not in the SOZ appear fragile ((see supplementary Figure S9), suggesting that fragility is an invariant metric interictal and ictal data.
Lastly, we compared 21 different features that have been investigated in the context of SOZ localization to fragility in the same framework. In the analysis of neural data ranging from decision making [103], to motor control [104] to epilepsy [41, 42, 29, 87, 16, 105], spectral decomposition has been used to represent the neural data in terms of its frequencies. Graph metrics like eigenvector centrality and in-degree computed from Pearson Correlation and Coherence connectivity models have been proposed before in the context of fMRI and EEG analysis of epilepsy [41, 40, 42, 43, 44, 39, 106, 107, 39, 108]. We note that some studies suggest only computing HFOs on datasets from 1AM-3AM (for non-REM sleep) and greater than 2000 Hz sampling rate [109, 34, 80]. However, we wanted to process our data anyways because of the fact that there is a large clinical trial underway to test the utility of HFOs in localizing the EZ [110]. In addition, we use an open-sourced algorithm to promote reproducibility of our results [30]. We knew that the community would want to see a comparison, and so we proceeded in a manner that would be as fair as possible, outlining the distinct differences in processing steps we took for spatiotemporal feature heatmaps versus HFO rates (see Figure 3 and S2). When compared to these other proposed baseline candidate features, fragility CS had the largest effect size difference between surgical outcomes as well as a p-value less then our set α level for both ictal and interictal data. As a result, we found that nodal fragility outperformed all 21 other candidate EZ features.
When we analyzed the spatiotemporal variability of each candidate feature over different snapshots of the same patient, we saw that nodal fragility was also relatively invariant even when comparing ictal and interictal data (i.e. the electrodes that are fragile during ictal, are also fragile during interictal). Fragility attempts to compute the susceptibility of the underlying dynamical system to become unstable; this notion does not require an ictal period to compute. This is an advantage over HFOs, since HFOs are generally restricted to only interictal data. Ictal windows are currently the gold standard in clinical practice for localizing the EZ by observing ictal onsets and semiology [17, 16]. However, having patients with electrodes implanted for long periods of time, and requiring the monitoring of multiple seizure events over many weeks carries the risk of infection, sudden death, trauma and cognitive deficits from having repeated seizures. This contributes to the large cost of epilepsy monitoring [1, 2, 3, 4, 5, 14]. It is a necessary first step to ensure that fragility has utility when analyzing ictal data, but it is also promising that its utility does not change significantly when only analyzing interictal data. If a candidate biomarker could be found that is able to provide strong localizing evidence using only interictal data, then it would significantly lower risk in the invasive monitoring procedure for epilepsy patients. In many studies, HFOs have been explored as the potential biomarker for localizing the EZ, with a prospective clinical trial underway [110]. However, results are still not yet published. In our dataset, even in interictal data, HFOs did not perform significantly better than fragility. In addition, fragility does not require the data to be collected with a high sampling rate, from specific periods (e.g. non-REM sleep), or require hyperparameter tuning. 
Although HFOs are a promising candidate for SOZ localization, it is not clear how specific algorithms affect the detection of these high-frequency phenomena [34]. In addition, it is not clear what is a good definition of physiological HFOs versus pathological HFOs [27, 81, 79, 111, 23], which can confound algorithms that do not make a distinction. Sampling rate, time-period of interictal data, and algorithm choice are all variable factors in performing HFO analysis, and therefore the exact algorithm and approach used herein is specified. Due to these confounding factors, more research will be needed that compares HFOs with a range of other candidate features for the purpose of EZ localization. Currently, clinical trials to examine the efficacy of HFOs in localizing the EZ are underway [112]. In future research it will be necessary to i) collect ictal and interictal data on a large population of epilepsy patients, ii) include explicit choices of hyperparameters within feature computation and iii) analyze features in a cross-validation framework. Lastly, candidate features will need to be tested in a prospective fashion that demonstrate correlation with improvement in surgical outcomes.
4.1 Limitations and Future Work
In this section, we address a few limitations of our study to motivate future experimental design. Collecting detailed epilepsy patient data across multiple clinical centers was a challenging task. Although we presented a multi-center dataset with a relatively large number of patients, we had a different number of patients from each center. In future retrospective studies, it will be necessary to obtain large datasets (n ideally > 50) that span different surgical outcomes, Engel scores, and CC from each center to capture variability within sub-populations of epilepsy. We were unable to retrospectively share the raw data for some centers because our initial study design did not ask for patient consent. In future work, it is necessary for studies to proactively de-identify and share their data, so that the community can build up increasingly large and heterogeneous datasets for evaluation of computational EZ localization approaches. To be as transparent as possible, we presented summary information regarding statistics of the clinical population included in this study (shown in Figure S1). When performing analysis across clinical covariates, sample sizes were much smaller as a result of stratification. We tried to alleviate this issue as much as possible by using non-parametric bootstrapping and hypothesis testing to estimate effect sizes and p-values of the CS distribution differences [84].
In this study, we performed analysis retrospectively in a single-blind fashion. Double-blind, randomized, prospective clinical studies are important to pursue as more and more features are validated as potential avenues for assisting in EZ localization. Another option would be using propensity score matching techniques on large databases of epilepsy data [113]. This would be attractive due to the large cost of engaging in prospective randomized clinical trials. However, these retrospective propensity matching efforts would require the community to actively report their data to a centralized database, and rigorously include as much clinical metadata as possible, including but not limited to: clinically annotated SOZ region, estimated resected region, handedness, gender, onset age, age at surgery, Engel score, ILAE score (if available), and clinical complexity category. In our study, we released as much data as possible given HIPAA and IRB constraints. Our current study also did not track the ethnicity of patients, nor the medication history. These are all potentially important clinical covariates that can bias analyses [114, 115]. Dataset shift is an important phenomenon that could occur because clinical centers have inherently different practices [116]. Further clinical studies on the variation of epilepsy across patients and clinical centers are necessary to arrive at rigorous definitions of the EZ.
In our evaluation framework, we did not consider feature activity over time. Future work could propose time-dependent features. We include in the supplementary a “lessons-learned” summary of how to proceed with preprocessing iEEG data and reasons why. Although we computed a CS for easy interpretation and visualization purposes, we would not suggest using this metric as a measure of localization in a real clinical setting. Rather, we provided the spatiotemporal heatmap that clinicians could interpret for themselves. High fragility would be hypothesized as likely epileptic regions, while low fragility would correspond to stable and normal regions. Compared to HFOs, clinicians would be able to use fragility to not only glean information from interictal periods, but also from ictal events; they could determine the time-scale and regions of ictal propagation, both of which are important factors in clinical localization.
4.2 Summary
In summary, we proposed a novel approach for defining the EZ based on network fragility theory. We quantified the usefulness of this feature in a simple framework that computed a confidence statistic and tested how well it separated surgical outcomes. Using a five-center population of 91 epilepsy patients, we compared fragility against 21 baseline candidate features. We also included an analysis of 54 patients with matching interictal data and showed that fragility outperforms the other features (including HFOs) using only interictal data. It performed better in both statistical significance and overall effect size, in the ictal and the interictal data alike. This brings us one step closer to a rigorous network fingerprint of the EZ in epilepsy patients.
6 Supplementary Material
The supplementary material includes a “lessons-learned” summary of how to preprocess iEEG data and the reasons for each step. See the supplemental document.
5 Acknowledgements
AL is supported by NIH T32 EB003383, the NSF GRFP, Whitaker Fellowship and the Chateaubriand Fellowship. SVS is supported by NIH R21 NS103113, the Coulter Foundation, Maryland Innovation Initiative, US NSF Career Award 1055560 and the Burroughs Wellcome Fund CASI Award 1007274. Computational resources were also provided by the Maryland Advanced Research Computing Center (MARCC). The authors would like to thank Carey Priebe for useful discussions on statistical analyses, and Sarah Kim and Rachel June Smith for helpful reviews of the manuscript. In addition, the authors thank all the anonymous reviewers for their helpful feedback.
Footnotes
* Co-senior authors
† Also reachable at adam2392 at gmail dot com. For more information, visit the NCSL website: https://sarmalab.icm.jhu.edu/. Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218