Abstract
Alzheimer’s Disease (AD) is a progressive neurological disorder in which the death of brain cells causes memory loss and cognitive decline. The identification of at-risk subjects yet showing no dementia symptoms but who will later convert to AD can be crucial for the effective treatment of AD. For this, Magnetic Resonance Imaging (MRI) is expected to play a crucial role. During recent years, several Machine Learning (ML) approaches to AD-conversion prediction have been proposed using different types of MRI features. However, few studies comparing these different feature representations exist, and the existing ones do not allow to make definite conclusions. We evaluated the performance of various types of MRI features for the conversion prediction: voxel-based features extracted based on voxel-based morphometry, hippocampus volumes, volumes of the entorhinal cortex, and a set of regional volumetric, surface area, and cortical thickness measures across the brain. Regional features consistently yielded the best performance over two classifiers (Support Vector Machines and Regularized Logistic Regression), and two datasets studied. However, the performance difference to other features was not statistically significant. There was a consistent trend of age correction improving the classification performance, but the improvement reached statistical significance only rarely.
1. Introduction
Alzheimer’s Disease (AD) is a progressive neurological disorder in which the death of brain cells causes memory loss and cognitive decline. The progression of the neuropathology in AD starts long before clinical symptoms of the disease become apparent [1, 2, 3, 4, 5]. Also, the symptoms become progressively worse, and much effort has been placed on the early diagnosis of the AD. Related to this, Mild Cognitive Impairment (MCI), defined as a transitional phase from cognitive changes of normal aging to those typically found in dementia, is an important construct [6]. Subjects with MCI present a high risk of developing AD, but still, most people with MCI will not progress to dementia (or AD) even after 10 years of follow-up [7, 8]. Thus, identifying MCI subjects who convert to AD can be crucial for the effective treatment of AD.
Neuroimaging techniques have shown promise as tools for presymptomatic AD detection [9, 10]. Much research has been focused on T1-weighted Magnetic Resonance Imaging (MRI). It is one of the most widely studied imaging techniques [11] because it is completely non-invasive, highly available, inexpensive compared to positron emission tomography and has an excellent contrast between different soft tissues. Over the past few years, many potential MRI markers, such as the whole brain, hippocampal, and entorhinal cortex atrophy, have been shown to have diagnostic value [12]. Also, these markers have been used as the features for Machine Learning (ML) algorithms trying to predict MCI-to-AD conversion.
Indeed, there has been a surge of proposed ML algorithms for automatically predicting the future conversion from MCI to AD based on MRI (e.g., [13, 14, 15, 16]). This is partly driven by the free availability of large, high-quality datasets such as ADNI 1. However, the principal focus has been in the development of new ML techniques, and their comparative evaluation has received much less attention. In particular, ML algorithms have used different types of feature sets extracted from MRI, including hippocampal volumes, volumes of the entorhinal cortex, cortical thickness measures, as well as voxel-based morphometry (VBM) features (e.g., [17, 18, 19, 20, 21] and [22] for a recent review). Despite that, systematic studies of advantages/disadvantages of various feature sets have been limited so far, and existing studies do not allow to make definite conclusions. To add to the confusion, high dimensional feature sets, such as cortical thickness or voxel-based morphometry, must be coupled with dimensionality reduction technique such as averaging the values within a brain region, Principal Component Analysis (PCA) or feature selection (see [23] for a review).
Existing comparisons between different feature representations do not provide a clear answer to the question we are interested in: “Is there a preferred representation of MRI for AD-conversion prediction?”. There are multiple reasons for this. The comparisons have been geared to the AD vs. control classification problem ([24, 25, 26]), they have not included voxel-based representations [24, 27], they have utilized very short follow-up (18-months [27, 28]), they have been based on a single learning algorithm [27, 29] and/or have had highly unbalanced pMCI and sMCI classes (in [29] 149 of 165 MCI subjects converted during the 4-year follow-up that is in stark contrast to conversion rates reported in other analyses [8]). An early and important study [28], which we want to highlight, compared various feature representations including hippocampal volumes, cortical thickness, and VBM with and without regional averaging. No feature representation in this study managed to perform significantly better than chance. This somewhat disappointing result could be because 1) the methods were early ones, mostly geared to the much easier normal control vs. AD sub-classification problem, 2) the dataset was smaller than the one currently available, and 3) the MCI non-converter was somewhat arbitrarily defined as a subject who did not convert in 18 months period. Moradi et al. [30] evaluated their method over the same dataset as [28] managing to obtain significantly better performance than the chance level, pointing to the reason 1) as the most significant cause of the improvement.
Since [28], we can a find few studies of different feature representations presenting partially conflicting results. As an example, [21] found that the prognostic efficacy of hippocampus volumetry was better than combined regional volumetrics in two commercially available brain volumetric software packages for MCI conversion prediction. On the other hand, Gaser et al. [17] have demonstrated the superiority of their voxel-based brainAGE approach over the hippocampus volume biomarker and Westman et al. have emphasized the importance of having a complete set of regional features [27, 24]. Some researchers have opted to study feature selection, either supporting [20, 31] or opposing [32] data-driven feature selection. The comparisons of different automatic algorithms for hippocampal [33] and entorhinal cortex volumetry [34] have indicated that the algorithm-choice did not affect the classification accuracy. Intracranial volume adjustment to regional volumetry appears to only have subtle effects to the conversion prediction accuracy [35, 27]. Finally, it has been demonstrated that the neurophychological test scores are the best predictors of conversion, but combining them with MRI information leads to improved prediction accuracy [30, 36].
To close this information gap, we asked what type of feature representations are the best for the MRI-based AD-conversion prediction. We used follow-up period of 3 years to define the AD-conversion, twice longer than in [27, 28]. We evaluated the performance of various MRI features, including VBM-style, voxel-based features [17], coupled with feature preselection [30] or PCA-based dimensionality reduction, hippocampus volumes, volumes of the entorhinal cortex, and a complete set of regional volumetry, surface area, and cortical thickness measures extracted by FreeSurfer. This complements earlier studies which did not include voxels-based representations [27, 21]. We additionally evaluated age removal [30, 37], which have been found to improve the prognostic efficacy of ML-based MRI biomarkers. Moreover, we used two different classifiers (Support Vector Machines, SVM, and Regularized Logistic Regression, RLR) for reducing the classifier specificity of the conclusions and trained them applying a repeated 10-fold cross-Validation (CV) with a sound statistical inference to compare the methods, which can be seen as an improvement of separate training and test sets in [28].
2. Material and methods
ADNI data Data is collected from the the Alzheimers Disease Neuroimaging Initiative (ADNI) public database, available at adni.loni.usc.edu. The ADNI initiative was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimers disease (AD). For up-to-date information, see www.adni-info.org.
ADNI material considered in this work included all subjects from ADNI1 for whom baseline MRI data (T1-weighted MP-RAGE sequence at 1.5 Tesla, typically 256 x 256 x 170 voxels with the voxel size of approximately 1 mm x 1 mm x 1.2 mm) and sufficient follow-up information were available. We focused on the classification of MCI individuals based on their future diagnosis (AD or not AD) and, therefore, MRI scans were all obtained at the baseline visit.
Two flavors of this dataset were evaluated. The first dataset, Quality Control (QC) dataset, included 183 MCI subjects whose FreeSurfer 4.3 MRI segmentations had passed the complete quality control. The second one, Non QC dataset, included the complete dataset of 264 MCI subjects without any quality control.
The reason for evaluating the two different sets was to study if the quality control yielded an improvement in the data analysis. Note that the QC dataset was a subset of the non-QC dataset.
Following [30], a subject was considered as a progressive MCI (pMCI) if diagnosed as MCI at baseline and the diagnosis changed to AD during the 3-year follow-up period. The subject was considered as a stable MCI (sMCI) if diagnosed as MCI at baseline and the diagnosis remained as MCI during the follow-up. The minimum length of follow-up was 3 years and the subject was excluded from the study if she converted after the 3 year follow-up, the diagnosis fluctuated after the 3-year follow-up period, or less than 3-years of follow-up information was available. Table 1 lists the main characteristics of the subjects of each dataset and the list of Roster IDs of the included subjects and their diagnostic categories are available in the supplement.
2.1. Image preprocessing
Table 2 details the feature representations we investigated and their respective number of features. Hippocampus volumes consisted of left and right hip-pocampal volumes. Hippocampus + Entorhinal volumes consisted of left and right volumes of hippocampus and left and right volumes of entorhinal cortex. We considered both raw volumes as well as volumes normalized by the intra-cranial volume (ICV) as it is still unclear if the normalization by ICV is beneficial for the prediction task [35, 27]. Region based features included a complete set of 257 regional cortical thickness, surface area and volume measures provided by FreeSurfer 2, 3. We note that this set of features included also ICV.
Freesurfer 4.3 software was employed for the extraction of hippocampus and entorhinal cortex volumes as well as region features. Particularly, the FreeSurfer 4.3 processing results available at the ADNI website were used (UCSF Cross-sectional Freesurfer version 4.3), and the description of the pipeline and the QC procedure can be found at 4. The rationales for using the processing results provided by ADNI was to ensure that the processing pipeline was a standard one, the processing results are readily available to other researchers, and the quality control, independent from the authors of this study, has been performed. We note that albeit different versions of FreeSurfer can result in different segmentations, the classification results based on different software versions have been found to be the same [34].
Voxel-Based Morphometry (VBM) based features consisted of 29852 gray matter density values from the VBM style preprocessing by the VBM8 software. In brief, the MRIs were preprocessed into gray matter tissue images in the stereotactic space as described in [17, 30], smoothed with the 8-mm FWHM Gaussian kernel, resampled to 4 mm spatial resolution, and masked into 29852 voxels. In the Moradi set of features, VBM features were further processed through the feature selection method of [30]. This method applies MRIs of AD and NC subjects to select features for MCI classification through a repeated application of the elastic net penalized linear regression. We applied the ADNI data from 231 (182) normal controls and 200 (126) AD subjects for this feature selection with the non-QC (QC dataset). We reduced the number of VBM features also using principal component analysis (PCA). For this, we retained the PCA components that explained 99 % of the variance.
We further evaluated the representations with and without the age correction. The age correction may be important as the effects of normal aging on the brain structure partially overlap with the effects of AD [38, 37]. We applied the age correction procedure of [30]. This method estimates the age effect by a linear regression for each feature separately based on the MRIs of normal controls (231 normal controls with the age range from 55 to 90 years of ADNI) and then adjusts the features of the MCI subjects based on the estimated model.
2.2. Validation and test procedure
For the implementation and evaluation of the classification methods, we performed a repeated and nested 10-fold Cross Validation (CV). In the outer CV loop, data was split in 10 different folds from which one fold at time was designated as the test fold (for performance evaluation) and the nine remaining folds were used for classifier training. The train/test cycle was repeated with each fold once as the test fold. In the inner CV loop, each train fold was, itself, split into 10 validation folds from which one part was used to select the classifier hyperparameters. The optimal hyperparameters were selected evaluating either the classification accuracy (ACC, number of correctly classified samples over the total number of samples) or the Area Under the receiving operating Curve (AUC) [39]. The nested CV was repeated 10 times, each with a different randomly selected folding scheme, to minimize the effect of a particular folding scheme to the results. Also, the hypothesis test we used to compare different representations requires the repeated use of CV.
To study the classifier performance, we considered several metrics: AUC, ACC, Sensitivity (SEN, number of correctly classified pMCI subjects divided by the total number of pMCI subjects and Specificity (SPE, number of correctly classified sMCI subjects divided by the total number of sMCI subjects). We selected AUC as our principal performance measure as it is insensitive to the class-imbalance whereas ACC can be strongly affected by the class-imbalance.
2.3. Classifiers
To evaluate each feature set, we considered two types of widely used supervised learning classifiers: Support Vector Machine (SVM) [40] and elastic-net Regularized Logistic Regression (RLR) [41]. Accessible description of these learning methods can be found in [42]. For the SVM implementation, we used the Python open source library Scikit-learn (http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html), which is based on the LIBSVM implementation (https://www.csie.ntu.edu.tw/~cjlin/libsvm/). For the RLR classifiers, we applied the GLMNET Python library (https://web.stanford.edu/~hastie/glmnet_python/), which solves the resulting penalized optimization problem by a coordinate descent algorithm. We note that both of these learning algorithms tolerate high-dimensional data via reg-ularization and are therefore suited for the cases where the number of features is higher than the number of subjects. Especially, elastic-net includes an L1-penalty, which leads to feature selection embedded to the classifier learning [43]. A large majority of supervised learning techniques have utilized these learning algorithms [22] and a comparison of different classification algorithms MCI-to-AD prediction is available in [44]. We note that we did not include Random Forests [45] as these are not straight-forwardly suitable for high-dimensional small-sample problems and the computation time and memory requirements for nearly all implementations would be prohibiting for the voxel-based features (however, see [46]).
For the case of the SVM classifier, we decided to use the linear SVM (we have also analyzed the possibility of using a RBF (Radial Basis Function) kernel, however, experimental results showed similar performances). In this way, we had to select only the soft margin parameter, C, whose value was explored among the set {10−5,10−4,10−3,10−2,10−1, 1,10,10,10} (see [40] for notation). Despite considering the linear SVM, its implementation was carried out in the dual space, precomputing a linear kernel; in this way, we simplified the calculations and reduced the computation time with the high dimensional feature representations, such as VBM ones.
For the RLR classifier, using the notation of [41], we set the parameter of the elastic net α to 0.5, just in between lasso (α=1) and ridge (α = 0) regularization. The principal regularization parameter of the RLR (λ), which sets the balance between the regularization and the data terms, was chosen among the set of values {10−10,10−4,10−3, 5 · 10−3,10−2, 5 · 10−2,10−1, 5 · 10−1}.
Finally, as a step prior to training the classifiers, we normalized the data by removing its mean and scaling it to the unit variance.
2.4. Statistical test
To compare the AUC values provided by different approaches, we applied the corrected resampled t-test [47]. The problem in applying standard statistical methodology, such as uncorrected t-test to assess the differences between AUCs is that r × k AUC values from in a k-fold CV repeated r times are not statistically independent. Instead, the corrected resampled t-test assumes dependency among the AUCs in a k-fold CV repeated r times and, therefore, it allows to statistically compare two mean AUC values by correcting the variance estimation. The corrected resampled t-test can be seen as an improvement over the 5x2 CV of [48] and McNemar’s test for the classification accuracy [47]. Although the test was developed for the classification accuracy, it is as well applicable for testing the differences between AUCs.
To describe the test formally, let n1 and n2, respectively, denote the number of instances used for training and testing in each fold, and aij and bij represent the AUCs of the i-th fold and j-th run of the method A and B with i = 1,…, k and j = 1,…, r. Denoting the estimated mean and variance values of the differences between methods A and B by and i.e., we can estimate the statistic of the test, t, as:
The statistic t follows a student’s t-distribution with kr - 1 degrees of freedom. In our case, r = k = 10.
3. Results
Tables 3, 4 and 5 show, respectively, the results for the QC dataset and non-QC datasets using the AUC for model selection and the results of the non-QC dataset when the best model was selected using ACC. In particular, each table includes for the SVM and RLR classifiers the values of AUC (area under the curve), ACC (accuracy), SEN (sensitivity), SPE (specificity), as well as three p-values from hypothesis tests comparing the AUCs: pAge (comparing age removed features vs. non age removed features), pHippo (comparing hippocampus features with the remaining features for the age removed case) and pClass (comparing SVM vs. LR results over the same set of features).
The AUC values of region features were the highest in all the experiments. However, the performance improvement over the hippocampus feature set, which was our baseline, did not reach the statistical significance and these improved AUCs need to be interpreted with care. In the particular case of the non-QC dataset and the RLR classifier, the regions feature set produced significantly higher AUC than hippocampus volumes.
Figure 1 depicts the ROC curves for the different feature sets under study for the RLR classifier in the non-QC dataset. Focusing on the center of these curves (see the panel 1b), we can corroborate that the region feature set appeared superior, but the performance differences were small. To avoid crowding, the ROCs of the PCA voxel feature set was not visualized as it always performed worse than the voxel features without PCA. For similar reason, the figure only displays the ROC curves of raw Hippocampus and Hippocampus + Entorhinal cortex volumes and not the ICV-normalized ones. The same principle will be followed in later figures.
Regarding the use of two different classifiers, differences between AUCs of SVM and RLR were not significant. However, SVM yielded low specificity values and the relation between SPE and SEN was more balanced with the LR classifier. Because of this we studied whether the use of AUC as the model selection criteria contributed to this imbalance with the SVM classifier. Using ACC as a model selection criterion avoids this SPE/SEN imbalance, as can be seen in Figure 2 where the specificity values are compared between ACC and AUC based model selection. As the comparison of Tables 4 and 5 reveals, the final AUC values did not markedly differ between the two model selectors.
We evaluated the effects of age removal on the feature sets. For this purpose, Figure 3 shows a detailed analysis of the advantages of removing the age effects. As a result, classification scores improved for every age removed effects feature set (see the panel 3c). However, as visible in Tables 3 and 4, significant improvement (p-value < 0.1) was observed only for hippocampus and hippocampus + entorhinal volume feature sets.
The differences between the AUCs of raw and ICV-normalized hippocampus and hippocampus + entorhinal volumes were not significant. Surprisingly, the raw volumes performed slightly better in terms of AUC within each dataset. However, this result agrees with findings in [35, 27] and it is not central for the purposes of this work to analyze the potential reasons for this result.
Finally, Figure 4 shows the differences between QC and non QC datasets when age effects were removed. As expected Hippocampus and Hippocam-pus plus Entorhinal volumes were benefited from the Quality Control process, whereas remaining features sets resulted in better performances when all the available data were used.
4. Discussion
In this work, we compared six different feature representations of MRI for predicting the AD conversion in MCI subjects. The feature sets we studied varied from high dimensional feature sets produced by VBM via regional cortical thickness, surface area, and volumetry to simple and easily interpretable features such as hippocampus and entorhinal cortex volumes (see Table 2). We addressed the feature representations using two learning algorithms, SVM and RLR, and with several metrics, AUC, ACC, SEN and SPE, that gave a reliable insight into the relative performance of different feature sets. AUC was selected as the principal figure of merit, due to its insensitivity to the class imbalance (note that the datasets contained twice the number of pMCIs (subjects who converted to AD) compared to sMCIs (subjects who remained as MCIs)). The evaluation process was carried out with a nested fold CV repeated 10 times ensuring the insensitivity of the conclusions to random train/test division of the holdout method used previously [28]. Selecting the parameters of the classifiers inside nested CV ensures that there are no biases towards particular feature representations due to arbitrarily selected classifier parameters.
We found that age-corrected regions feature set (see https://github.com/MartaGomez/Regions-list-/wiki/Regions-list for a detailed description) outperformed the remaining feature sets, specifically in AUC, even though the improvement did not reach statistical significance. This result suggests that regions based features were equal or better predictors than the left and right hippocampal volumes (HV) alone (which were included in the region feature set). This is interesting as a recent study [21] concluded that HV had the highest AUC among a set of individual regional volume features and was better in terms of the prognostic efficacy of combining various volumetrics. Their experimental setting was similar to the one analyzed here, however, with three main differences. First, removing age related effects from MRI data was not considered; second, the set of pMCI patients was about half of ours; and, third, the combined volumetric analysis did not consider measures such as surface area or cortical thickness. This can explain the improvement in the best classification accuracy from 69 % of [21] to 80 % in the present study.
Voxel-based representations did not perform well in this study when coupled with standard feature reduction techniques (elastic-net or PCA). This was in contrast to a recent data-analysis competition, where the goal was to classify subjects into NC, MCI, and AD categories based on MRI [26]. However, as multiple factors have effect to a performance of an approach in a data analysis competition, definite conclusions on feature representations cannot be made based on such competitions. However, also in our own experience, voxel-based methods, coupled with elastic-net feature selection, perform well in classifying between NC and AD or NC and MCI [31]. These discrepancies may suggest that NC vs. MCI (or AD) classification and AD-conversion prediction have different characteristics. Further, we note that feature pre-selection based on AD and NC data suggested by Moradi et al. [30] improved the conversion prediction accuracy markedly.
Retico et al. found that the voxel based VBM features best discriminate between sMCI and pMCI after applying Recursive Feature Elimination (RFE) [20]. However, again, the maximum accuracy in [20] was much lower than the accuracies in the present study and pMCI vs. sMCI classifiers were trained only using AD and NC subjects that may explain this. Additionally, the statistical framework was incomplete as no hypothesis testing was done and the exact definition of stable MCI class remained unclear. Other works, such as [18], concluded that the combination of different feature representations resulted into a better classification accuracy than one representation alone. Again, the classification accuracies were lower than in the present work. Moreover, [18] selected classifier hyperparameters based on test data that may cause upward bias in the reported accuracies [15].
It is important to point out that while our classification accuracies were better than those in the studies reviewed above, the performance measures are not directly comparable because different definitions of pMCI and sMCI. In fact, this is a problem that complicates the comparison of ML methods for this particular application and it is reviewed in further length in [22]. Namely, the definition of sMCI subject based on a certain cutoff (say 3 years) is problematic as this simple criterion would place a subject who received an AD diagnosis 4 years after the baseline visit into the sMCI category. Our view is that this would create unrealistic heterogeneity into the sMCI class and therefore tracking subjects’ status after the cutoff is necessary (if possible). We have populated our sMCI category based on all the information available by ADNI.
Regarding the used ML methods, RLR provided, in general, similar AUC values than SVM, but had an advantage of higher specificity (it classified sMCI cases much better than the SVM did). SVM had a tendency of overpopulating the pMCI class. However, in the case of SVM, low specificity seemed to depend on the using AUC as the criterion for the hyperparameter selection. The values in Table 5 reveal how selecting the hyperparameters instead through ACC resulted in an overall improvement of specificity with a small loss of sensitivity.
This is an interesting phenomenon, as it seems to be a problem of a specific class of learning algorithms, which invites further research. However, as this issue is not central to the goals of this work, we do not analyze it further.
There were no significant differences between the classification accuracies or AUCs obtained with non-QC and QC datasets. However, the small differences between the two datasets were as expected as shown in Figure 4. For Hippocampus and Hippocampus and Entorhinal volumes, the QC was moderately useful whereas for the Moradi and Voxel based features it was moderately detrimental. This is as expected since the QC was based on Freesurfer segmentations (as Hippocampus and Entorhinal volumes) but the voxel-based and Moradi features were not. Interestingly, for region based features (also based on Freesurfer segmentation), the QC seemed not to influence the performance of the classifier.
It is remarkable that the age removal seem to be a key for better performances. As Figure 3 illustrates, age removal always led to better classification performances, although the improvements were not always statistically significant. This agrees with a recent work of [31] which demonstrated the same for NC vs. MCI classification.
5. Conclusion
This paper evaluated the performance of various types of MRI features for the future AD conversion prediction and it also analyzed the performance of each feature set over two classifiers (Support Vector Machines and Regularized Logistic Regression) and with and without applying an age correction process.
Experimental results showed that regional features consistently yielded the best performance, although the performance difference to other features was not statistically significant. Besides, the age removal seemed to be a key for better performances, but the improvement reached statistical significance only rarely.
Acknowledgments
Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimers Association; Alzheimers Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics.
J. Tohka’s work was supported by the Academy of Finland and V. Gómez-Verdejo’s work has been partly funded by the Spanish MINECO grant TEC2014-52289R and TEC2016-81900-REDT/AEI.
Footnotes
↵1 Information and data can be found at adni.loni.usc.edu
↵3 Originally, this set included 274 measures. We selected a subset of 256 regions from the aforementioned 274 measures discarding the regions that presented missed data. A more detailed description of the 256 features is provided in https://github.com/MartaGomez/Regions-list-/wiki/Regions-list.
↵4 https://adni.bitbucket.io/reference/docs/UCSFFRESFR/UCSFFreeSurferMethodsSummary.pdf