Abstract
Cognitive ability is a complex product of brain processes, yet it remains unknown how brain structure and function together lead to individual differences in developing cognitive skills. We performed multimodal neuroimaging in 1601 youths age 8-22 on the same 3-Tesla Magnetic Resonance Imaging scanner with contemporaneous neurocognitive assessment. Across age groups, high performers had larger volumes, greater gray matter density, lower mean diffusivity and lower cerebral blood flow, compared to low performers. These effects, which varied by region, were robust in males and females across age groups, but larger in females for volume and in adult males for gray matter density and cerebral blood flow. Amplitude of low frequency fluctuations and regional homogeneity values were positively related to performance in males but not in adult females for most regions. Regions showing the strongest associations with performance included baso-striatal (thalamus), limbic (hippocampus), frontal (orbital and midfrontal), temporal (midtemporal), parietal (precuneus and superior parietal) and occipital cortex (fusiform and lingual). Cross-validated regularized regressions of brain parameters combined explained ∼20% of performance variance. Our cross-modal results indicate that abundance and integrity of neural tissue, as well as the maintenance of low energy metabolism while at rest, combine to optimize cognitive performance in humans.
The relation between brain volume and cognitive performance has received extensive investigation, although its magnitude is debated; estimates of variance in cognitive measures explained by volume range from ∼3% to >30% (Witelson, Beresh, & Kigar, 2005; Gignac and Bates, 2017; Nave et al., 2018; Pietschnig et al., 2015). Advanced neuroimaging offers additional parameters of brain structure and function, but few studies linked such parameters to cognitive performance. Ritchie et al. (2015) compared the relationship of volume to performance with more complex neuroanatomic parameters such as cortical thickness and found that volume accounted for the largest share of the variance (around 12%). Ryman et al. (2016) applied graph-theory analyses to volumetric data and reported that in males a latent factor of fronto-parietal gray matter related to general cognitive abilities, while in females it was white matter efficiency and total gray matter volume, with no specificity. Studies with relatively small samples have linked some diffusion tensor imaging (DTI) parameters to performance (e.g., Schmithorst, Wilke, Dardzinski, Holland, 2005; Qiu, Tan, Zhou and Khong, 2008; Genç et al., 2018). A study on a larger sample (N=72) applied graph theory to DTI data and reported sex differences, with females having greater local efficiency (Yan et al., 2011). Studies relating cerebral blood flow (CBF) to performance were done in the elderly, showing that age-related decline in CBF is related to performance decline (Hshieh et al., 2017, Rane et al., 2018). Resting-state functional MRI (rs_fMRI) was examined in relation to cognitive performance in a small sample (Pamplona et al., 2015), and in a larger sample (N=79) by Vakhtin et al. (2014), who examined both resting state and task-activated relations to connectivity. 
They reported that the regions involved in task related networks included bilateral medial frontal and parietal cortex, right superior frontal lobule, and the right cingulate gyrus. As part of multivariate measures in the Human Connectome Project (Smith et al., 2015), Finn et al. (2015) showed that connectivity profiles predicted “levels of fluid intelligence” and that “the same networks that were most discriminating of individuals were also most predictive of cognitive behavior”. Indeed, Yoo et al. (2018) reported, across several data sets, that models trained on task data outperformed those trained on rest data in predicting performance. Furthermore, models trained to predict attention performance were effective in predicting not only across samples (Fong et al., 2019), but also reading recall (Jangraw et al., 2018). Notably, Greene, Gao, Scheinost, & Constable (2018) demonstrated, in two large data sets, that predictive models built from task fMRI data outperform models built from resting-state fMRI. Finally, Dubois, Galdi, Paul, & Adolphs (2018) were able to predict up to 20% of variance in general cognitive performance based on resting-state connectivity metrics in a large sample (n=884) from the HCP. However, each of these studies examined individual parameters of either structure or function and there is need for simultaneous examination of neuroanatomic, neurophysiologic and connectivity parameters in relation to cognitive performance. Such simultaneous examination will allow gauging the relative contribution of the anatomic and physiologic parameters to cognitive performance.
We conducted multimodal neuroimaging in a prospective sample of 1601 youths age 8 to 22, all studied on the same high-field (3 Tesla) scanner with contemporaneously obtained measures of cognitive performance, as part of the Philadelphia Neurodevelopmental Cohort (Gur et al., 2012; Calkins et al., 2015). The methods of sample ascertainment and the detailed neuroimaging protocols have been published (Satterthwaite et al., 2014) and are summarized in the Methods. Multimodal neuroimaging yielded regional measures of gray matter (GM) and white matter (WM) volume and GM density (GMD) from T1-weighted scans, mean diffusivity (MD) from diffusion tensor imaging (DTI), resting state cerebral blood flow (CBF) from arterial spin-labeled (ASL) sequences and amplitude of low frequency fluctuations (ALFF) and regional homogeneity (ReHo) measures from rs_fMRI. The methods for image acquisition and processing and for obtaining these brain parameters were detailed in previous publications (Gennatas et al., 2017, Ingalhalikar et al., 2014, Satterthwaite et al., 2014a,b), and are briefly summarized below. The neurocognitive assessment provided measures of accuracy and speed on multiple behavioral domains. Since available literature primarily examined general intellectual functioning, we used as primary cognitive measure the comparable score from the battery, which is a factor score that summarizes accuracy on executive functioning and complex cognition (Moore et al., 2015).
MATERIALS AND METHODS
Participants
Participants for the Philadelphia Neurodevelopmental Cohort (PNC) were recruited from the Children’s Hospital of Philadelphia (CHOP) pediatric clinics throughout the Delaware Valley as described in Calkins et al. (2015). A subsample of 1,601 participants (out of the 9,498 in the PNC sample) underwent multimodal neuroimaging as described in Satterthwaite et al. (2014). Of these, 340 were excluded for medical disorders that could affect brain function, such as current use of psychoactive medications, prior inpatient psychiatric treatment, or an incidentally encountered structural brain abnormality. Sample size was further reduced for some modalities following quality assurance procedures, mostly for excessive motion. All participants underwent psychiatric assessment (Calkins et al., 2015) and neurocognitive testing (Gur et al., 2012, 2015).
Cognitive measures
Cognitive performance was assessed with the Penn Computerized Neurocognitive Battery (CNB). The CNB consists of 14 tests adapted from tasks applied in functional neuroimaging to evaluate a range of cognitive domains (Gur et al., 2010, 2012, 2014; Moore et al., 2015; Roalf et al., 2014). These domains include executive control (abstraction and mental flexibility, attention, working memory), episodic memory (verbal, facial, spatial), complex cognition (verbal reasoning, nonverbal reasoning, spatial processing), social cognition (emotion identification, emotion intensity differentiation, age differentiation), and sensorimotor and motor speed. Accuracy and speed for each test were z-transformed. Cognitive performance was summarized by a factor analysis of both speed and accuracy data (Moore et al., 2015), which delineated three accuracy factors corresponding to: 1) executive function and complex reasoning, 2) episodic memory, and 3) social cognition. The first factor was used as the measure of cognitive performance in all analyses.
Neuroimaging
All MRI scans were acquired on a single 3T Siemens TIM Trio wholebody scanner located in the Hospital of the University of Pennsylvania. Signal excitation and reception were obtained using a quadrature body coil for transmit and a 32-channel head coil for receive. Gradient performance was 45 mT/m, with a maximum slew rate of 200 T/m/s.
Structural
Parameters of brain anatomy were derived from volumetric scans (T1-weighted) and from diffusion tensor imaging.
Volumetric MRI
Brain volumetric imaging was obtained using a magnetization prepared, rapid-acquisition gradient-echo (MPRAGE) sequence. Receive coil shading was reduced by selecting the Siemens prescan normalize option, which is based on a body coil reference scan. Image quality assessment (QA) was performed both by visual inspection and with algorithms to detect artifacts such as those related to excessive head motion.
To maximize accuracy, advanced structural image processing, quality assurance and registration procedures were employed for measurement of the cortical volume and gray matter density. Estimation of brain regions used a multi-atlas labeling approach. A set of 24 young adult T1-weighted volumes from the OASIS data set (Marcus et al., 2007) were manually labeled and registered to each subject’s T1-weighted volume using the top-performing SyN diffeomorphic registration (Avants et al., 2011a; Klein et al., 2010). These label sets were synthesized into a final parcellation using joint label fusion (Wang et al., 2013). Volume was determined for each parcel using the intersection between the parcel created and prior driven gray matter cortical segmentation from the ANTs cortical thickness pipeline as described below. Density estimates were calculated within each parcel as described below. To avoid registration bias and maximize sensitivity to detect regional effects that can be impacted by registration error, a custom adolescent template and tissue priors were created using data from 140 PNC participants, balanced for age and sex. Structural images were then processed and registered to this custom template using the ANTs cortical thickness pipeline (Tustison et al., 2014). This procedure includes brain extraction, N4 bias field correction (Tustison et al., 2010), Atropos tissue segmentation (Avants et al., 2011b), and SyN diffeomorphic registration method (Avants et al., 2011a; Klein et al., 2010).
Finally, gray matter density was calculated using Atropos (Avants et al., 2011b), with an iterative segmentation procedure that is initialized using 3-class K-means segmentation. This procedure produces both a discrete 3-class hard segmentation as well as a probabilistic gray matter density map (soft segmentation) for each subject. Gray matter density (GMD) was calculated within the intersection of this 3-class segmentation and the subject’s volumetric parcellation (Gennatas et al., 2017). Images included in the final analysis passed a rigorous QA procedure as previously detailed (Rosen et al., 2018).
Diffusion (DTI)
Diffusion weighted imaging (DWI) scans for measuring water diffusion were obtained using a twice-refocused spin echo (TRSE) single-shot EPI sequence. The sequence employs a four lobed diffusion encoding gradient scheme combined with a 90-180-180 spin-echo sequence designed to minimize eddy-current artifacts. The sequence consisted of 64 diffusion-weighted directions with b = 1000 s/mm2, and 7 scans with b = 0 s/mm2.
Diffusion data were skull stripped by generating a brain mask for each subject by registering a binary mask of a standard image (FMRIB58_FA) to each subject’s brain using FLIRT (Jenkinson et al., 2002). When necessary, manual adjustments were made to this mask. Next, eddy currents and movement were estimated and corrected using FSL’s eddy tool (Andersson and Sotiropoulos, 2016; Graham et al., 2016; Roalf et al., 2016). Eddy improves upon FSL’s Diffusion Tool Box (Behrens et al., 2003) and eddy correct tool (Andersson and Sotiropoulos, 2016; Graham et al., 2016) by simultaneously modeling the effects of diffusion eddy current and head movement on DTI images, reducing the amount of resampling. The diffusion gradient vectors were rotated to adjust for motion using the 6-parameter motion output generated from eddy. Then, the B0 field map was estimated and distortion correction was applied to the DTI data using FSL’s FUGUE (Smith, 2002). Finally, the diffusion tensor was modeled and metrics (FA and MD) were estimated at each voxel using FSL’s DTIFIT.
Registration from native space to a template space was completed using DTI-TK (Zhang et al., 2014; Zhang et al., 2006). First, DTI output files from DTIFIT were converted to DTI-TK format. Next, a template was generated from the tensor volumes using 14 representative diffusion data sets that were considered “Excellent” from the PNC sample. One individual from each of the 14 ages (age range 8-21) was randomly selected. These 14 DTI volumes were averaged to create an initial template. Next, data from the 14 subjects were registered to this template in an iterative manner. Unlike standard intensity-based registration algorithms, this process utilizes the full tensor information in an attempt to best align the underlying white matter tracts using iterations of rigid, affine and diffeomorphic registration leading to the generation of a successively refined template. Ultimately, one high-resolution refined template was created and used for registration of the remaining diffusion datasets. All DTI maps were then registered (rigid, affine, diffeomorphic) to the high-resolution study-specific template using DTI-TK. Whole brain analysis was performed using a customized implementation of tract-based spatial statistics (TBSS) (Bach et al., 2014). FA and MD values were computed using a study specific white matter skeleton. Then, standard regions of interest (ROI; ICBM-JHU White Matter Tracts; Harvard-Oxford Atlas) were registered from MNI152 space to the study-specific template using ANTs registration (Avants et al., 2011). Mean diffusion metrics were extracted from these ROIs using FSL’s ‘fslmeants’. Images included in this final analysis had passed a stringent quality assessment procedure as previously detailed (Roalf et al., 2016).
Functional
Perfusion (ASL)
Brain perfusion was imaged using a pseudo continuous arterial spin labeling (pCASL) sequence (Wu et al., 2007). The sequence used a single-shot spin-echo EPI readout. Parallel acceleration (i.e. GRAPPA factor = 2) was used to reduce the minimum achievable echo time. The arterial spin labeling parameters were: label duration = 1500 ms, post-label delay = 1200 ms, labeling plane = 90 mm inferior to the center slice. The sequence alternated between label and control acquisitions for a total of 80 acquired volumes (40 labels and 40 controls), the first being a label.
ASL data were pre-processed using standard tools included with FSL (Jenkinson et al., 2012). Following distortion correction using the B0 map with FUGUE, the first four image pairs were removed, the time series was realigned in MCFLIRT (Jenkinson et al., 2002), the skull was removed with BET (Smith, 2002), and the image was smoothed at 6mm FWHM using SUSAN (Smith and Brady, 1997). CBF was quantified from control-label pairs using ASL Toolbox (Wang et al., 2008). As prior (Satterthwaite et al., 2014), the T1 relaxation parameter was modeled on an age- and sex-specific basis (Wu et al., 2010). This model accounts for the fact that T1 relaxation time differs according to age and sex, and has been shown to enhance the accuracy and reliability of results in developmental samples (Jain et al., 2012). The CBF image was co-registered to the T1 image using boundary-based registration (Greve and Fischl, 2009), and regional CBF values were averaged within each parcel. Subjects included in this analysis had low motion as measured by mean relative framewise displacement less than 2.5 mm.
Resting-state BOLD
Resting-state BOLD scans were acquired with a single-shot, interleaved multi-slice, gradient-echo, echo planar imaging (GE-EPI) sequence. In order to reach steady-state signal levels, the sequence performed two additional dummy scans at the start. The imaging volume was sufficient to cover the entire cerebrum of all subjects, starting superiorly at the apex. In some subjects, the inferior portion of the cerebellum could not be completely included within the imaging volume. The selection of imaging parameters was driven by the goal of achieving whole brain coverage with acceptable image repetition time (i.e. TR = 3000 ms). A voxel resolution of 3 × 3 × 3 mm with 46 slices was the highest obtainable resolution that satisfied those constraints. During the resting-state scan, a fixation cross was displayed as images were acquired. Participants were instructed to stay awake, keep their eyes open, fixate on the displayed crosshair, and remain still. Resting state scan duration was 6.2 min.
Task-free functional images were processed using a top-performing pipeline for removal of motion-related artifact (Ciric et al., 2017). Preprocessing steps included (1) correction for distortions induced by magnetic field inhomogeneities using FSL’s FUGUE utility, (2) removal of the 4 initial volumes of each acquisition, (3) realignment of all volumes to a selected reference volume using MCFLIRT (Jenkinson et al., 2002), (4) removal of and interpolation over intensity outliers in each voxel’s time series using AFNI’s 3dDespike utility, (5) demeaning and removal of any linear or quadratic trends, and (6) co-registration of functional data to the high-resolution structural image using boundary-based registration (Greve and Fischl, 2009). The artifactual variance in the data was modelled using a total of 36 parameters, including the 6 framewise estimates of motion, the mean signal extracted from eroded white matter and cerebrospinal fluid compartments, the mean signal extracted from the entire brain, the derivatives of each of these 9 parameters, and quadratic terms of each of the 9 parameters and their derivatives. Both the BOLD-weighted time series and the artifactual model time series were temporally filtered using a first-order Butterworth filter with a passband between 0.01 and 0.08 Hz.
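The arithmetic behind the 36-parameter confound model (9 base signals, their 9 temporal derivatives, and the squares of all 18) can be sketched as follows. This is an illustrative sketch only, not the pipeline used in the study: the function name `expand_confounds` is hypothetical, the derivative is assumed to be a backward difference, and the first frame's derivative is assumed to be zero.

```python
def expand_confounds(base):
    """Expand T rows of 9 base confounds into T rows of 36 regressors.

    base: list of T rows, each [6 motion estimates, WM, CSF, global signal].
    Column layout of the result (an assumption of this sketch):
      0-8 base signals, 9-17 derivatives, 18-26 base squared, 27-35 derivatives squared.
    """
    n_confounds = len(base[0])
    design = []
    for t, row in enumerate(base):
        if t == 0:
            # No previous frame exists, so the derivative is set to zero (assumption).
            deriv = [0.0] * n_confounds
        else:
            # Backward difference as a simple temporal derivative (assumption).
            deriv = [row[j] - base[t - 1][j] for j in range(n_confounds)]
        first_order = list(row) + deriv
        squares = [value * value for value in first_order]
        design.append(first_order + squares)
    return design


# Toy input: 4 time points, 9 base confounds with made-up values.
toy_base = [[float(t * (j + 1)) for j in range(9)] for t in range(4)]
toy_design = expand_confounds(toy_base)
print(len(toy_design), "time points x", len(toy_design[0]), "regressors")
```

In practice these regressors would be band-pass filtered in the same way as the BOLD series before being regressed out, as described above.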
ReHo
Voxelwise regional homogeneity (ReHo; Zang et al., 2004) is equivalent to Kendall’s coefficient of concordance computed over the timeseries in each voxel’s local neighborhood. ReHo can thus be used as an estimate of the homogeneity of each neighborhood’s activation pattern. Because spatial smoothing intrinsically elevates ReHo estimates by elevating spatial autocorrelation, Kendall’s W was computed only on unsmoothed data. Each voxel’s neighborhood was defined to include the 26 voxels adjoining its faces, edges, and vertices. The voxelwise homogeneity map was subsequently smoothed using a Gaussian kernel with FWHM of 6 mm in SUSAN to improve the signal-to-noise ratio (Smith and Brady, 1997). Finally, regional ReHo values were averaged across the anatomically derived subject-specific segmentation. Participants included in this analysis had low motion, with mean relative framewise displacement less than 2.5 mm.
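For readers unfamiliar with Kendall's coefficient of concordance, a minimal sketch of the computation for a single neighborhood is given below. The function names are hypothetical, ties are handled with average ranks without a tie correction (an assumption), and the code illustrates the statistic rather than the software used in the study. Each input series would be the time series of one voxel in the neighborhood (the center voxel plus its 26 neighbors).

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def kendalls_w(series):
    """Kendall's W for k time series (one per voxel) of equal length n."""
    k = len(series)
    n = len(series[0])
    ranked = [average_ranks(s) for s in series]
    # Sum of ranks across voxels at each time point.
    rank_sums = [sum(r[t] for r in ranked) for t in range(n)]
    mean_rank_sum = k * (n + 1) / 2.0
    squared_dev = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    return 12.0 * squared_dev / (k ** 2 * (n ** 3 - n))
```

W is 1 when every voxel in the neighborhood has the same rank ordering over time and approaches 0 when the orderings are unrelated.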
ALFF
Functional connectivity among brain regions is primarily attributable to correlations among low-frequency fluctuations in regional activation patterns. The voxelwise amplitude of low-frequency fluctuations (ALFF; Zang et al., 2007) was computed as the sum (discretised integral) over frequency bins in the low-frequency (0.01-0.08 Hz) band of the voxelwise power spectrum, computed using a Fourier transform of the time-domain of the voxelwise signal. ALFF was calculated on data smoothed in SUSAN using a Gaussian-weighted kernel with 6 mm FWHM (Smith and Brady, 1997).
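The band-limited sum described above can be sketched for one voxel's time series as follows. This is a minimal illustration under stated assumptions: the function name `alff` is hypothetical, the power in each bin is normalized by the number of time points, and a direct discrete Fourier transform is used for clarity rather than speed. Some ALFF implementations take the square root of the power before summing; the sketch follows the wording of the text instead.

```python
import cmath
import math


def alff(signal, tr, low=0.01, high=0.08):
    """Sum of spectral power in the [low, high] Hz band for one time series.

    signal: list of samples; tr: repetition time in seconds.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [value - mean for value in signal]
    total = 0.0
    # Only positive-frequency bins are visited; bin k corresponds to k / (n * tr) Hz.
    for k in range(1, n // 2 + 1):
        freq = k / (n * tr)
        if low <= freq <= high:
            coef = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                       for t, x in enumerate(centered))
            total += abs(coef) ** 2 / n
    return total


# Toy example: TR = 3 s as in the acquisition above; 120 volumes is a made-up length.
tr_seconds = 3.0
slow_wave = [math.sin(2 * math.pi * 0.05 * t * tr_seconds) for t in range(120)]
fast_wave = [math.sin(2 * math.pi * 0.15 * t * tr_seconds) for t in range(120)]
print(alff(slow_wave, tr_seconds), alff(fast_wave, tr_seconds))
```

A fluctuation at 0.05 Hz falls inside the band and contributes to the sum, whereas one at 0.15 Hz falls outside it and contributes essentially nothing.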
Brain and Behavior Associations
Performance prediction
Performance on the CNB (Moore et al., 2015) was then predicted using each aforementioned modality. Subjects included in this analysis passed each modality-specific inclusion criterion and received the cognitive battery within a year of the neuroimaging. Age effects were removed from the data by regressing the linear, quadratic, and cubic effects of age out of each regional value; bilateral regions were then averaged across hemispheres. Models predicting performance using all regional values and a global summary metric from each modality were built in a 10-fold cross-validated fashion within each sex. Stratified samples were built using the ‘createFolds’ function from the ‘caret’ (Kuhn et al., 2016) package in R. Within each training fold, a ridge regression model was built modeling performance as predicted by every regional modality estimate and global summary metric using the ‘glmnet’ function in the ‘glmnet’ (Friedman et al., 2017) package. The optimal lambda was tuned within each training fold by finding the lambda value with the lowest cross-validated error within the training fold. This procedure was performed 1000 times in order to obtain confidence intervals and means of cross-validated R2 values. Finally, statistical significance was determined by comparing these point estimates to empirical null distributions of cross-validated R2 values under the null hypothesis (computed by running the models 1000 times on randomly permuted cognitive estimates).
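The logic of this procedure (age residualization followed by ridge regression with the penalty tuned inside each training fold) can be sketched in plain Python. This is not the R code used in the study: all function names are hypothetical, folds are random rather than stratified, predictors are centered but not standardized, and the penalty is not on the same scale as glmnet's lambda. Age is centered before forming polynomial terms purely for numerical stability, which leaves the residuals unchanged.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(x, y, lam):
    """Ridge regression on centered data; returns (intercept, coefficients)."""
    n, p = len(x), len(x[0])
    x_mean = [sum(row[j] for row in x) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in x]
    gram = [[sum(r[i] * r[j] for r in xc) + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    xty = [sum(r[j] * (v - y_mean) for r, v in zip(xc, y)) for j in range(p)]
    coefs = solve(gram, xty)
    return y_mean - sum(c * m for c, m in zip(coefs, x_mean)), coefs


def predict(model, x):
    intercept, coefs = model
    return [intercept + sum(c * v for c, v in zip(coefs, row)) for row in x]


def residualize_on_age(values, ages):
    """Remove linear, quadratic and cubic age effects from one regional measure."""
    mean_age = sum(ages) / len(ages)
    terms = [[a - mean_age, (a - mean_age) ** 2, (a - mean_age) ** 3] for a in ages]
    fitted = predict(ridge_fit(terms, values, 0.0), terms)
    return [v - f for v, f in zip(values, fitted)]


def make_folds(n, k, rng):
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cv_predict(x, y, folds, fit):
    """Out-of-fold predictions; fit(x_train, y_train) must return a model."""
    preds = [0.0] * len(y)
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, predict(model, [x[i] for i in test])):
            preds[i] = p
    return preds


def fit_tuned_ridge(x, y, lambdas, k, rng):
    """Pick the penalty with the lowest inner cross-validated error, then refit."""
    folds = make_folds(len(y), k, rng)

    def inner_error(lam):
        preds = cv_predict(x, y, folds, lambda a, b: ridge_fit(a, b, lam))
        return sum((p - t) ** 2 for p, t in zip(preds, y))

    return ridge_fit(x, y, min(lambdas, key=inner_error))


def nested_cv_r2(x, y, lambdas, outer_k=10, inner_k=5, seed=0):
    """Cross-validated R2 with the penalty tuned inside each training fold."""
    rng = random.Random(seed)
    folds = make_folds(len(y), outer_k, rng)
    preds = cv_predict(x, y, folds,
                       lambda a, b: fit_tuned_ridge(a, b, lambdas, inner_k, rng))
    y_mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    return 1.0 - ss_res / sum((t - y_mean) ** 2 for t in y)
```

Repeating `nested_cv_r2` with different seeds would give a distribution of cross-validated R2 values analogous to the 1000 repetitions described above.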
RESULTS
We used two complementary approaches to examine the relation between variability in brain structural and functional parameters and our performance parameter: hypothesis-testing and data-driven, both incorporating procedures to contain type I error. Analyses were conducted using the open source R platform (Version 3.5, R Core Team, 2015).
Hypothesis-testing results
For the hypothesis-testing approach, analyses were conducted at the global and regional levels by fitting generalized estimating equations (GEEs) with unstructured working correlation structure. GEE models are an extension of generalized linear models that estimate dependence among repeated measures by a user-specified working correlation matrix which allows for correlations in the dependent variable over time or across observations. Five nested forms of GEE models were fit and evaluated as shown below. The null model (Model 1) evaluated the association of the demographic variables, age and sex, on brain parameters. A second model added the performance term to evaluate the association between performance and brain parameters adjusted for demographic variables (Model 2). To evaluate if association of performance and brain parameters differed by sex or by age, interaction terms were added as shown in Model 3 and Model 4 respectively. Finally, to evaluate if association of performance and brain differed by both age and sex, Model 5 included all main effects and all possible interactions. Model performance was compared using a Wald test. A squared age term was included to capture non-linear effects of age. Models were fit at all anatomical specificity levels.
Model 1: Null model: Sex and Age Associations
Model 2: Model Associations of Performance and Brain
Model 3: Model with Sex modifying the associations of Performance and Brain
Model 4: Model with Age modifying the associations of Performance and Brain
Model 5: Model with Age and Sex modifying the associations of Performance and Brain
Global values
The global measures included estimated total brain volume, average whole-brain values of GMD, MD, CBF, ALFF and ReHo. The global measures were scaled within modality prior to fitting the GEE model. This analysis indicated that performance was significantly associated with whole brain global measures across modalities and this association differed by sex and age (GEE, Wald χ2 = 59.51, df=30, p =0.001). Age interactions were examined by dividing the sample into children (ages less than 12), adolescents (ages 12-17) and young adults (ages 18 and older). Performance interactions were examined by dividing the sample into high, middle, and low performance bins based on tertile splits. This interaction (Figure 1, top panel) indicated that while high performers had increased volume and GMD compared to medium and low performance groups, they had lower MD and CBF with no differences in ALFF and ReHo. As can be seen in Figure 1 (bottom panel), effect sizes of differences between performance groups became stronger from childhood to adolescence to adulthood for volume and GMD. For MD and CBF this trend was seen in males, but not in females. For ALFF and ReHo the effect size of increased values in the high performance group reached .5 SDs in the oldest male group, but diminished and even reversed for ALFF in the oldest female group.
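The grouping and effect-size calculations described here can be illustrated with a short sketch. The function names are hypothetical, the tertile boundaries are assigned by rank (one of several reasonable conventions), and Cohen's d is computed with a pooled standard deviation, which is an assumption since the text does not specify the variant used.

```python
import statistics


def tertile_labels(scores):
    """Label each score as 'low', 'middle' or 'high' by rank-based tertiles."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        if rank < n / 3:
            labels[i] = "low"
        elif rank < 2 * n / 3:
            labels[i] = "middle"
        else:
            labels[i] = "high"
    return labels


def age_bin(age):
    """Age groups as defined in the text: <12, 12-17, 18 and older."""
    if age < 12:
        return "child"
    if age < 18:
        return "adolescent"
    return "young adult"


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_var ** 0.5
```

With these helpers, the effect size for a given brain parameter within one age and sex group would be `cohens_d(high_values, low_values)`, where the two lists hold the parameter values of participants labeled "high" and "low".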
Regional analyses
To examine regional specificity and compare regional differences of interactions with performance, GEE models were fit within each modality. To minimize type I error given the large number of regions (up to 128 regions per modality), we aggregated them into 8 neuroanatomic sections: cerebellum, baso-striatal, limbic, frontal, temporal, parietal, occipital, WM. Section volumes were derived using the sum of all regions involved; for all other brain measures a volume-weighted mean of the regions involved was calculated at the subject level for each brain section. Exploratory analyses to understand neurodevelopmental changes were conducted for those brain sections and in those modalities that showed significant interactions with performance. Significant effects and interactions of performance were elucidated by charting the brain parameter profiles of effect sizes (Cohen’s d) for the differences between the high and low performance groups. The regional maps of effect sizes contrasting high and low performers to the middle group on each parameter are shown in Figure 2.
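The aggregation of regions into sections for a single subject can be sketched as follows. The function name, the dictionary keys, and the example values are all hypothetical; the sketch only illustrates the summed volume and the volume-weighted mean described above.

```python
def section_summaries(regions):
    """Aggregate one subject's regional values into section-level summaries.

    regions: list of dicts with keys 'section', 'volume' and 'value', where
    'value' is any non-volumetric measure (e.g. GMD, MD, CBF) for that region.
    Returns {section: {'volume': summed volume, 'weighted_mean': weighted value}}.
    """
    totals = {}
    for region in regions:
        volume, weighted = totals.get(region["section"], (0.0, 0.0))
        totals[region["section"]] = (
            volume + region["volume"],
            weighted + region["volume"] * region["value"],
        )
    return {
        section: {"volume": volume, "weighted_mean": weighted / volume}
        for section, (volume, weighted) in totals.items()
    }


# Made-up regional values for one subject, for illustration only.
example = [
    {"section": "limbic", "volume": 2.0, "value": 0.50},
    {"section": "limbic", "volume": 6.0, "value": 0.75},
    {"section": "frontal", "volume": 10.0, "value": 0.40},
]
print(section_summaries(example))
```

Weighting by volume means that larger regions contribute proportionally more to the section mean than smaller ones.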
For volume, there was a significant performance by brain section interaction (GEE, Wald χ2 = 191.24, df=8, p < 0.0001) justifying exploratory analyses within each brain section. The exploratory analyses showed that performance was significantly associated with brain parameters for all brain sections (Supplementary Table S3). Higher brain volume was associated with better performance in specific brain regions in all sections (all Performance x Region interactions highly significant within each section, Table S3). Notably, no higher-order interactions with sex or age were significant, indicating relative uniformity of effects in males and females across youth. To elucidate the performance x region interactions, we therefore examined age-adjusted differences between high and low performance groups. As can be seen in Figure 3 (top panel), effect sizes showing higher volumes in high performance compared to low performance groups ranged from small (.1 to .2 SD difference) to large (> .8 SD difference). The largest effect sizes for cerebellum were in the exterior cerebellum and in the cerebellar vermal lobules VII-X (.6 and .8 SDs, respectively), the largest for baso-striatal section was for the thalamus (.6 and .8 SDs for males and females, respectively) and the largest effect size of limbic regions was in the hippocampus (same order of magnitude). Frontal regions showed fairly uniform effect sizes, the largest being for posterior orbital gyrus (>.6 SDs), while temporal regions showed large variability, with the strongest effect sizes for temporal pole, fusiform gyrus, inferior temporal gyrus and mid-temporal gyrus (.6 to .8 SDs). Parietal and occipital lobes also showed variability of effect sizes, the largest effect for parietal was in the precuneus (>.6 SDs), while in occipital the largest effect sizes were in occipital fusiform gyrus, lingual gyrus, inferior occipital gyrus and occipital pole (.7 to .8 SDs). 
For WM, effects were generally larger for females than males and were the largest for temporal and parietal WM.
For GMD, there was a significant performance by brain section interaction that differed by age and sex (GEE, Wald χ2 = 61.15, df=30, p = 0.0007) justifying exploratory analyses within each brain section. The exploratory analyses showed that performance was significantly associated with brain parameters for all sections (Supplementary Table S3). In addition, this association differed significantly by sex and by age within the frontal lobe, and by age for temporal, parietal and occipital lobes (Figure 3, middle panel). High performers had higher GMD across age and sex groups, but in males the effect sizes increased in most regions from childhood to adolescence to young adulthood, while in females they remained generally stable across age bins.
For MD, there was a significant performance by brain section interaction (GEE, Wald χ2 = 108.82, df=8, p < 0.0001) justifying exploratory analyses within each brain section. The exploratory MD analyses showed significant association of performance and brain region for all sections except for baso-striatal (Table S3, Figure 6). In addition, for all brain sections except WM, the association differed by age and sex. Overall, except for some baso-striatal and limbic ROIs showing the opposite effects, across cortical regions, both male and female high-performing groups had lower MD than the low-performing groups. The effect sizes differentiating high and low performers ranged widely from large (>.8) in some orbital frontal regions to moderate in specific cortical regions. This pattern was similar in males and females, displaying large convergence across effect sizes. The effect sizes increased from childhood to adolescence to young adulthood, especially in frontal regions (Figure 3, bottom panel).
For CBF, there was a significant performance by brain section interaction that also differed by age and sex (GEE, Wald χ2 = 67.15, df=35, p = 0.0008) justifying exploratory analyses within each brain section. The exploratory analyses for each brain section showed significant Performance*Region*Age*Sex interactions for limbic, frontal and temporal regions, as well as white matter (Table S3). The interactions for these regions (Figure 4, top panel) indicated that effect sizes of lower CBF associated with high performance were stronger for males than females and showed greater increase from childhood to young adulthood.
For ALFF, there was a significant performance by brain section interaction that also differed by age and sex (GEE, Wald χ2 = 66.42, df=35, p =0.0011) justifying exploratory analyses within each brain section. The exploratory analyses revealed significant four way interactions of Performance*Region*Age*Sex, in all but cerebellum (Table S3), which did show a marginal Performance*Region interaction (p=0.0314). These interactions are illustrated in Figure 4, middle panel. For males, higher ALFF in the high performing groups develops from childhood through adolescence to adulthood. Indeed, high performing children have lower ALFF than their low performing counterparts, but the adult groups are characterized by robust effect sizes, reaching and exceeding .8 SDs in some frontal (medial superior frontal and opercular inferior frontal gyrus) and parietal (precuneus) regions. In females, by contrast, effect sizes are considerably smaller (+/− .4 SDs) and mature in the opposite direction to that of males, with adult high performers having lower ALFF in most regions.
For ReHo, there was a significant performance by brain section interaction (GEE, Wald χ2 = 73.41, df=35, p =0.0002). This analysis also yielded significant interactions with sex and age. The exploratory analyses within a brain section revealed significant 4 way interactions of Performance*Region*Age*Sex, in all brain sections except occipital (Table S3). The occipital regions did show a significant Performance by Region interaction. These interactions are illustrated in Figure 4, bottom panel. As with ALFF, in males the effect sizes became more positive from childhood to adulthood and exceed .7 SDs in frontal (superior frontal and orbital inferior frontal gyrus) and a parietal (precuneus) region. In females, as for ALFF, effect sizes were smaller and became more negative in adults compared to children in most regions.
Data-driven analyses
In the data-driven analysis, we applied ridge regression (Hoerl & Kennard, 1970) to the multimodal brain parameters with all regions allowed to compete for explaining variance in the cognitive performance measure. We employed ridge regression in a cross-validated manner. Data were randomly split into training (90%) and testing (10%) sets; models were built in the training set, which included optimizing the penalty as well as determining the model coefficients; and this trained model was applied to the testing set, resulting in an out-of-sample measure of percent variance explained (R2). The above steps were repeated 1000 times, and the R2 values reported below are the means of these 1000 splits. The same procedure was repeated on the same data, randomly permuting the performance parameter among subjects to generate a null distribution. Notably, all real distributions were significantly greater than the null distributions. As can be seen in Figure 5, volume alone explained the largest portion of variance in performance, producing a cross-validated R2 that exceeds 17% of variance in both males and females. Other parameters explained much smaller portions of the variance, with GMD, MD and CBF explaining similar proportions of variance in males and females, in the range of 4 to 12%, while ALFF and ReHo explained a larger proportion of variance in males than in females. Notably, when all modalities were entered, the ridge regression produced a cross-validated R2 of 20% for both males and females.
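The repeated-split procedure described above can be illustrated with a minimal sketch. It is not the code used in the study: the data are synthetic, the penalty grid and the inner 5-fold selection of the penalty are assumptions (the text states only that the penalty was optimized within the training set), features are assumed to be standardized, and 50 repetitions stand in for the 1000 reported. All function names are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge on centered data; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta


def r2(y_true, y_pred):
    """Proportion of variance explained; can be negative out of sample."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def choose_penalty(X, y, lambdas, rng, n_folds=5):
    """Assumed tuning scheme: best mean R2 over folds of the training set only."""
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for lam in lambdas:
        fold_scores = []
        for k in range(n_folds):
            val = folds[k]
            trn = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            beta, b0 = ridge_fit(X[trn], y[trn], lam)
            fold_scores.append(r2(y[val], X[val] @ beta + b0))
        scores.append(np.mean(fold_scores))
    return lambdas[int(np.argmax(scores))]


def repeated_split_r2(X, y, n_splits=50, test_frac=0.10, permute=False, seed=0):
    """Out-of-sample R2 over random 90/10 splits; permute=True builds the null."""
    rng = np.random.default_rng(seed)
    lambdas = np.logspace(-2, 3, 12)  # assumed penalty grid
    n_test = max(2, int(round(test_frac * len(y))))
    out = []
    for _ in range(n_splits):
        # For the null, shuffle the outcome across subjects before splitting.
        y_use = rng.permutation(y) if permute else y
        idx = rng.permutation(len(y))
        test, train = idx[:n_test], idx[n_test:]
        lam = choose_penalty(X[train], y_use[train], lambdas, rng)
        beta, b0 = ridge_fit(X[train], y_use[train], lam)
        out.append(r2(y_use[test], X[test] @ beta + b0))
    return np.array(out)


# Synthetic stand-in for regional brain parameters and a performance score.
data_rng = np.random.default_rng(42)
n_subjects, n_regions = 300, 20
X = data_rng.standard_normal((n_subjects, n_regions))
true_beta = np.zeros(n_regions)
true_beta[:5] = 0.3
y = X @ true_beta + data_rng.standard_normal(n_subjects)

real = repeated_split_r2(X, y)
null = repeated_split_r2(X, y, permute=True, seed=1)
print("mean out-of-sample R2 (real):", real.mean())
print("mean out-of-sample R2 (null):", null.mean())
```

Because the penalty is chosen using training data only, the test-set R2 remains an out-of-sample estimate, and comparing the real and permuted distributions gives a sense of how far the observed values exceed chance.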
Integration of hypothesis and data-driven results
To examine the relation between the effect sizes produced by contrasting the high and low performance groups and the results of the data-driven ridge regressions, we correlated the effect sizes with the ridge regression coefficients. Ridge coefficients were obtained by training a model in the entire dataset. The results (Figure 6) indicate that the overall correlation was moderate (0.27 in males and 0.28 in females), but this masked high within-modality correlations (ranging from .55 to .79 in males and from .62 to .79 in females). The attenuated overall correlation seems to reflect an amalgamation “Yule-Simpson” effect (Yule, 1903; Blyth, 1972), where the overall lower correlation masks underlying high correlations within each brain parameter. Thus, within each parameter (volume, GMD, etc.), the regions selected as most important by the ridge regression also had larger effect sizes when contrasting high and low performance groups.
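A toy calculation may help show how pooling across modalities can attenuate a correlation that is high within each modality. The numbers below are hypothetical and are not taken from the study; the two "modalities" are constructed so that effect sizes and coefficients are perfectly related within each, but the coefficients sit on different scales.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-region values: effect sizes (SD units) and ridge coefficients.
modalities = {
    "modality_a": {"effect": [0.2, 0.4, 0.6, 0.8], "coef": [0.01, 0.02, 0.03, 0.04]},
    "modality_b": {"effect": [0.2, 0.4, 0.6, 0.8], "coef": [0.30, 0.50, 0.70, 0.90]},
}

# Correlation computed separately within each modality.
within = {name: pearson(m["effect"], m["coef"]) for name, m in modalities.items()}

# Correlation computed after pooling all regions from all modalities.
pooled_effect = [e for m in modalities.values() for e in m["effect"]]
pooled_coef = [c for m in modalities.values() for c in m["coef"]]
pooled = pearson(pooled_effect, pooled_coef)

print("within-modality correlations:", within)
print("pooled correlation:", pooled)
```

In this construction the between-modality difference in coefficient scale adds variance that is unrelated to effect size, so the pooled correlation falls well below the within-modality values even though the relation holds in each modality.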
DISCUSSION
Our results offer fresh insights regarding how cognition is related to multimodal parameters of brain structure and function. Volume was by far the brain parameter most strongly associated with cognitive performance, confirming earlier findings associating higher brain volumes with better cognitive abilities (Witelson, Beresh, & Kigar, 2005; Gignac and Bates, 2017; Nave et al., 2018; Pietschnig et al., 2015). In our sample, effect sizes separating high from low performance groups were moderate to large, and cross-validated R2s for predicting performance based on volume alone exceeded .17 for both males and females. This estimate of explained variance is in the middle range of estimates from prior studies, which vary from 3% to >30%. We also established that high GMD relates to better performance, although the effect sizes and variance explained were more modest (around 7% for males and 12% for females). Diffusion-weighted parameters showed that high performance was associated with lower MD values. While these effects were regionally specific, they reached moderate to large effect sizes (.7 to .8 SDs) in some frontal and temporal regions, and overall explained over 10% of performance variance in both males and females. Thus, brains of high performing individuals are anatomically characterized by larger volumes and denser and less coarsely organized gray matter. The neuroanatomic performance associations are stable across the current age range and in both males and females for volume, or become more pronounced across developmental epochs of childhood, adolescence and young adulthood for GMD and MD.
We also uncovered how physiologic parameters relate to performance. Resting CBF values were negatively associated with performance, showing small effect sizes (.2 to .4 SDs) and explaining about 3% and 5% of performance variance in males and females, respectively. Notably, although effect sizes of lower CBF in high performers reached maturity levels earlier in females than in males (by adolescence), they ended up considerably larger in the young adult males. While we had no prior studies on which to base a hypothesis on whether high performance in this developmental sample should be associated with higher or lower CBF, it is notable that CBF was measured at a resting state, characterized as the “default mode” (Raichle et al., 2001) condition. Our finding that lower resting-state CBF is associated with better overall performance is consistent with reports that deactivation of the default-mode network during task performance is as predictive of performance as activation of task-related regions (Satterthwaite et al., 2013). A lower “idling rate” may be conducive to better performance by permitting activation when the individual is faced with a task while preserving energy in the absence of a task.
ALFF and ReHo, derived from resting-state fMRI, showed more complex associations with performance, which differed in males and females. Males showed higher values associated with good performance, a relationship that strengthened from childhood to adolescence to early adulthood. In contrast, in females these effects were smaller, and in adults, high performers had even lower values in most regions. It is unclear why higher or lower amplitude of low frequency fluctuations or regional homogeneity values would relate to cognitive performance, and our findings should motivate further study with complementary methodologies and analytic approaches.
The results overall indicate that optimal cognitive performance is supported by fine-tuning of regionally distributed anatomic and physiologic parameters. Notably, these relationships between anatomic brain parameters and performance are generally stable within this wide developmental age range of 8 to 22 years, notwithstanding the major effects of age on all parameters involved. For the anatomic parameters, the relation with performance is stable or becomes more pronounced with maturation. The physiologic parameters are also associated with performance, but these associations evolve during this age span and differ for males and females.
The regional distribution of performance group differences varied by modality, but across modalities it was more pronounced for some regions. The regions that were most strongly modulated in relation to performance included a few subcortical regions, specifically thalamus, hippocampus, parahippocampal gyrus and entorhinal cortex. Subcortical regions have been implicated in higher-order cognitive function and memory (Koziol et al., 2014; Münte et al., 2008; Wolff & Vann, 2019). Effect sizes were generally strongest for cortical GM regions, and these included frontal (orbital and midfrontal), temporal (midtemporal), parietal (precuneus and superior parietal) and occipital cortex (fusiform and lingual). All these regions have been implicated as important nodes in complex cognition (Gur et al., 1999; Dubois, Galdi, Paul, Adolphs, 2018; Nave, Jung, Karlsson, Kable, Koellinger, 2019).
Although the main findings were robust across age groups, sex differences modulated observed effects in informative ways. The sex differences overall indicated complementary mechanisms in males and females that serve to compensate for sex differences in brain parameters. Sex differences in brain-behavior associations have been well documented (e.g., Faraone and Tsuang, 2001; Gur et al., 1982, 1999, 2000; Jazin and Cahill, 2010; Ragland et al., 2000; Raznahan et al., 2011; Satterthwaite et al., 2015). Thus, lower volume in females is compensated for by higher GMD and performance is further modulated by MD, CBF, ALFF and ReHo, which may together account for equal cognitive performance. Furthermore, age-related differences in this developmental cohort were smaller for females than for males, indicating that females reach adult differences earlier. Indeed, when we applied age prediction models based on volume and MD, chronological age was over-predicted by brain age in the younger age range (8 to 16 years) in females compared to males, and this effect reversed toward young adulthood, where females had “younger looking” brains (Erus et al., 2014). This sex difference in adult samples seems to extend across metabolic parameters and the age range of 20 to 80 years - females consistently show younger brain age than male counterparts with the same chronological age (Goyal et al., 2019). We might speculate that earlier stabilization of metabolic parameters in females helps sustain brain integrity throughout the adult lifespan. Such a complementarity between the sexes might have enhanced survival and reproduction in humans’ environment of evolutionary adaptedness (Barkow, Cosmides, & Tooby, 1995). These sex differences could further reflect complementary reliance on different aspects of brain structure and function to optimize cognitive performance.
The data-driven analyses indicated that brain parameters of structure and function can predict a fairly high portion of the variance in cognitive performance. While the strongest predictor was volume, which alone accounted for about 17% of the variance, even the parameter with lowest predictive value accounted for ∼5% of the variance. Using a combined set of parameters we were able to explain nearly 20% of the variance (cross-validated) in both males and females. Most previous studies to which we can compare our results examined volume. Our R2 values are considerably higher than those reported for volumes by Nave et al. (2018), who estimated that volume explains slightly over 3% of the variance in cognitive performance. Possibly the more extensive battery on which our performance measure was based, as well as the use of the same scanner, could have eliminated some sources of noise in estimating the dependent measures. Overall, considering the inherent error in all our measurements, these results indicate substantial coupling between cognitive performance and parameters of brain structure and function. Additional variance can be explained by more refined cognitive parameters that relate to specific brain systems, by examination of brain networks, or by measures such as regional activation concurrent with performance (Roalf et al., 2014). Indeed, earlier work has shown that task-activated fMRI is a better predictor of performance than the resting-state measures examined here (Greene, Gao, Scheinost, & Constable, 2018; Yoo et al., 2018).
Several limitations of this study are noteworthy. The age range of the sample, 8 to 22 years, limits generalizability to other ages. Within this age range, in which changes occur in all brain modalities examined, age did not generally affect the performance-related differences. Another limitation of the study was the focus on a single parameter of cognitive capacity. This focus was necessitated by the complexity of probing multiple brain regions across modalities and accounting for age effects and sex differences. The measure selected is the closest proxy for “IQ”, which was used in other studies and thus improves comparability of results. Future analyses can focus on other performance domains, such as episodic memory and social cognition, and more specific aspects of performance such as accuracy compared to speed. The study is also limited by analyzing data across the entire sample, which is quite heterogeneous and, while ascertained through general pediatric services and not psychiatric services, still included individuals with significant psychopathology (Calkins et al., 2015) and adverse life events (Barzilay et al., 2018), and from diverse sociodemographic, ethnic and racial backgrounds. Perhaps stronger and more coherent effects could be seen if we limited the analyses to the subsample of typically developing youth without any significant disorder or to more homogeneous populations. We believe that while such analyses have merit and could reveal effects of different disorders on the observed relationships, the heterogeneity and diversity of our sample enhance generalizability of the reported results. Finally, our analyses examined regional parameters of brain structure and function and related them to individual differences in behavioral measures of cognitive performance taken within the same timeframe, but not contemporaneously in the scanner. More variance in behavioral measures could be linked to functional brain parameters acquired contemporaneously (Roalf et al., 2014).
Notwithstanding its limitations, the present study provides some “benchmarks” for assessing relations among brain parameters and performance. The results can guide hypotheses on how brain structure and function relate to individual differences in cognitive capacity, and offer the ability to gauge the relevance to cognitive performance of group differences or changes in brain parameters. Furthermore, acquisition of each parameter is costly in time and data management resources, and our study can inform design of future large-scale neuroimaging studies based on the relevance of associating acquired brain parameters with cognitive performance. Our finding of lower resting CBF in high performers is worthy of special emphasis since, unlike the structural parameters of volume, GMD and MD, it relates to brain function. Uncovering a physiologic index associated with individual differences in cognitive performance has important implications for developing a scientific basis for social and medical prevention, education and intervention strategies. Anatomy is unlikely to be readily affected by behavioral or pharmacologic treatment. By contrast, physiologic states, such as those measured by CBF, ALFF and ReHo, can be changed within seconds, and it is easier to conceive of treatments that can affect resting-state CBF for sustainable durations. Notably, current methods for rehabilitation of brain dysfunction emphasize activation of task-related brain systems. Our findings that lower resting CBF and increased resting state connectivity are associated with better performance suggest that emphasis should also be placed on training in deactivation of task-relevant regions in the absence of a task. Indeed, our results may offer a scientific basis for the benefits of procedures such as meditation, which emphasizes relaxation associated with the absence of goal-oriented behavior and results in reduced default-mode activity (Brewer et al., 2011; Hasenkamp and Barsalou, 2012).
That both greater volume and gray matter density of brain and lower basal metabolic rate are associated with cognitive abilities is consistent with preservation of tissue at low energy consumption as the “holy grail” for optimal organ function.
AUTHOR CONTRIBUTIONS
R.E.G. and R.C.G. conceived the project, designed the study, guided data analysis, interpreted the results and wrote the manuscript. M.A.E., R.V. and J.A.D. designed the imaging protocol and participated in data analysis, R.C.G., T.M.M., A.F.G.R., A.P. and K.R. analyzed data, made figures and wrote sections of the manuscript. T.D.S., D.R.R., D.H.W., R.V., C.D. and E.D.G. guided image processing and interpretation of results. W.B.B. and R.T.S. guided statistical analysis. All authors reviewed and contributed to the write-up of the manuscript.
DECLARATION OF INTERESTS
All authors declare no competing interests.
ACKNOWLEDGMENTS
We thank the participants of the Philadelphia Neurodevelopmental Cohort and the members of the Recruitment, Assessment, Neuroimaging and Data Teams whose contributions made this project possible. This work was supported by NIH grants MH107235, MH089983, MH096891, MHP50MH06891, R01MH113550 and R01MH112847, the Dowshen Neuroscience fund, and the Lifespan Brain Institute of Children’s Hospital of Philadelphia and Penn Medicine, University of Pennsylvania.