Abstract
Introduction Large datasets, consisting of hundreds or thousands of subjects, are becoming the new data standard within the neuroimaging community. While big data creates numerous benefits, such as detecting smaller effects, many of these big datasets have focused on non-clinical populations. The heterogeneity of clinical populations makes creating datasets of equal size and quality more challenging. There is a need for methods to connect these robust large datasets with the carefully curated clinical datasets collected over the past decades.
Methods In this study, resting-state fMRI data from the Adolescent Brain Cognitive Development study (N=1509) and the Human Connectome Project (N=910) are used to discover generalizable brain features for use in an out-of-sample (N=121) multivariate predictive model that distinguishes young children (3-10 years old) who stutter from fluent peers.
Results Classification accuracy of up to 72% is achieved using 10-fold cross-validation. This study suggests that big data has the potential to yield generalizable biomarkers that are clinically meaningful. Specifically, this is the first study to demonstrate that big data-derived brain features can differentiate children who stutter from their fluent peers and provide novel information on brain networks relevant to stuttering pathophysiology.
Discussion The results significantly expand the previous understanding of the neural bases of stuttering. In addition to auditory, somatomotor, and subcortical networks, the big data-based models highlight the importance of considering large-scale brain networks supporting error sensitivity, attention, cognitive control, and emotion regulation/self-inspection in the neural bases of stuttering.
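The 10-fold cross-validation reported in the Results can be illustrated with a minimal sketch. It is not the study's pipeline: the abstract does not name the classifier, so a nearest-centroid rule stands in for it, the feature matrix is synthetic random data, and the 60/61 group split, the feature count, and all function names are hypothetical. The point it shows is that the classifier is fit on nine folds and scored only on the held-out fold.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    position = 0
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal indices round-robin so each fold gets a share of every class.
        for i in idx:
            folds[position % k].append(i)
            position += 1
    return folds


def centroid(rows):
    """Feature-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_validated_accuracy(X, y, k=10, seed=0):
    """Fraction of subjects classified correctly while held out."""
    correct = 0
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # Fit the stand-in classifier on the training folds only.
        centroids = {
            cls: centroid([X[i] for i in train_idx if y[i] == cls])
            for cls in set(y[i] for i in train_idx)
        }
        # Score it on the held-out fold.
        for i in test_idx:
            pred = min(centroids, key=lambda c: sq_dist(X[i], centroids[c]))
            correct += pred == y[i]
    return correct / len(y)


# Synthetic stand-in data (assumption): 121 subjects, 20 features per subject,
# with a hypothetical 60/61 split between the two groups.
rng = random.Random(42)
n_subjects, n_features = 121, 20
y = [1 if i < 60 else 0 for i in range(n_subjects)]
X = [[rng.gauss(0.4 * label, 1.0) for _ in range(n_features)] for label in y]

accuracy = cross_validated_accuracy(X, y, k=10)
print(f"10-fold cross-validated accuracy: {accuracy:.2f}")
```

Because the brain features in the study are derived from separate large datasets, they can be held fixed across folds; only the classifier would need refitting within each training split.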
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Updated figures.
1 The Stuttering Severity Instrument (SSI-4) was used to examine the frequency and duration of disfluencies occurring in the speech sample acquired from each child who stutters. The SSI composite score incorporates the frequency and duration of stuttered speech, as well as any physical concomitants associated with stuttering (Riley & Bakker, 2009). To be classified as a child who stutters, a child needed to score in the very mild range or higher on the SSI composite score. For borderline cases, the parent’s expressed concern about stuttering and the impression of a clinician (a certified Speech-Language Pathologist) confirming stuttering status were considered in determining stuttering status.
2 Fluid Intelligence is a Cognition Composite Score that includes DCCS, Flanker, Picture Sequence Memory, List Sorting, and Pattern Comparison measures.
Abbreviations
- ABCD
- Adolescent Brain Cognitive Development study
- ADHD
- Attention Deficit Hyperactivity Disorder
- AROMA
- Automatic Removal of Motion Artifacts
- AUC
- Area under the Curve
- BBS
- Brain Basis Set
- BOLD
- Blood Oxygen Level Dependent
- CO
- Cingulo-Opercular network
- CER
- Cerebellum network
- CSF
- Cerebrospinal Fluid
- CWS
- Children Who Stutter
- DAN
- Dorsal Attention network
- DMN
- Default Mode network
- EPI
- Echo Planar Imaging
- FIX
- FMRIB’s ICA-based Xnoiseifier
- FD
- Framewise Displacement
- FSL
- FMRIB Software Library
- fMRI
- Functional Magnetic Resonance Imaging
- HCP
- Human Connectome Project
- ICA
- Independent Component Analysis
- MNI
- Montreal Neurological Institute
- MR
- Memory Retrieval network
- PCA
- Principal Component Analysis
- ROC
- Receiver Operating Characteristic
- ROI
- Region of Interest
- rsfMRI
- Resting-State Functional Magnetic Resonance Imaging
- SCPT
- Short Continuous Performance Task
- SMN
- Somatomotor network
- SPM
- Statistical Parametric Mapping
- SSI
- Stuttering Severity Instrument
- VAN
- Ventral Attention network
- VIS
- Visual network