Abstract
Despite a growing body of research suggesting that task-based functional magnetic resonance imaging (fMRI) studies often lack statistical power because their samples are too small, such underpowered studies continue to proliferate. Using large independent samples across four distinct tasks, we demonstrate the impact of sample size on reproducibility, assessed at different levels of analysis relevant to fMRI researchers. We find that typical sample sizes produce results with low reproducibility, and that even samples much larger than typical (e.g., N = 100) produce results that are far from perfectly reproducible. Thus, our results join the existing line of work advocating for larger sample sizes. Moreover, because we test a fairly large range of sample sizes and use intuitive metrics of reproducibility, we hope that our results help catalyze a major shift in how task-based fMRI research is carried out across the field.