The role of DNA methylation in development, divergence, and the response to environmental stimuli is of substantial interest in ecology and evolutionary biology. Measuring genome-wide DNA methylation is increasingly feasible using sodium bisulfite sequencing. Here, we analyze simulated and published data sets to demonstrate how effect size, kinship/population structure, taxonomic differences, and cell type heterogeneity influence the power to detect differential methylation in bisulfite sequencing data sets. Our results reveal that the effect sizes typical of evolutionary and ecological studies are modest, so detecting them will require data sets larger than those currently in common use. Additionally, our findings emphasize that statistical approaches that ignore the properties of bisulfite sequencing data (e.g., its count-based nature) or key sources of variance in natural populations (e.g., population structure or cell type heterogeneity) often produce false negatives or false positives, leading to incorrect biological conclusions. Finally, we provide recommendations for handling common issues that arise in bisulfite sequencing analyses and a freely available R Shiny application for simulating and performing power analyses on bisulfite sequencing data. This app, available at www.tung-lab.org/protocols-and-software.html, allows users to explore the effects of sequencing depth, sample size, population structure, and expected effect size, tailored to their own system.
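To make the idea of a simulation-based power analysis concrete, the following is a minimal sketch in Python. It is not the R Shiny application described above and does not reproduce its models. It assumes, for illustration only, that read counts at a single site follow a beta-binomial distribution, that every sample has the same sequencing depth, and that differential methylation is tested with a label-permutation test on pooled methylation proportions. All function names and parameters (`simulate_counts`, `estimate_power`, `dispersion`, `baseline`) are hypothetical, and the sketch ignores population structure and cell type heterogeneity.

```python
import random


def simulate_counts(mean_level, n_samples, depth, dispersion, rng):
    """Draw (methylated, total) read counts per sample from a beta-binomial.

    Assumption: each sample's true methylation level is Beta-distributed
    around mean_level; a larger dispersion value means less
    between-sample variance.
    """
    a = mean_level * dispersion
    b = (1.0 - mean_level) * dispersion
    counts = []
    for _ in range(n_samples):
        p = rng.betavariate(a, b)
        methylated = sum(1 for _ in range(depth) if rng.random() < p)
        counts.append((methylated, depth))
    return counts


def pooled_diff(group_a, group_b):
    """Absolute difference in pooled methylation proportion between groups."""
    def rate(group):
        return sum(m for m, _ in group) / sum(t for _, t in group)
    return abs(rate(group_a) - rate(group_b))


def permutation_pvalue(group_a, group_b, n_perm, rng):
    """Permutation p-value obtained by shuffling group labels across samples."""
    observed = pooled_diff(group_a, group_b)
    pooled = group_a + group_b
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if pooled_diff(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def estimate_power(effect, n_per_group, depth, baseline=0.5, dispersion=10.0,
                   alpha=0.05, n_sims=200, n_perm=199, seed=1):
    """Fraction of simulated data sets in which the test rejects at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        group_a = simulate_counts(baseline, n_per_group, depth, dispersion, rng)
        group_b = simulate_counts(baseline + effect, n_per_group, depth,
                                  dispersion, rng)
        if permutation_pvalue(group_a, group_b, n_perm, rng) < alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Compare a modest and a large difference in mean methylation level.
    for effect in (0.05, 0.30):
        power = estimate_power(effect, n_per_group=15, depth=10)
        print(f"effect={effect:.2f} estimated power={power:.2f}")
```

Varying `effect`, `n_per_group`, and `depth` in this sketch gives a rough sense of how the three interact. A realistic analysis would also need to model the additional sources of variance discussed above.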