This study investigates the construction of polygenic scores (PGSs) for human population research. A PGS is a linear, usually weighted, combination of risk alleles that estimates an individual's cumulative genetic risk for a particular trait. Although PGSs are conceptually simple, there are numerous ways to estimate them, and not all serve the same end goals. In this paper, we systematically investigate the impact of four key decisions in building PGSs from published genome-wide association meta-analysis results: 1) whether to use single nucleotide polymorphisms (SNPs) assessed by imputation, 2) which criteria to use for selecting SNPs to include in the score, 3) whether to account for linkage disequilibrium (LD), and 4) if accounting for LD, which method best captures the correlation structure among SNPs (i.e., clumping vs. pruning). Using the Health and Retirement Study (HRS), a nationally representative, population-based longitudinal panel study of Americans over the age of 50, we examine the predictive ability of the PGSs arising from these different estimation approaches, as well as their variability and co-variability. We examine four traits with large published and replicated genome-wide association studies: height, body mass index, educational attainment, and depression. Our central finding is that PGSs that include all available SNPs either explain the most variation in an outcome or are not significantly different from the PGS that does. Thus, for reproducibility through rigor and transparency, we recommend that researchers include a PGS with all available SNPs as a reference and provide substantial justification for using alternative methods.
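
To make the definition concrete, the following is a minimal sketch of a weighted PGS for one individual, with the SNP-selection decision expressed as a p-value threshold. The function name `polygenic_score`, the threshold parameter, and the toy data are hypothetical and chosen only for illustration; they are not the study's pipeline, and the sketch ignores imputation quality and LD.

```python
def polygenic_score(dosages, weights, p_values, p_threshold=1.0):
    """Weighted sum of risk-allele dosages over SNPs with p <= p_threshold.

    Hypothetical helper for illustration only. With the default
    threshold of 1.0, every SNP is included in the score.
    """
    if not (len(dosages) == len(weights) == len(p_values)):
        raise ValueError("inputs must have one entry per SNP")
    return sum(
        w * d
        for d, w, p in zip(dosages, weights, p_values)
        if p <= p_threshold
    )


# Toy data (made up): three SNPs for a single individual.
dosages = [0, 1, 2]            # risk-allele counts (0, 1, or 2)
weights = [0.10, -0.05, 0.20]  # per-allele effect sizes from a GWAS
p_values = [1e-9, 0.30, 0.04]  # GWAS association p-values

# Score using all available SNPs (the reference score).
score_all = polygenic_score(dosages, weights, p_values)

# Score restricted to SNPs with p <= 0.05 (an alternative selection rule).
score_strict = polygenic_score(dosages, weights, p_values, p_threshold=0.05)
```

Under these assumptions, comparing `score_all` with `score_strict` across individuals mirrors, in miniature, the comparison between an all-SNP reference score and a score built from a thresholded SNP set.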