A polygenic risk score for breast cancer in U.S. Latinas and Latin-American women

Yiwey Shieh; Laura Fejerman; Paul C. Lott; Katie Marker; Sarah D. Sawyer; Donglei Hu; Scott Huntsman; Javier Torres; Magdalena Echeverry; Mabel E. Bohorquez; Juan Carlos Martínez-Chéquer; Guadalupe Polanco-Echeverry; Ana P. Estrada-Florez; Christopher A. Haiman; Esther M. John; Lawrence H. Kushi; Gabriela Torres-Mejía; Tatianna Vidaurre; Jeffrey N. Weitzel; Sandro Casavilca Zambrano; Luis G. Carvajal-Carmona; Elad Ziv; Susan L. Neuhausen

doi:10.1101/598730

Abstract

Background Over 180 single nucleotide polymorphisms (SNPs) associated with breast cancer susceptibility have been identified; these SNPs can be combined into polygenic risk scores (PRS) to predict breast cancer risk. Since most SNPs were identified in predominantly European populations, little is known about the performance of PRS in non-Europeans. We tested the performance of a 180-SNP PRS in Latinas, a large ethnic group with variable levels of Indigenous American, European, and African ancestry.

Methods We conducted a pooled case-control analysis of U.S. Latinas and Latin-American women (4,658 cases, 7,629 controls). We constructed a 180-SNP PRS consisting of SNPs associated with breast cancer risk (p < 5 x 10⁻⁸). We evaluated the association between the PRS and breast cancer risk using multivariable logistic regression and assessed discrimination using area under the receiver operating characteristic curve (AUROC). We also assessed PRS performance across quartiles of Indigenous American genetic ancestry.

Results Of 180 SNPs tested, 140 showed directionally consistent associations compared with European populations, and 43 of these were also nominally significant (p < 0.05). The PRS was associated with breast cancer risk, with an odds ratio (OR) per standard deviation increment of 1.58 (95% CI 1.52 to 1.64) and AUROC of 0.63 (95% CI 0.62 to 0.64). The discrimination of the PRS was similar between the top and bottom quartiles of Indigenous American ancestry.

Conclusions The 180-SNP PRS predicts breast cancer risk in Latinas, with performance similar to that reported for Europeans. The performance of the PRS did not vary substantially according to Indigenous American ancestry.

Introduction

Over 180 single nucleotide polymorphisms (SNPs) associated with breast cancer susceptibility have been discovered in genome-wide association studies (GWAS) [1–4]. Though each SNP has a modest effect, multiple SNPs can be combined into a polygenic risk score (PRS) [5]. PRS has emerged as a promising tool for breast cancer risk stratification. The risk associated with having a PRS in the upper 20-25th percentile is similar to that of strong clinical risk factors such as having extremely dense breasts [6], and adding PRS to risk models improves discrimination and reclassification [6–8]. Ongoing clinical trials are studying the use of PRS to personalize breast cancer screening and prevention [9]. Some commercial genetic testing laboratories are already returning PRS results to those who tested negative for pathogenic moderate- or high-penetrance mutations [10, 11].

A major barrier to the widespread use of PRS is the relative paucity of knowledge regarding its performance in non-European populations. To date, SNP discovery has overwhelmingly occurred in European populations [12]. However, the effect sizes, allele frequencies, and linkage disequilibrium patterns of SNPs vary by ancestry [12, 13]. Though relatively few studies have examined PRS performance in non-Europeans, they suggest that PRS constructed using European SNP summary statistics (effect size, allele frequency) typically perform worse in non-European populations [14, 15]. Currently, commercial testing laboratories only provide breast cancer PRS results to women of European ancestry [10, 11].

Disparities in the use and performance of PRS could especially affect Latinas. Latinos/Latinas comprise the largest minority group in the U.S., representing 17.8% of the population in 2016 [16]. This diverse group includes genetically admixed individuals who have varying degrees of Indigenous American, European, African, and Asian ancestry [17–19]. We previously identified SNPs in the 6q25 locus associated with breast cancer risk exclusively in Latinas [20]. Most SNPs discovered in European populations display directional consistency in Latinas, with some also being nominally significant [20, 21]. One previous study assessed the performance of a breast cancer PRS in Latinas, finding that a 71-SNP PRS predicted breast cancer less well in Latinas than comparable PRSs did in Europeans [5, 15]. However, it included only 147 cases and did not account for genetic ancestry [15].

We sought to test the performance of PRS in U.S. Latinas and Latin American women (collectively referred to hereafter as Latinas). To that end, we conducted a pooled case-control analysis of 8 studies comprising 13,631 Latinas. We examined the predictive performance of a 71-SNP and a 180-SNP PRS, and whether PRS performance varies by genetic ancestry.

Methods

Participants

Our analysis included 13,631 self-identified Latinas, of whom 5,697 women with invasive breast cancer were considered cases and 7,934 without breast cancer were controls. Participants came from 8 studies (Tables 1 and S1). Recruitment details and patient characteristics have been previously reported for each study except for PEGEN-BC. Studies are briefly described below and in more detail in the Supplement.

The San Francisco Bay Area Breast Cancer Study (SFBCS) plus the Northern California Breast Cancer Family Registry (NC-BCFR), a population-based case-control study recruiting from the San Francisco Bay Area [22, 23].
The Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH), a biobank recruiting from Northern California and the Pacific Northwest [24].
The Multiethnic Cohort (MEC) study, a prospective cohort study recruiting from Southern California and Hawaii [25].
The Cancer de Mama (CAMA) study, a population-based case-control study in Mexico [26].
The Post-Columbian Study of Environmental and Heritable Causes of Breast Cancer (COLUMBUS-Colombia), a population-based case-control study in southern Colombia [20].
The Post-Columbian Study of Environmental and Heritable Causes of Breast Cancer (COLUMBUS-Mexico), a population-based case-control study in Mexico [20]. The COLUMBUS substudies (Colombia and Mexico) were analyzed as separate datasets given differences in study populations and genotyping methods.
The Peru Genetics and Genomics of Breast Cancer Study (PEGEN-BC), a case series from a Peruvian cancer center. Unrelated Peruvian individuals from 1000 Genomes [27] were used as controls.
The City of Hope Clinical Cancer Genetics Community Research Network (COH/CCGCRN), the Southern California site of a multisite cancer center and community-based registry for familial breast cancer [28].

Table 1.

Participant characteristics by study and case-control status

All studies obtained local institutional review board approval and written informed consent from participants.

Genotyping and genetic ancestry

For all studies except COH/CCGCRN, genotyping was performed using high-density arrays (Table S1). Genotyping of COH/CCGCRN samples was performed using next-generation sequencing with a targeted capture kit that included all 89 SNPs identified as of 2016, prior to publication of the OncoArray GWAS results [3]. Further information about genotyping is provided in the Supplementary Methods.

We estimated genetic ancestry from genome-wide markers using the program ADMIXTURE [29] in unsupervised mode with a model including 4 ancestral populations: European, Indigenous American (IA), African, and East Asian. We used genotype data from 90 European Americans (CEU) and 90 Nigerian Yorubans (YRI) from HapMap [30] to represent European and African populations, respectively. We also included a subset of 504 East Asian individuals from 1000 Genomes [27] and 71 Indigenous Americans previously genotyped on the Affymetrix Axiom LAT1 array [31, 32]. Women with >75% East Asian ancestry were excluded given that the limited influence of the East Asian component in the Hispanic/Latino population did not allow for a subgroup analysis.

Polygenic risk score

We used a 180-SNP PRS for our primary analysis (Table S2). SNP selection is discussed in further detail in the Supplementary Methods. We performed sensitivity analyses of different imputation r² cutoffs for inclusion of SNPs in our PRS (Table S3). We ultimately included all SNPs regardless of imputation quality, as we did not find substantive differences in the associations between the 180-SNP PRS and a 168-SNP PRS constructed using an imputation r² threshold of > 0.5.

Since targeted genotyping was performed within COH/CCGCRN samples, genotypes were available for 89 SNPs. We dropped 1 SNP due to missingness. Of the remaining 88 SNPs, 63 overlapped and 8 had LD proxies (r² > 0.7) with the 180 SNPs comprising the main PRS. We used these 71 SNPs to construct a PRS within the COH/CCGCRN dataset. We then constructed a comparator 71-SNP PRS in the 7 remaining datasets using the 63 shared SNPs and 8 respective LD proxies, and pooled all 8 datasets to evaluate the performance of the 71-SNP PRS.

We constructed the PRS as previously described [7, 33]. Briefly, the PRS represents the product of the likelihood ratios across multiple SNPs, assuming each SNP exerts an independent effect. The likelihood ratio for each SNP was calculated based on the number of risk alleles present, and the allele frequency and effect size (odds ratio, OR) of the risk allele. We used risk allele frequencies derived from the Latin American (AMR) population in 1000 Genomes [27] and published ORs [3]. The latter predominantly reflects the effect of the SNP within a European population, except for those discovered in Latina studies (Table S2) [20, 21].
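The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' R script. The normalization shown (dividing each genotype's relative risk by the population-average relative risk under Hardy-Weinberg equilibrium and a per-allele multiplicative model) is one common formulation and is an assumption here; the exact formula is given in refs [7, 33]. The function names and the three-SNP example values are hypothetical.

```python
import math

def snp_likelihood_ratio(n_risk_alleles, risk_allele_freq, odds_ratio):
    """Likelihood ratio for one SNP given 0, 1 or 2 copies of the risk allele.

    Assumes a per-allele multiplicative (log-additive) model and
    Hardy-Weinberg genotype frequencies (illustrative assumption).
    """
    p = risk_allele_freq
    # Population-average relative risk across the three genotypes.
    mean_rr = (1 - p) ** 2 + 2 * p * (1 - p) * odds_ratio + p ** 2 * odds_ratio ** 2
    return odds_ratio ** n_risk_alleles / mean_rr

def polygenic_risk_score(genotypes, freqs, odds_ratios):
    """Product of per-SNP likelihood ratios, assuming independent SNP effects."""
    # Sum logs rather than multiplying directly, for numerical stability.
    log_prs = sum(
        math.log(snp_likelihood_ratio(g, p, o))
        for g, p, o in zip(genotypes, freqs, odds_ratios)
    )
    return math.exp(log_prs)

# Hypothetical three-SNP example (values are made up for illustration).
example_prs = polygenic_risk_score([0, 1, 2], [0.30, 0.10, 0.50], [1.10, 1.25, 0.95])
```

Under this normalization the likelihood ratio for each SNP averages to 1 across the population, so a PRS above 1 indicates higher-than-average predicted risk and a PRS below 1 indicates lower-than-average risk.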

Statistical analysis

First, we tested the associations between individual SNPs and breast cancer risk using multivariable logistic regression models adjusted for genetic ancestry and study. We used METAL [34] to perform inverse-variance-weighted meta-analysis of 180 SNPs across 3 datasets: COLUMBUS-Colombia, COLUMBUS-Mexico, and the pooled SFBCS/NC-BCFR, Kaiser RPGEH, MEC, CAMA, and PEGEN-BC studies.
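For readers unfamiliar with inverse-variance meta-analysis, the standard fixed-effect estimator can be sketched as below. This is a generic illustration in Python, not a reproduction of METAL's internals; the function name and the example numbers are hypothetical. Cochran's Q is included because it is the usual basis for a heterogeneity test across datasets.

```python
import math

def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance-weighted pooled estimate for one SNP.

    betas: per-dataset log odds ratios.
    standard_errors: matching standard errors.
    Returns (pooled_beta, pooled_se, z_score, cochran_q).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    cochran_q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    return pooled_beta, pooled_se, z_score, cochran_q

# Hypothetical log ORs and standard errors from three datasets.
pooled = inverse_variance_meta([0.10, 0.05, 0.12], [0.04, 0.06, 0.03])
```

Datasets with smaller standard errors receive larger weights, so the pooled estimate is dominated by the largest dataset.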

To test the associations of PRS with breast cancer, we adjusted for genetic ancestry and study, given that both remained independently associated with breast cancer risk when included in the same model as the PRS. We first performed linear regression of the PRS (dependent variable) on study and ancestry. We then used the residual as the main predictor in univariate logistic regression with breast cancer as the outcome. We analyzed the residual as a continuous variable normalized to the mean and standard deviation (SD) in controls. We tested the discrimination of the adjusted PRS by estimating the area under the receiver operating characteristic curve (AUROC). We also tested calibration using the Hosmer-Lemeshow test across deciles of the adjusted PRS, with the fifth and sixth deciles (40th to 60th percentiles) combined and used as the reference group.
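The two-step adjustment described above (residualize, then normalize to controls) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' Stata code: the function name is hypothetical, and the covariate matrix is assumed to hold ancestry proportions plus study indicator columns with one study omitted as the reference.

```python
import numpy as np

def ancestry_adjusted_prs(prs, covariates, is_case):
    """Residualize the PRS on covariates, then scale to controls.

    prs: 1-D array of PRS values (raw or log-transformed).
    covariates: 2-D array (n_samples x n_covariates), e.g. ancestry
        proportions and study indicator columns.
    is_case: 1-D boolean array, True for cases.
    """
    prs = np.asarray(prs, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)

    # Ordinary least squares with an intercept: PRS ~ covariates.
    design = np.column_stack([np.ones(len(prs)), covariates])
    coef, *_ = np.linalg.lstsq(design, prs, rcond=None)
    residual = prs - design @ coef

    # Standardize using the mean and SD of the residual in controls only.
    controls = residual[~is_case]
    return (residual - controls.mean()) / controls.std(ddof=1)
```

Because the result is expressed in units of the control SD, the coefficient from a subsequent logistic regression can be read directly as a log OR per SD increment.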

To examine the ancestry-specific performance of the PRS, we divided the pooled dataset into quartiles of IA ancestry. We performed logistic regression within each quartile of IA ancestry and compared the resulting coefficients using a Wald test of linear hypothesis. To compare AUROC estimates, we performed a test of equality of AUROC as described by DeLong [35]. Given differences in the population structures between U.S. Latina and Latin-American studies, we also examined ancestry-specific performance of the PRS by geographic origin of study, specifically U.S. (SFBCS/NC-BCFR, RPGEH, MEC) versus Latin-American (CAMA, COLUMBUS, PEGEN-BC).
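As an illustration of the AUROC comparison, the sketch below computes the AUROC as a Mann-Whitney probability together with a DeLong-style variance, then compares two AUROCs with a z-test. It assumes the two groups being compared (for example, different ancestry quartiles) contain non-overlapping women, so that no covariance term is needed; that simplification, and the function names, are assumptions made for illustration and do not reproduce the Stata routine used in the analysis.

```python
import math

def auroc_with_delong_variance(case_scores, control_scores):
    """AUROC (Mann-Whitney estimate) and its DeLong variance."""
    def psi(x, y):
        # 1 if the case outranks the control, 0.5 for ties, else 0.
        return 1.0 if x > y else (0.5 if x == y else 0.0)

    m, n = len(case_scores), len(control_scores)
    # Placement values: one per case and one per control.
    v10 = [sum(psi(x, y) for y in control_scores) / n for x in case_scores]
    v01 = [sum(psi(x, y) for x in case_scores) / m for y in control_scores]
    auc = sum(v10) / m

    def sample_var(values, mean):
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    variance = sample_var(v10, auc) / m + sample_var(v01, auc) / n
    return auc, variance

def compare_independent_aurocs(auc_a, var_a, auc_b, var_b):
    """Two-sided z-test for two AUROCs from non-overlapping samples."""
    z = (auc_a - auc_b) / math.sqrt(var_a + var_b)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch but would be replaced by a rank-based computation for datasets of the size analyzed here.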

All tests for significance used two-sided α = 0.05. We developed the script to calculate the PRS using R (The R Foundation). We performed all statistical analyses using Stata 14.1 (StataCorp, College Station, TX).

Results

Study characteristics

Our pooled data included 13,631 women from 8 studies, for a total of 5,697 cases and 7,934 controls (Table 1). Across all studies, ancestry was mostly European and Indigenous American (IA). There was substantial variation in ancestry within and across studies (Supplementary Figure S2). For instance, PEGEN-BC in Peru had the highest average IA ancestry (76% in cases and controls) while RPGEH in Northern California had the lowest (27% in cases, 29% in controls). Within each study, cases tended to have similar or lower IA ancestry than controls, as we have previously reported [36, 37]. In the pooled analysis, cases had higher IA ancestry since nearly half the controls came from RPGEH, the study with the lowest IA ancestry.

Association of PRS with breast cancer risk

We first examined the associations between individual SNPs and breast cancer risk. Of 180 SNPs tested, 140 had associations that were directionally consistent with those reported in European populations (Table S2) [3]. Forty-eight SNPs were nominally significant (p < 0.05) in our dataset, 43 of which were also directionally consistent. Six SNPs remained significant at p < 2.8 × 10⁻⁴ after Bonferroni correction for multiple testing. Thirteen SNPs displayed heterogeneous associations across studies (P_het < 0.05). For both PRSs, the mean unadjusted PRS was higher in cases than controls (Table 1, Figure S1).
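The Bonferroni threshold quoted above follows from dividing the two-sided α of 0.05 by the 180 SNPs tested. The short Python check below makes the arithmetic explicit; the variable names are illustrative.

```python
n_snps_tested = 180
alpha = 0.05

# Bonferroni-corrected per-SNP significance threshold.
bonferroni_threshold = alpha / n_snps_tested
print(f"{bonferroni_threshold:.1e}")
```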

Our main analysis evaluated the performance of a 180-SNP PRS in 12,287 women (4,658 cases and 7,629 controls) from 7 studies, excluding COH/CCGCRN given that only 89 SNPs were genotyped in that study. After normalization and adjustment for genetic ancestry and study, the 180-SNP PRS was strongly associated with breast cancer risk, with an OR per SD increment of 1.58 (95% CI 1.52 to 1.64) (Table 2). The associations with breast cancer were especially pronounced at the extremes of the PRS. Compared with women with a PRS in the 40th to 60th percentiles, women with a PRS in the bottom decile had an OR of 0.44 (95% CI 0.37 to 0.53), while those with a PRS in the top decile had an OR of 2.03 (95% CI 1.79 to 2.31). The AUROC for the 180-SNP PRS was 0.63 (95% CI 0.62 to 0.64) (Figure 1A). The Hosmer-Lemeshow test suggested good fit, with χ² = 8.06 (p = 0.53) (Figure 2A).

Figure 1.

Receiver operating characteristic curves for two polygenic risk scores. The 180-SNP PRS (A) had AUROC = 0.63 (95% CI 0.62 to 0.64) in 7 datasets, excluding COH/CCGCRN (n = 12,287). The 71-SNP PRS (B) had AUROC = 0.61 (95% CI 0.61 to 0.62) in all datasets (n = 13,631).

Figure 2.

Calibration plots for (A) the 180-SNP PRS in 7 datasets, excluding COH/CCGCRN (n = 12,287), and (B) the 71-SNP PRS in all datasets (n = 13,631). Each graph depicts predicted versus observed proportions of cases within each decile of the log-normalized PRS. Each circle corresponds to a decile of the PRS, with the middle (largest) circle representing the 40th to 60th percentiles. Hosmer-Lemeshow p-value = 0.53 for the 180-SNP PRS and 0.76 for the 71-SNP PRS.

Table 2.

Association between 180-SNP and 71-SNP PRS and breast cancer risk

Our secondary analysis evaluated the performance of a 71-SNP PRS in 13,631 women (5,697 cases and 7,934 controls) from 8 studies, including COH/CCGCRN. The 71-SNP PRS had a similar, albeit slightly weaker, association with breast cancer risk (Table 2, Figure 1B), while the Hosmer-Lemeshow test was again suggestive of good fit, χ² = 5.82 (p = 0.76), Figure 2B.

Performance of PRS by Indigenous American ancestry

The 180-SNP PRS displayed similar performance regardless of IA ancestry, with comparable ORs and AUROCs across the top (>55%) and bottom (<29%) quartiles of IA ancestry (Table 3). In contrast, the 71-SNP PRS performed worse in the top quartile than in the bottom quartile [OR 1.46 (95% CI 1.36 to 1.56) vs OR 1.68 (95% CI 1.54 to 1.83), p = 0.01]. This corresponded to top versus bottom quartile AUROCs of 0.61 (95% CI 0.59 to 0.63) and 0.64 (95% CI 0.62 to 0.66), respectively (p = 0.02).
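The comparison of ORs between quartiles can be approximated from the published estimates alone. The sketch below recovers each standard error from its 95% CI and applies a two-sided z-test to the difference in log ORs. It assumes the two quartiles are independent samples and that the CIs are symmetric on the log scale; it is a back-of-the-envelope check with hypothetical function names, not the Wald test of linear hypothesis run in Stata.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def compare_odds_ratios(or_a, ci_a, or_b, ci_b):
    """Two-sided z-test for the difference between two independent log ORs."""
    se_a, se_b = se_from_ci(*ci_a), se_from_ci(*ci_b)
    z = (math.log(or_a) - math.log(or_b)) / math.sqrt(se_a ** 2 + se_b ** 2)
    return z, math.erfc(abs(z) / math.sqrt(2))

# 71-SNP PRS, top versus bottom quartile of IA ancestry (values from the text).
z, p = compare_odds_ratios(1.46, (1.36, 1.56), 1.68, (1.54, 1.83))
```

With the rounded values reported in the text, this approximation gives a p-value of approximately 0.01, in line with the reported result.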

Table 3.

Area under the receiver operating characteristic curve and odds ratios per standard deviation of the 71-SNP PRS and 180-SNP PRS in Hispanics, by quartiles of Indigenous American ancestry

Given differences in ancestry structure between U.S. Latinas and Latin-American women, we stratified the analysis by geographic origin of study. Among 7,427 women from the U.S. studies (SFBCS/NC-BCFR, RPGEH, and MEC), the 180-SNP PRS performed best in the bottom quartile of IA ancestry (Table S4). However, among the 4,970 women from the Latin-American studies (CAMA, PEGEN-BC, COLUMBUS), the 180-SNP PRS performed similarly across quartiles of IA ancestry (Table S5).

Discussion

We found that PRSs primarily consisting of SNPs identified in European populations were predictive of breast cancer risk in Latinas. Our 180-SNP PRS had an adjusted OR per SD of 1.58 (95% CI 1.52 to 1.64) and an AUROC of 0.63 (95% CI 0.62 to 0.64). These results are comparable to those of European studies, which tested PRSs including 77 to 3820 SNPs and reported ORs per SD between 1.46 and 1.66 and AUROCs between 0.60 and 0.64 [5, 38]. Our 71-SNP PRS performed worse than the 180-SNP PRS, though the difference was modest.

Ours is the largest study to date on breast cancer PRS in Latinas and extends the literature by refining estimates of PRS performance in this population. Allman et al. [15] reported that a 71-SNP PRS had an OR per SD increment of 1.39 (95% CI 1.18 to 1.64) and AUROC of 0.59 (95% CI 0.54 to 0.64) among U.S. Latinas. Their 71-SNP PRS shares 62 SNPs (two by LD proxy) with our 71-SNP PRS, including a SNP (rs140068132) previously identified in Latina GWAS [20]. Given the degree of overlap in PRS composition, our results likely represent a more precise estimate of PRS performance, since we had substantially more cases (4,658 vs. 147) and controls (7,629 vs. 3,201) representing wider Latina/Latin-American ancestry. Indeed, our observed ORs and AUROCs for the 71- and 180-SNP PRSs fall in the upper range of the confidence intervals reported by Allman et al.

We could not definitively determine whether PRS performance varies by ancestry. Differential PRS performance by genetic ancestry might be expected for two reasons: first, differences in LD structures between European and non-European populations can attenuate the associations between GWAS hits discovered in Europeans and causal SNPs in LD; second, causal alleles may only be present in certain populations. However, the 180-SNP PRS performed similarly across quartiles of IA ancestry. In contrast, the 71-SNP PRS performed better in the bottom quartile of IA ancestry, corresponding to higher European ancestry. One explanation for the latter finding could be that the analysis of the 71-SNP PRS included 1,039 additional cases from COH/CCGCRN and therefore had greater statistical power to detect differences in performance by IA ancestry.

A major strength of our study was the size and diversity of our study population. Additionally, we accounted for genetic ancestry, which can bias associations in genetic studies [39]. Given that ancestry was a confounder and an independent predictor of breast cancer risk, we used a novel approach to calculate an “ancestry-adjusted” PRS. We also examined PRS performance by IA ancestry, which has not been previously done. Another strength was the inclusion of several large, diverse breast cancer studies representing populations from several geographic areas (Western U.S., Central and South America) and including women with varying degrees of IA versus European ancestry.

Our results should be interpreted in light of three limitations. First, the generalizability of our findings is limited to Latina populations with similar distributions of genetic ancestry, although the ancestry composition of our study resembled that of other large studies of Latinas from the western U.S. and Central/South America [19, 40]. However, our results may not be generalizable to Caribbean Latinas, whose population structures have a higher proportion of African ancestry [17–19]. We did not test the performance of PRS according to African ancestry given that our study population predominantly consisted of women originating from Latin American countries, where African ancestry is limited. Secondly, our analysis included women recruited from community-based and familial breast cancer clinics and may include moderate or high-penetrance mutation carriers. While PRS is associated with breast cancer risk in mutation carriers and women with elevated familial risk, the magnitudes of these associations vary slightly from those in the average-risk population [41]. Finally, we tested a PRS containing 180 SNPs representing all known GWAS hits at the time of analysis. However, others have constructed expanded PRSs comprising 313 and 3820 SNPs by including SNPs that did not have genome-wide significant associations with breast cancer [38]. Though these expanded PRSs performed better than a 77-SNP PRS, there was little difference in performance between the 313-SNP and 3820-SNP PRSs [38]. We included only SNPs with genome-wide significant associations in our PRS since we reasoned that these signals may be more robust across ancestry. The AUROC for our 180-SNP PRS (0.63) was similar to that of the 313-SNP PRS [38].

Our results suggest that the PRS has predictive value in Latinas, a large and rapidly growing population in the U.S. Although studies on the ability of the PRS to inform decisions around screening and prevention are underway [9], several commercial genetic testing laboratories are already returning PRS results to women of European descent who tested negative for deleterious mutations. If this practice were extended to Latinas, one could expect the PRS to perform comparably well. Even if the performance of the PRS were slightly attenuated in Latinas of higher Indigenous American ancestry, this would not necessarily preclude its use in this population. Instead, reported results could account for this attenuation by modeling the joint effects of PRS and ancestry.

Though our findings lend optimism to the utility of the PRS in predicting breast cancer risk among Latinas, they do not nullify the prospect of disparities in genetic discovery research [42]. Whereas we studied mostly common variants, rare variants display more geographic clustering [43]. As genetic association studies identify more rare variants, those discovered in European populations will be less generalizable to other populations.

Thus, high-quality genetic studies in non-European populations should remain a priority. Fine-mapping in large datasets may enhance the identification of causal SNPs associated with breast cancer risk. Likewise, GWAS should be intentional about including Latinas, particularly those with higher IA and/or African ancestry. In addition, future studies should prospectively assess prediction and examine the contribution of PRS to clinical risk models. Though one such trial is currently using the PRS to tailor decision-making around breast cancer screening and prevention [9], similar clinical effectiveness studies should also aim to recruit diverse women.

Notes

We thank the participants of the SFBCS/NC-BCFR, Kaiser Permanente RPGEH, MEC, CAMA, COLUMBUS, PEGEN-BC, and COH/CCGCRN studies. The authors declare no competing interests.

The contributors from the COLUMBUS Consortium (in alphabetical order) include: Jennyfer Benavides (Universidad del Tolima, Ibagué, Colombia), Mabel Bohorquez (Universidad del Tolima, Ibagué, Colombia), Fernando Bolaños (Hospital Hernando Moncaleano Perdomo, Neiva, Colombia), Luis G Carvajal-Carmona (Universidad del Tolima, Ibagué, Colombia, University of California Comprehensive Cancer Center, Sacramento, USA, Fundación de Genética y Genómica, Medellín, Colombia, Genome Center and Department of Biochemistry and Molecular Medicine, University of California, Davis, Davis, CA, USA), Jenny Carmona (Dinámica IPS, Medellín, Colombia), Ángel Criollo (Universidad del Tolima, Ibagué, Colombia), Magdalena Echeverry (Universidad del Tolima, Ibagué, Colombia), Ana Estrada (Universidad del Tolima, Ibagué, Colombia), Gilbert Mateus (Hospital Federico Lleras Acosta, Ibagué, Colombia), Raúl Murillo (Pontificia Universidad Javeriana, Bogotá, Colombia), Justo Ramirez (Hospital Hernando Moncaleano Perdomo, Neiva, Colombia), Yesid Sánchez (Universidad del Tolima, Ibagué, Colombia), Carolina Sanabria (Instituto Nacional de Cancerología, Bogotá, Colombia), Martha Lucia Serrano (Instituto Nacional de Cancerología, Bogotá, Colombia), John Jairo Suarez (Universidad del Tolima, Ibagué, Colombia), Alejandro Vélez (Dinámica IPS, Medellín, Colombia, Hospital Pablo Tobón Uribe, Medellín, Colombia).

Funding

This work was funded in part by grants from the National Cancer Institute (K24CA169004, R01CA120120 to E.Z. and R01CA184545 to E.Z. and S.N.). Y.S. was supported by the National Center for Advancing Translational Sciences of the NIH under award KL2TR001870.

The Northern California Breast Cancer Family Registry was supported by grant UM1 CA164920 from the National Cancer Institute. The San Francisco Bay Area Breast Cancer Study was funded by grants CA063446 and CA077305 from the National Cancer Institute, grant DAMD17-96-1-6071 from the U.S. Department of Defense, and grant 7PB-0068 from the California Breast Cancer Research Program.

The Kaiser Permanente Research Program on Genes, Environment and Health was supported by Kaiser Permanente national and regional Community Benefit programs, and grants from the Ellison Medical Foundation, the Wayne and Gladys Valley Foundation, and the Robert Wood Johnson Foundation. Genotyping in the GERA cohort was supported by grant RC2 AG03667 from the National Institutes of Health.

The Multiethnic Cohort Study was supported by National Institutes of Health grants R01 CA63464, R37 CA54281, R01 CA132839, and 5UM1CA164973.

The CAMA Study was funded by Consejo Nacional de Ciencia y Tecnología (SALUD-2002-C01-7462).

The PEGEN-BC study was supported by the National Cancer Institute [R01CA204797 (L.F.)] and the Instituto Nacional de Enfermedades Neoplásicas (Lima, Peru).

The COLUMBUS Consortium was supported by grants from School of Medicine (Dean’s Fellowship in Precision Health Equity to LGC-C) and support from the Office of the Provost for LGC-C’s Latino Cancer Health Equity Initiative; The V Foundation for Cancer Research (V Foundation Scholarship to LGC-C); GSK Oncology (Ethnic Research Initiative to LGC-C and ME); The U.S. National Institutes of Health (Cancer Center Support Grant P30CA093372 from the National Cancer Institute). LGC-C, MEB and ME are also grateful for support from Colciencias (Graduate Studentship to Jeniffer Benavides, member of COLUMBUS, from Convocatoria para la Formación de Capital Humano de Alto Nivel para el Departamento de Tolima-COLCIENCIAS – 755/2016), Universidad del Tolima (Grants to MEB and ME, project 10112), and Sistema Nacional de Regalías, Gobernación del Tolima (Grants to MEB and ME, project 520115). JT was supported by Coordinacion Nacional de Investigación en Salud, IMSS, México, grant FIS/IMSS/PROT/PRIO/13/027 and by the Consejo Nacional de Ciencia y Tecnologia (Fronteras de la Ciencia grant 773), México.

The study funders and sponsors did not participate in the collection, analysis, or interpretation of data, or in the writing of the manuscript. The contents of this article are solely the responsibility of the authors and do not reflect the official views of the National Institutes of Health.

Footnotes

* co-first authors
† co-senior authors

References

1.↵
Michailidou K, Hall P, Gonzalez-Neira A, et al. Large-scale genotyping identifies 41 new loci associated with breast cancer risk. Nat Genet 2013;45(4):353–61, 361e1-2.
OpenUrl CrossRef PubMed
2.
Michailidou K, Beesley J, Lindstrom S, et al. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat Genet 2015;47(4):373–80.
OpenUrl CrossRef PubMed
3.↵
Michailidou K, Lindstrom S, Dennis J, et al. Association analysis identifies 65 new breast cancer risk loci. Nature 2017;551(7678):92–94.
OpenUrl
4.↵
Lilyquist J, Ruddy KJ, Vachon CM, et al. Common Genetic Variation and Breast Cancer Risk-Past, present, and future. Cancer Epidemiol Biomarkers Prev 2018; doi:10.1158/1055-9965.Epi-17-1144.
OpenUrl CrossRef
5.↵
Mavaddat N, Pharoah PDP, Michailidou K, et al. Prediction of Breast Cancer Risk Based on Profiling With Common Genetic Variants. Journal of the National Cancer Institute 2015;107(5).
6.↵
Vachon CM, Pankratz VS, Scott CG, et al. The contributions of breast density and common genetic variation to breast cancer risk. J Natl Cancer Inst 2015;107(5).
7.↵
Shieh Y, Hu D, Ma L, et al. Breast cancer risk prediction using a clinical risk model and polygenic risk score. Breast Cancer Res Treat 2016;159(3):513–25.
OpenUrl
8.↵
Cuzick J, Brentnall AR, Segal C, et al. Impact of a Panel of 88 Single Nucleotide Polymorphisms on the Risk of Breast Cancer in High-Risk Women: Results From Two Randomized Tamoxifen Prevention Trials. J Clin Oncol 2017;35(7):743–750.
OpenUrl
9.↵
Shieh Y, Eklund M, Madlensky L, et al. Breast Cancer Screening in the Precision Medicine Era: Risk-Based Screening in a Population-Based Trial. JNCI: Journal of the National Cancer Institute 2017;109(5):djw290–djw290.
OpenUrl CrossRef PubMed
10.↵
Hughes E, Judkins T, Wagner S, et al. Development and validation of a residual risk score to predict breast cancer risk in unaffected women negative for mutations on a multi-gene hereditary cancer panel. Journal of Clinical Oncology 2017;35(15_suppl):1579–1579.
OpenUrl
11.↵
Black MH, Li S, LaDuca H, et al. Polygenic risk score for breast cancer in high-risk women. Journal of Clinical Oncology 2018;36(15_suppl):1508–1508.
OpenUrl
12.↵
Park SL, Cheng I, Haiman CA. Genome-Wide Association Studies of Cancer in Diverse Populations. Cancer Epidemiol Biomarkers Prev 2018;27(4):405.
OpenUrl Abstract/FREE Full Text
13.↵
Fejerman L, Stern MC, Ziv E, et al. Genetic ancestry modifies the association between genetic risk variants and breast cancer risk among Hispanic and non-Hispanic white women. Carcinogenesis 2013;34(8):1787–1793.
OpenUrl CrossRef PubMed
14.↵
Martin AR, Kanai M, Kamatani Y, et al. Hidden ‘risk’ in polygenic scores: clinical use today could exacerbate health disparities. bioRxiv 2018, http://biorxiv.org/content/early/2018/10/11/441261.abstract.
15.↵
Allman R, Dite GS, Hopper JL, et al. SNPs and breast cancer risk prediction for African American and Hispanic women. Breast Cancer Research and Treatment 2015;154(3):583–589.
OpenUrl
16.↵
United States Census Bureau. Facts for Features: Hispanic Heritage Month 2017 [online], https://www.census.gov/newsroom/facts-for-features/2017/hispanic-heritage.html (2018). Accessed 31 October 2018.
17.↵
Bertoni B, Budowle B, Sans M, et al. Admixture in Hispanics: distribution of ancestral population contributions in the Continental United States. Hum Biol 2003;75(1):1–11.
OpenUrl
18.
Ziv E, John EM, Choudhry S, et al. Genetic Ancestry and Risk Factors for Breast Cancer among Latinas in the San Francisco Bay Area. Cancer Epidemiology Biomarkers & Prevention 2006;15(10):1878–1885.
OpenUrl Abstract/FREE Full Text
19.↵
Bryc K, Velez C, Karafet T, et al. Genome-wide patterns of population structure and admixture among Hispanic/Latino populations. Proceedings of the National Academy of Sciences 2010;107(Supplement 2):8954–8961.
OpenUrl Abstract/FREE Full Text
20.↵
Fejerman L, Ahmadiyeh N, Hu D, et al. Genome-wide association study of breast cancer in Latinas identifies novel protective variants on 6q25. Nat Commun 2014;5:5260.
OpenUrl CrossRef PubMed
21.↵
Hoffman J, Fejerman L, Hu D, et al. Identification of novel common breast cancer risk variants at the 6q25 locus among Latinas. Breast Cancer Res 2019;21(1):3.
22.↵
John EM, Horn-Ross PL, Koo J. Lifetime physical activity and breast cancer risk in a multiethnic population: the San Francisco Bay area breast cancer study. Cancer Epidemiol Biomarkers Prev 2003;12(11 Pt 1):1143–52.
23.↵
John EM, Hopper JL, Beck JC, et al. The Breast Cancer Family Registry: an infrastructure for cooperative multinational, interdisciplinary and translational studies of the genetic epidemiology of breast cancer. Breast Cancer Res 2004;6(4):R375–89.
24.↵
Kvale MN, Hesselson S, Hoffmann TJ, et al. Genotyping Informatics and Quality Control for 100,000 Subjects in the Genetic Epidemiology Research on Adult Health and Aging (GERA) Cohort. Genetics 2015;200(4):1051–60.
25.↵
Kolonel LN, Henderson BE, Hankin JH, et al. A multiethnic cohort in Hawaii and Los Angeles: baseline characteristics. Am J Epidemiol 2000;151(4):346–57.
26.↵
Angeles-Llerenas A, Ortega-Olvera C, Perez-Rodriguez E, et al. Moderate physical activity and breast cancer risk: the effect of menopausal status. Cancer Causes Control 2010;21(4):577–86.
27.↵
The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 2015;526:68.
28.↵
MacDonald DJ, Blazer KR, Weitzel JN. Extending comprehensive cancer center expertise in clinical cancer genetics and genomics to diverse communities: the power of partnership. J Natl Compr Canc Netw 2010;8(5):615–24.
29.↵
Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 2009;19(9):1655–64.
30.↵
International HapMap Consortium. A haplotype map of the human genome. Nature 2005;437(7063):1299–1320.
31.↵
Galanter JM, Fernandez-Lopez JC, Gignoux CR, et al. Development of a panel of genome-wide ancestry informative markers to study admixture throughout the Americas. PLoS Genet 2012;8(3):e1002554.
32.↵
Drake KA, Torgerson DG, Gignoux CR, et al. A genome-wide association study of bronchodilator response in Latinos implicates rare variants. J Allergy Clin Immunol 2014;133(2):370–8.
33.↵
Ziv E, Tice JA, Sprague B, et al. Using Breast Cancer Risk Associated Polymorphisms to Identify Women for Breast Cancer Chemoprevention. PLoS One 2017;12(1):e0168601.
34.↵
Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 2010;26(17):2190–1.
35.↵
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44(3):837–845.
36.↵
Fejerman L, Romieu I, John EM, et al. European ancestry is positively associated with breast cancer risk in Mexican women. Cancer Epidemiol Biomarkers Prev 2010;19(4):1074–82.
37.↵
Fejerman L, John EM, Huntsman S, et al. Genetic ancestry and risk of breast cancer among U.S. Latinas. Cancer Res 2008;68(23):9723–8.
38.↵
Mavaddat N, Michailidou K, Dennis J, et al. Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. Am J Hum Genet 2019;104(1):21–34.
39.↵
Ziv E, Burchard EG. Human population structure and genetic association studies. Pharmacogenomics 2003;4(4):431–41.
40.↵
Conomos MP, Laurie CA, Stilp AM, et al. Genetic Diversity and Association Studies in US Hispanic/Latino Populations: Applications in the Hispanic Community Health Study/Study of Latinos. American journal of human genetics 2016;98(1):165–184.
41.↵
Kuchenbaecker KB, McGuffog L, Barrowdale D, et al. Evaluation of Polygenic Risk Scores for Breast and Ovarian Cancer Risk Prediction in BRCA1 and BRCA2 Mutation Carriers. J Natl Cancer Inst 2017;109(7).
42.↵
Sirugo G, Williams SM, Tishkoff SA. The Missing Diversity in Human Genetic Studies. Cell 2019;177(1):26–31.
43.↵
Gravel S, Henn BM, Gutenkunst RN, et al. Demographic history and rare allele sharing among human populations. Proc Natl Acad Sci U S A 2011;108(29):11983–8.

Posted April 12, 2019.
