Genetics of educational attainment aid in identifying biological subcategories of schizophrenia

Vikas Bansal; Marina Mitjans; Casper A.P. Burik; Richard Karlsson Linnér; Aysu Okbay; Cornelius A. Rietveld; Martin Begemann; Stefan Bonn; Stephan Ripke; Michel G. Nivard; Hannelore Ehrenreich; Philipp D. Koellinger

doi:10.1101/114405

ABSTRACT

Higher educational attainment (EA) is known to have a protective effect regarding the severity of schizophrenia (SZ). However, recent studies have found a small positive genetic correlation between EA and SZ. Here, we investigate possible causes of this counterintuitive finding using genome-wide association results for EA and SZ (n = 443,581) and a replication cohort (1,169 controls and 1,067 cases) with high-quality SZ phenotypes. We find strong genetic overlap between EA and SZ that cannot be explained by chance, linkage disequilibrium, or assortative mating. Instead, our results suggest that the current clinical diagnosis of SZ comprises at least two disease subtypes with non-identical symptoms and genetic architectures: One part resembles bipolar disorder (BIP) and high intelligence, while the other part is a cognitive disorder that is independent of BIP.

INTRODUCTION

Schizophrenia (SZ) is the collective term used for a severe, highly heterogeneous and costly psychiatric disorder that is caused by a complex interplay of environmental and genetic factors^1–4. The latest genome-wide association study (GWAS) by the Psychiatric Genomics Consortium (PGC) identified 108 genomic loci that are associated with SZ⁵. These 108 loci jointly account for ≈3.4% of the variation on the liability scale to SZ⁵, while all single nucleotide polymorphisms (SNPs) that are currently measured by SNP arrays capture ≈64% (s.e. = 8%) of the variation in liability to the disease⁶. This suggests that most of the genetic variants contributing to the heritability of SZ have very small effects and have not yet been isolated. This could be due in part to the fact that the clinical disease classification of SZ spans many different syndromes (e.g., catatonia, paranoia, grandiosity, difficulty in abstract thinking, thought blocking, social withdrawal, hallucinations) that may not have identical genetic architectures. Therefore, identifying additional genetic variants and understanding through which pathways they influence specific SZ syndromes is an important step in understanding the etiologies of the ‘schizophrenias’⁷. However, GWAS analyses of specific SZ syndromes would require very large sample sizes to be statistically well-powered, and the currently available datasets on deeply phenotyped SZ individuals are not yet large enough for this purpose.

Here, we use an alternative approach that combines data for SZ with another cognitive phenotype that can be studied in very large GWAS samples, namely educational attainment (EA). The relationship between SZ and EA is peculiar: phenotypic and genetic data yield contradictory results, and this contradiction can serve as an avenue to further our understanding of SZ. Phenotypic data seem to suggest a negative correlation between EA and SZ⁸. For example, SZ patients with lower EA typically show an earlier age of disease onset, higher levels of psychotic symptomatology, and worse global cognitive function⁸. In fact, EA has been suggested to be a measure of premorbid function and a predictor of outcomes in SZ. Moreover, it has been forcefully argued that delayed intellectual development during childhood and poor school performance should be seen as core features of SZ and early indicators of the disease that precede the development of psychotic symptoms^9,10. Furthermore, credible genetic links between SZ and impaired cognitive performance have been found¹¹.

In contrast to these findings, recent studies using GWAS results identified a small, but positive genetic correlation between EA and SZ^12,13. Here, we explore possible reasons for this contradictory result using the largest, non-overlapping GWAS samples on cognitive traits to date, totaling 443,581 individuals of European descent (the vast majority of observations coming from EA). For follow-up analyses, we use data from an independent replication sample that has exceptionally detailed measures of SZ symptoms, the GRAS (Göttingen Research Association for Schizophrenia) data collection^4,7,14.

As a first step, we used the proxy-phenotype method (PPM) to illustrate the genetic overlap between EA and SZ. As a side-result, this approach may isolate novel empirically plausible candidate genes for SZ, comparable to similar studies using PPM that have demonstrated this for cognitive performance¹⁵, Alzheimer’s disease, intracranial and hippocampal volume¹³, depression and neuroticism¹⁶. PPM is a two-stage approach that increases statistical power by using genetic association results from a large, independent sample for a related phenotype to limit the multiple testing burden for the phenotype of interest¹⁵. Previous evidence suggests a strong genetic overlap between EA and SZ, which implies that EA could be used as a proxy-phenotype for SZ because EA can be studied in much larger samples^13,16. However, compared to the present work, these previous studies used substantially smaller and partially overlapping samples and did not have access to an independent cohort that could be used for replication and follow-up analyses.

There are several possible reasons why EA-associated SNPs may also be associated with SZ. One possibility is that a set of genes that is generally important for all brain-related phenotypes is driving this enrichment. This hypothesis suggests that the set of genetic loci that our proxy-phenotype analysis identifies should be generally enriched for association with all brain-related phenotypes, but not for non-brain-related outcomes. To investigate this possibility, we test genetic loci that are jointly associated with EA and SZ for enrichment across 21 additional traits (Supplementary Note).

Second, enrichment could also be a generic consequence of EA-associated SNPs exhibiting above average linkage disequilibrium (LD) with neighboring SNPs. This would increase the probability that these SNPs “tag” other genetic variants that are associated with SZ, or any other disorder¹². To investigate this possibility, we propose a measure that tests for enrichment beyond what is expected for each EA related SNP given its LD with its neighbors (Supplementary Note).

A third possible cause of strong enrichment and weak genetic correlation is heterogeneity in SZ—i.e., sub-types of the disease having different biological causes and varying genetic correlations with EA. Heterogeneity in the disease may also be a reason why previous studies did not succeed in predicting specific syndromes of SZ using a “normal” polygenic score (PGS) that was derived from large-scale GWAS on SZ, which implicitly assumed that all SZ-associated SNPs influence all syndromes in the same way^4,17. If heterogeneity in the disease is causing the observed enrichment of EA with SZ, the sign concordance pattern of SNPs with both traits may contain relevant information that is pertinent to specific SZ syndromes. We tested this by constructing PGS in our replication cohort with high-quality SZ phenotypes that take the sign concordance of SNPs for EA and SZ into account (Supplementary Note). As a robustness check, we repeat this analysis excluding patients diagnosed with schizoaffective disorder.

A fourth possible cause of enrichment is that other phenotypes are genetically correlated with both EA and SZ. Previous studies indicated a particularly strong positive genetic correlation between SZ and bipolar disorder (BIP), which may influence the genetic overlap of both diseases with related phenotypes such as EA, childhood intelligence (IQ), and neuroticism^12,13,18. We use genome-wide inferred statistics (GWIS) that allow controlling for the genetic correlation between SZ and BIP to investigate how “unique” SZ (controlling for BIP) and “unique” BIP (controlling for SZ) are related to EA, childhood IQ, and neuroticism¹⁸.

A fifth possible cause may be assortative mating, which has been demonstrated both for EA¹⁹ and SZ²⁰. We use simulations to explore if independent assortative mating for the two phenotypes may induce a spurious genetic overlap.

This list of potential causes for the genetic overlap between EA and SZ may not be exhaustive and several of these factors may be at work simultaneously.

RESULTS

Proxy-phenotype analyses

Figure 1 presents an overview of the proxy-phenotype analyses. The first-stage GWAS on EA (Supplementary Note) identified 506 loci that passed our predefined threshold of P_EA < 10⁻⁵ (https://osf.io/dnhfk/); 108 of them were genome-wide significant (P_EA < 5 × 10⁻⁸, see Supplementary Table 5.1). Of the 506 EA lead-SNPs, 132 are associated with SZ at nominal significance (P_SZ < 0.05), and 21 of these survive Bonferroni correction (P_SZ < 0.05/506 = 9.88 × 10⁻⁵). LD score regression results suggest that the vast majority of the association signal in both the EA¹³ and the SZ⁵ GWAS is truly genetic rather than a spurious signal originating from uncontrolled population stratification. Figure 2a shows a Manhattan plot for the GWAS on EA highlighting SNPs that were also significantly associated with SZ (red crosses for P_SZ < 0.05, green crosses for P_SZ < 9.88 × 10⁻⁵).

Figure 1: Workflow of the proxy-phenotype analyses

Figure 2: Results of the proxy-phenotype analyses.

A Q-Q plot of the 506 EA lead SNPs for SZ is shown in Figure 2b. Although the observed sign concordance of 52% is not significantly different from a random pattern (P = 0.40), we find 3.23 times more SNPs in this set of 506 SNPs that are nominally significant for SZ than expected given the distribution of the P values in the SZ GWAS results (raw enrichment P = 6.87 × 10⁻¹⁰, Supplementary Note). The observed enrichment of the 21 EA lead SNPs that pass Bonferroni correction for SZ is even more pronounced (27 times stronger, P = 5.44 × 10⁻¹⁴).
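The raw enrichment factor can be illustrated with a minimal sketch. It assumes that the factor is the share of lead SNPs below a P value cutoff divided by the share of all SNPs in the background GWAS below the same cutoff; the function name and the toy usage are hypothetical, and the exact procedure is the one described in the Supplementary Note.

```python
def raw_enrichment_factor(lead_pvalues, all_pvalues, alpha=0.05):
    """Share of lead SNPs with P < alpha divided by the share among all SNPs.

    Assumption: this ratio is what a "raw" enrichment factor measures;
    LD between SNPs is ignored, as in the raw test described in the text.
    """
    if not lead_pvalues or not all_pvalues:
        raise ValueError("both P value lists must be non-empty")
    observed = sum(p < alpha for p in lead_pvalues) / len(lead_pvalues)
    baseline = sum(p < alpha for p in all_pvalues) / len(all_pvalues)
    if baseline == 0:
        raise ValueError("no SNP in the background passes alpha")
    return observed / baseline


# Figures reported in the text: 132 of 506 lead SNPs, enrichment factor 3.23.
observed_share = 132 / 506
implied_baseline = observed_share / 3.23
print(f"observed share of nominally significant lead SNPs: {observed_share:.3f}")
print(f"implied share in the full SZ GWAS results: {implied_baseline:.3f}")
```

Under this reading, the reported numbers imply that roughly 8% of all SNPs in the SZ GWAS results have P_SZ < 0.05, which is why the expected count is based on the empirical P value distribution rather than on a flat 5%.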

Bayesian credibility of the results

The effect sizes of these 21 SNPs on SZ are small, ranging from Odds = 1.02 (rs4500960) to Odds = 1.11 (rs4378243) after winner’s curse correction (Table 1). However, Bayesian calculations with reasonable prior beliefs (e.g., 1% or 5%, Supplementary Note) suggest that most of these 21 SNPs are likely or virtually certain to be truly associated with SZ.
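The exact Bayesian calculation is given in the Supplementary Note. As a rough illustration of how a heuristic of this kind can work, the sketch below assumes the common form in which the posterior probability of a true association is prior × power divided by (prior × power + (1 − prior) × α). The function name and the power value of 0.5 are hypothetical choices made only for this example.

```python
def posterior_true_association(prior, power, alpha):
    """Posterior probability that a significant SNP is truly associated.

    Assumed heuristic: Bayes' rule with 'power' as the chance of detecting
    a true effect and 'alpha' as the chance of a false positive.
    """
    if not (0 < prior < 1 and 0 < power <= 1 and 0 < alpha < 1):
        raise ValueError("prior and alpha must lie in (0, 1), power in (0, 1]")
    true_positive = prior * power
    false_positive = (1 - prior) * alpha
    return true_positive / (true_positive + false_positive)


bonferroni_alpha = 0.05 / 506  # threshold for the 506 EA lead SNPs
for prior in (0.01, 0.05):     # prior beliefs mentioned in the text
    posterior = posterior_true_association(prior, power=0.5, alpha=bonferroni_alpha)
    print(f"prior = {prior:.2f} -> posterior = {posterior:.3f}")
```

Even a skeptical prior of 1% yields a high posterior in this toy setting because the Bonferroni threshold makes false positives rare relative to true positives.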

Table 1: SNPs significantly associated with schizophrenia after Bonferroni correction.

Prediction of future genome-wide significant loci for schizophrenia

Of the 21 variants we identified, 12 are in LD with loci previously reported by the PGC⁵ and 2 are in the major histocompatibility complex (MHC) region on chromosome 6 and were therefore not separately reported in that study. Three of the variants we isolated (rs7610856, rs143283559, rs28360516) were independently found in a recent meta-analysis of the PGC results⁵ with another large-scale sample²¹. We show in the Supplementary Note that using EA as a proxy-phenotype for SZ helped to predict the novel genome-wide significant findings reported in that study, which illustrates the power of the proxy-phenotype approach. Furthermore, two of the 21 variants (rs756912, rs7593947) are in LD with loci recently reported in a study that also compared GWAS findings from EA and SZ using smaller samples and a less conservative statistical approach²² (Supplementary Note). The remaining 2 SNPs we identified (rs7336518 on chr13 and rs7522116 on chr 1) add to the list of empirically plausible candidate loci for SZ.

LD-aware enrichment across different traits

Figure 3 and Supplementary Table 5.2 show the LD-aware enrichment of the SNPs that are jointly associated with EA and SZ across 22 traits. We find significant joint LD-aware enrichment of this set of SNPs for SZ, BIP, neuroticism and childhood IQ, and for inflammatory bowel disease and age at menarche. However, we find no LD-aware enrichment for other brain-traits that are phenotypically related to SZ, such as depressive symptoms, subjective well-being, autism, and attention deficit hyperactivity disorder. We also do not find LD-aware enrichment for most traits that are less obviously related to the brain (e.g., BMI, coronary artery disease) and our negative controls (e.g., fasting insulin, birth weight, birth length). Furthermore, one of the novel SNPs we isolated shows significant LD-aware enrichment both for SZ and for BIP (rs7522116).

Figure 3: LD-aware enrichment across traits for SNPs that are jointly associated with EA (P_EA < 10⁻⁵) and SZ (P_SZ < 0.05).

Replication in the GRAS sample

A PGS based on the 132 loci jointly associated with both EA and SZ (SZ_132) adds ΔR² = 7.54% – 7.01% = 0.53% predictive accuracy for the SZ case-control status to a PGS (SZ_all) derived from the GWAS on SZ alone (P = 1.7 × 10⁻⁴, Table 2, Model 3). The SZ_132 score also significantly adds (P = 3.4 × 10⁻⁴) to the predictive accuracy of the SZ case-control status when all other scores we constructed are included as control variables. In addition to SZ_132, PGS for SZ (SZ_all) and for BIP (BIP_all) also predict case-control status, jointly reaching an adjusted ΔR² of ≈ 9% (Table 2, Model 9 and Supplementary Note).

Table 2: Polygenic prediction of schizophrenia status in the GRAS sample.

Polygenic prediction of schizophrenia measures in the GRAS sample

We find that the number of years of education is phenotypically correlated with later age at prodrome, later onset of disease, and less severe disease symptoms among SZ patients in the GRAS sample (Supplementary Note, Supplementary Table 8.1 and Supplementary Fig. 1). The EA_all score is associated with years of education (P = 2.6 × 10⁻⁶) and premorbid IQ (P = 2.3 × 10⁻⁴) among SZ patients (Supplementary Note and Table 3). Consistent with earlier results⁴, we find that none of the SZ measures can be predicted by the normal SZ PGS (SZ_all, Supplementary Table 8.2). Importantly, by utilizing GWAS results from both EA and SZ, we show that it is possible to predict specific features of SZ (Global Assessment of Functioning (GAF), Clinical Global Impression of Severity (CGI-S), and Positive and Negative Syndrome Scale (PANSS)) from genetic data. In a multiple regression analysis²³ that allows a “ceteris paribus” interpretation of the included variables, we find that the EA_all score is associated with less severe disease outcomes only if we condition on the effects of the Concordant and Discordant scores. And conditional on the EA_all score, the Concordant and Discordant scores are associated with more (less) severe positive and negative symptoms as measured by the PANSS scale, respectively (Table 3). The best predictive accuracy of SZ readouts using these scores is currently observed for GAF (R² = 1.38%). Of note, several of the symptoms measured by PANSS are also symptoms of BIP. The degree and composition of symptoms varies with the phase at evaluation (manic or depressive) and the general disease severity. We repeated these analyses excluding patients who were diagnosed with schizoaffective disorder (SD) and found similar results, implying that the genetic heterogeneity in SZ that we identify is not only due to SD (Supplementary Note, Supplementary Table 8.4.a).

Table 3: Polygenic risk prediction of schizophrenia outcomes in the GRAS sample.

Controlling for the genetic overlap between schizophrenia and bipolar disorder

None of the EA-associated lead SNPs (P_EA < 5 × 10⁻⁸) are significantly associated with “unique” SZ_{(min BIP)} after Bonferroni correction (Supplementary Table 9.1, Supplementary Note). The sign concordance of the EA lead SNPs with “unique” SZ_{(min BIP)} was 44.5% (P = 0.046). Supplementary Figure 2 shows a Q-Q plot of the EA lead-SNPs for “unique” SZ_{(min BIP)}. Although we find 1.6 times more EA-associated SNPs with P_SZunique < 0.05 than expected by chance (raw enrichment P = 0.02, Supplementary Note), the enrichment is much weaker than in the main SZ GWAS results that did not control for the genetic overlap between SZ and BIP. The genetic correlations between EA and SZ_{(min BIP)}, and between IQ and SZ_{(min BIP)}, are negative and significant (r_g = -0.16, P = 3.88 × 10⁻⁴ and r_g = -0.31, P = 6.00 × 10⁻³, respectively), which is in line with the idea of SZ being a cognitive disorder⁹. Furthermore, the genetic correlations of EA and IQ with BIP_{(min SZ)} remain positive and get somewhat stronger (r_g = 0.31, P = 2.87 × 10⁻⁷ and r_g = 0.33, P = 3.18 × 10⁻², respectively) compared with the ordinary BIP GWAS results. However, controlling for the genetic overlap of SZ and BIP does not affect the genetic correlations with neuroticism (Figure 4).

Figure 4: Genetic correlations of GWAS and GWIS results that are central to the relationship between SZ and EA.

Simulations of assortative mating

Our simulations for assortative mating were based on relatively extreme assumptions that increased our chance of finding spurious enrichment of EA loci for SZ. The results suggest it is unlikely that assortative mating is a major cause for the genetic overlap we observe between EA and SZ (Supplementary Fig. 3).

Biological annotations

Biological annotation of the 132 SNPs that are jointly associated with EA and SZ using DEPICT identified 111 significant reconstituted gene sets (Supplementary Table 10.1). Pruning these resulted in 19 representative gene sets including dendrites, axon guidance, transmission across chemical synapses, and abnormal cerebral cortex morphology (Supplementary Table 10.2 and Figure 5a). All significantly enriched tissues are related to the nervous system and sense organs (Figure 5b). Furthermore, “Neural Stem Cells” is the only significantly enriched cell-type (Supplementary Table 10.3). DEPICT prioritized genes that are known to be involved in neurogenesis and synapse formation (Supplementary Table 10.4). Some of the genes, including SEMA6D and CSPG5, have been suggested to play a potential role in SZ^24,25. For the two novel candidate SNPs reported in this study (rs7522116 and rs7336518), DEPICT points to the FOXO6 (Forkhead Box O6) and the SLITRK1 (SLIT and NTRK Like Family Member 1) genes, respectively. FOXO6 is predominantly expressed in the hippocampus and has been suggested to be involved in memory consolidation, emotion and synaptic function^26,27. Similarly, SLITRK1 is also highly expressed in the brain²⁸, particularly localized to excitatory synapses and promoting their development²⁹, and it has previously been suggested to be a candidate gene for neuropsychiatric disorders³⁰.

Figure 5: Biological annotation of SNPs that are jointly associated with EA (P_EA < 10⁻⁵) and SZ (P_SZ < 0.05).

DISCUSSION

We explored the genetic overlap between EA and SZ using the largest currently available GWAS sample on human cognitive traits. Using EA as a proxy-phenotype, we identified 21 genetic loci for SZ and showed that this approach helps to predict future GWAS hits for SZ. We isolated two additional candidate genes for SZ, FOXO6 and SLITRK1. Our results show that EA-associated SNPs are much more likely to also be associated with SZ than expected by chance. However, these genetic loci do not influence both traits with a systematic sign pattern that would correspond to a strong positive or negative genetic correlation.

The results of our follow-up analyses are most consistent with two hypotheses that complement each other: First, the genetic overlap between EA and SZ is to some extent induced by pleiotropic effects of many genes that affect not only EA and SZ but also other traits such as BIP and IQ. Second, different syndromes of SZ (e.g., low cognitive performance and psychosis) seem to be driven by different genetic effects. The clinical diagnosis of SZ aggregates over these different syndromes. In particular, our results suggest that the current clinical diagnosis of SZ comprises at least two disease subtypes with nonidentical symptomatology and genetic architectures: One part resembles bipolar disorder (BIP) and high intelligence, while the other part is a cognitive disorder that is independent of BIP. Consistent with this idea, we find that PGS that take the sign concordance of SNPs with EA and SZ into account begin for the first time to predict specific SZ features from genetic data (R² between 0.4% and 1.4%), while this was not possible with “ordinary” PGS for SZ.

Other mechanisms that we explored, in particular LD-patterns of the EA-associated SNPs and assortative mating, do not seem to be major drivers of the genetic overlap between EA and SZ. Furthermore, the loci we identified in our PPM analysis do not seem to be associated with all brain-related phenotypes, suggesting some degree of phenotype-specificity of the results. We note that the enrichment for age at menarche of the SNPs that are jointly associated with EA and SZ may be related to the final stage of brain development which coincides with the onset of puberty^31–34.

The highly complex genetic architecture of the “schizophrenias” that our results point to implies that most patients will have individual-specific genetic loads for either subtype of the disease, contributing to individual differences in symptoms. The genetic heterogeneity we identified could imply that treatments will vary in their effectiveness across disease subtypes.

Overall, our study corroborates that EA is a useful proxy-phenotype for psychiatric outcomes. Specifically, combining GWAS results from EA and SZ led to the identification of two seemingly distinct subcategories of SZ. Even though each of them may still harbor highly heterogeneous disease subgroups, the new subcategories can pave the way for further biological subgroup analyses. Therefore, a psychiatric nosology that is based on biological causes rather than purely phenotypic classifications may be feasible in the future. Studies that combine well-powered GWAS from several diseases and from phenotypes that represent variation in the normal range, such as EA, are likely to play an important part in this development. However, deep phenotyping of large patient samples will be indispensable for linking GWAS results from complex outcomes such as EA and SZ to specific biological disease subgroups.

AUTHOR CONTRIBUTIONS

P.D.K. designed and oversaw the study and conducted proxy-phenotype analyses. V.B. and M.M. carried out analyses in the GRAS sample. V.B. conducted bioinformatics and computed the LD-aware enrichment tests, which were developed by M.N. C.A.P.B. conducted simulation analyses. M.N. computed GWIS results and genetic correlations. R.K.L. assisted with biological annotation and visualization of results. P.D.K., V.B., M.M., and H.E. made especially major contributions to writing and editing. All authors contributed to and critically reviewed the manuscript.

COMPETING FINANCIAL INTERESTS

The authors declare no conflict of interests.

ONLINE METHODS

All reported statistical results are based on two-sided tests, unless indicated otherwise. Our proxy-phenotype analyses and our replication strategy followed a pre-registered analysis plan (https://osf.io/dnhfk/). The full description of all materials and methods is provided in the Supplementary Note.

GWAS on educational attainment

The EA sample excluded all cohorts that participated in the GWAS on SZ described below, yielding a sample size of n = 363,502 individuals of European descent¹³. The GRAS replication sample was not part of the GWAS on EA, either.

GWAS on schizophrenia

The SZ sample consisted of n = 34,409 cases diagnosed with SZ or schizoaffective disorder and n = 45,670 controls⁵. We excluded the GRAS data collection from the GWAS on SZ.

Proxy-phenotype look-up

Analyses were carried out using 8,240,280 autosomal SNPs that passed quality controls in both GWAS and additional filters described in the Supplementary Note. We selected approximately independent lead SNPs from the EA GWAS results using the clumping procedure in PLINK³⁵. We looked up the SZ results of all approximately independent EA lead-SNPs that passed the pre-defined significance threshold of P_EA < 10⁻⁵.
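The two-stage look-up can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: clumping is assumed to have been done already (for example with PLINK), the function name and the SNP identifiers in the toy data are hypothetical, and the Bonferroni threshold is assumed to be the nominal level divided by the number of EA lead SNPs that were looked up.

```python
def proxy_phenotype_lookup(ea_lead_snps, sz_pvalues,
                           first_stage_alpha=1e-5, nominal_alpha=0.05):
    """Look up first-stage (EA) lead SNPs in second-stage (SZ) results.

    ea_lead_snps: dict mapping SNP id -> P value in the EA GWAS
                  (approximately independent lead SNPs after clumping).
    sz_pvalues:   dict mapping SNP id -> P value in the SZ GWAS.
    """
    candidates = [snp for snp, p_ea in ea_lead_snps.items()
                  if p_ea < first_stage_alpha and snp in sz_pvalues]
    if not candidates:
        return {"bonferroni_alpha": None, "nominal": [], "bonferroni": []}
    # Only the looked-up SNPs are tested, so the multiple testing burden
    # is len(candidates) rather than the number of SNPs in the genome.
    bonferroni_alpha = nominal_alpha / len(candidates)
    nominal = [s for s in candidates if sz_pvalues[s] < nominal_alpha]
    bonferroni = [s for s in candidates if sz_pvalues[s] < bonferroni_alpha]
    return {"bonferroni_alpha": bonferroni_alpha,
            "nominal": nominal, "bonferroni": bonferroni}


# Toy data with made-up SNP ids and P values.
ea = {"snp_a": 1e-8, "snp_b": 5e-6, "snp_c": 1e-3, "snp_d": 2e-6}
sz = {"snp_a": 1e-4, "snp_b": 0.03, "snp_c": 1e-10, "snp_d": 0.4}
result = proxy_phenotype_lookup(ea, sz)
print(result)
print(f"threshold for 506 lead SNPs: {0.05 / 506:.2e}")
```

Note that snp_c is never tested in the second stage despite its small SZ P value, because it did not pass the first-stage threshold; restricting the second stage in this way is what limits the multiple testing burden.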

We tested if the observed sign concordance between EA and SZ is different from 50% using the binomial probability test³⁶. “Raw” enrichment factors and “raw” enrichment p-values of the EA lead-SNPs on SZ were calculated by taking the actual distribution of P values in the SZ GWAS result files into account but ignoring the LD scores^12,37.
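As an illustration of the sign concordance test, the sketch below implements an exact two-sided binomial test against 50% using only the standard library. The count of 263 concordant SNPs is an assumption derived from the rounded 52% of 506 reported in the Results; the exact count is not stated in the text, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, trials):
    """Exact two-sided binomial test of H0: success probability = 0.5.

    Under H0 the distribution is symmetric, so the two-sided P value is
    twice the probability of the more extreme tail, capped at 1.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    tail_start = max(successes, trials - successes)
    upper_tail = sum(comb(trials, k) for k in range(tail_start, trials + 1))
    return min(1.0, 2 * upper_tail / 2 ** trials)


# Assumed count: 52% of 506 lead SNPs is about 263 concordant signs.
p_value = binomial_two_sided_p(263, 506)
print(f"two-sided P for 263 of 506 concordant signs: {p_value:.2f}")
```

Integer arithmetic is used for the tail sum so that the very large binomial coefficients do not lose precision before the final division.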

LD-aware enrichment across different traits

We developed an enrichment test that corrects for the LD score of each SNP (Supplementary Note). We conducted this test for the 132 SNPs that are jointly associated with EA and SZ in our proxy-phenotype analyses (P_EA < 10⁻⁵ and P_SZ < 0.05). LD scores were obtained from the HapMap 3 European reference panel. We investigated SZ and 21 additional traits for which GWAS results were available in the public domain. Some of the traits were chosen because they are phenotypically related to SZ (e.g., BIP), while others were less obviously related to SZ (e.g., age at menarche) or served as negative controls (e.g., fasting insulin). If one of the 132 candidate SNPs was not available in the reference panel or in the GWAS results of the other traits, we tried to use a good proxy, yielding 79 to 105 available SNPs per trait.

Phenotypic correlations

We explored the correlations between the number of years of education and 7 quantitative measures of SZ in the GRAS sample of SZ cases: age at prodrome, age at disease onset, premorbid IQ, GAF, CGI-S, and PANSS positive and PANSS negative scores.

Replication and Bayesian credibility of results

Our replication uses a PGS in the GRAS data collection, which is based on the 132 independent EA lead-SNPs that are also nominally associated with SZ (P_EA < 10⁻⁵ and P_SZ < 0.05). This PGS (called SZ_132) was constructed using the regression coefficient estimates of the SZ GWAS as weights. In addition to this polygenic replication strategy, we further probed the credibility of our results using a heuristic Bayesian calculation.
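A polygenic score of this kind is a weighted sum of allele counts. The sketch below shows the idea for a single individual, assuming that genotypes are coded as counts (0 to 2) of the effect allele and that SNPs without a genotype are simply skipped; the function name, the SNP ids, and the weights are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele counts for one individual.

    dosages: dict mapping SNP id -> count of the effect allele (0 to 2).
    weights: dict mapping SNP id -> GWAS regression coefficient
             (for SZ_132, the SZ GWAS estimates of the 132 SNPs).
    SNPs missing from 'dosages' are skipped (an assumption of this sketch).
    """
    return sum(weight * dosages[snp]
               for snp, weight in weights.items() if snp in dosages)


# Toy example with made-up coefficients and genotypes.
sz_weights = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}
person = {"snp_a": 2, "snp_b": 1}  # snp_c not genotyped
print(f"score: {polygenic_score(person, sz_weights):.2f}")
```

The same function yields an EA-weighted score for the same SNP set if EA GWAS coefficients are passed as weights instead.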

Polygenic prediction of schizophrenia symptoms in the GRAS sample

We predicted the number of years of education and 7 quantitative measures of SZ in the GRAS sample of SZ cases. For each phenotype, we separately compared the predictive performance of several PGS: Scores constructed from the full GWAS result on SZ, EA, BIP, and neuroticism (called SZ_all, EA_all, BIP_all, Neuro_all, respectively); scores constructed using only the 132 SNPs that are jointly associated with EA and SZ (called EA_132 and SZ_132, using EA and SZ GWAS coefficients as weights, respectively); and two scores that split the SZ_all score into two parts based on sets of SNPs that either have concordant or discordant effects on EA and SZ (called Concordant and Discordant). Genetic outliers of self-reported non-European descent (n = 13 cases) were excluded from the analysis.
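The split into Concordant and Discordant scores can be sketched as a partition of the SZ weights by sign agreement with EA. The sketch assumes that both GWAS report effects for the same allele of each SNP and that SNPs with a zero or missing EA effect are left out; the function name and toy values are hypothetical, and the exact rules are those in the Supplementary Note.

```python
def split_by_sign_concordance(beta_ea, beta_sz):
    """Partition SZ weights by whether EA and SZ effects share a sign.

    beta_ea, beta_sz: dicts mapping SNP id -> GWAS coefficient, both
    expressed for the same effect allele (an assumption of this sketch).
    Returns two dicts of SZ weights: (concordant, discordant).
    """
    concordant, discordant = {}, {}
    for snp, b_sz in beta_sz.items():
        b_ea = beta_ea.get(snp)
        if b_ea is None or b_ea == 0 or b_sz == 0:
            continue  # sign concordance is undefined for this SNP
        target = concordant if (b_ea > 0) == (b_sz > 0) else discordant
        target[snp] = b_sz
    return concordant, discordant


# Toy coefficients for illustration only.
ea_effects = {"snp_a": 0.02, "snp_b": -0.01, "snp_c": 0.03}
sz_effects = {"snp_a": 0.05, "snp_b": 0.04, "snp_c": -0.02, "snp_d": 0.01}
concordant_weights, discordant_weights = split_by_sign_concordance(ea_effects, sz_effects)
print("Concordant:", concordant_weights)
print("Discordant:", discordant_weights)
```

Each set of weights can then be turned into its own weighted sum of allele counts, so that the two scores together cover the SNPs of the full SZ score for which sign concordance is defined.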

Controlling for the genetic overlap between schizophrenia and bipolar disorder

We estimated GWIS¹⁸ to obtain SNP regression coefficients that are unique to SZ, which are corrected for the genetic overlap between SZ and BIP. The SZ samples used in the GWIS are not overlapping with the samples used in the EA GWAS and they exclude our replication sample (GRAS). BIP GWAS results were obtained from the PGC³⁸. We refer to the set of obtained summary statistics as “unique” SZ_{(min BIP)}. We then repeated the look-up of the EA-associated lead SNPs in those summary statistics as described above. Similarly, we obtained GWIS results for “unique” BIP_{(min SZ)} using the same method and data. We computed genetic correlations of these GWIS results with EA, childhood intelligence, and neuroticism using bivariate LD score regression¹² and compared the results to those obtained using ordinary SZ and BIP GWAS results.

Simulations of assortative mating

We conducted simulations to test if strong assortative mating on EA and SZ can induce a spurious genetic overlap between the two traits.

Biological annotation

To gain first insights into possible biological pathways that are indicated by the genetic loci identified by our PPM analysis, we applied DEPICT^13,39 using a false discovery rate (FDR) threshold of ≤ 0.05. To identify independent biological groupings, we used the Affinity Propagation method based on the Pearson distance matrix for clustering⁴⁰.

Data availability

GWAS meta-analysis results for EA and SZ as well as GWIS results for “unique” SZ_{(min BIP)} and “unique” BIP_{(min SZ)} can be downloaded from the SSGAC website (https://www.thessgac.org/). For information about the GRAS data collection, contact the principal investigator of the study: Hannelore Ehrenreich (ehrenreich{at}em.mpg.de).

Code availability

Computer code used to generate LD-aware enrichment and GWIS results can be downloaded from the SSGAC website (https://www.thessgac.org/).

ACKNOWLEDGMENTS

This research was carried out under the auspices of the Social Science Genetic Association Consortium (SSGAC), including use of the UK Biobank Resource. We thank all research consortia that provide access to GWAS summary statistics in the public domain. Specifically, we acknowledge data access from the Psychiatric Genomics Consortium (PGC), the Genetic Investigation of ANthropometric Traits Consortium (GIANT), the International Inflammatory Bowel Disease Genetics Consortium (IIBDGC), the International Genomics of Alzheimer’s Project (IGAP), the CARDIoGRAMplusC4D Consortium, the Reproductive Genetics Consortium (ReproGen), the Tobacco and Genetics Consortium (TAG), the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC), the ENIGMA Consortium, and the Childhood Intelligence Consortium (CHIC). We would like to thank the customers and employees of 23andMe for making this work possible as well as Joyce J. Tung, Nick A. Furlotte, and David A. Hinds from the 23andMe research team. This study was supported by funding from an ERC Consolidator Grant (647648 EdGe, Philipp D. Koellinger), the Max Planck Society, the Max Planck Förderstiftung, the DFG (CNMPB), EXTRABRAIN EU-FP7, the Niedersachsen-Research Network on Neuroinfectiology (N-RENNT), and EU-AIMS. Michel G. Nivard was supported by a Royal Netherlands Academy of Science Professor Award to Dorret I. Boomsma (PAH/6635). Additional acknowledgements are provided in the Supplementary Online Materials.

Footnotes

13 These authors jointly directed this work.
^a The actual N per SNP was not provided in the SZ GWAS summary statistics.
^b It is typically assumed that GWAS data for European populations contain ≈1,000,000 independent loci. However, the quality-control procedures for GWAS summary statistics in studies like ours decrease the number of independent loci to <1,000,000^41,62. In fact, clumping the post-QC GWAS results for SZ without a P value threshold, an R²_LD<0.1, and a LD-window of 1,000,000 kb with the 1000 Genomes phase 1 version 3 European reference panel⁴² leads to only 223,065 independent loci. Thus, assuming 500,000 independent loci in these calculations is conservative.
^c URL: https://www.ebi.ac.uk/gwas/api/search/downloads/full

REFERENCES

[1] Knapp M, Mangalore R, Simon J. The global costs of schizophrenia. Schizophr Bull 2004; 30: 279–293.
[2] Sullivan PF, Kendler KS, Neale MC. Schizophrenia as a complex trait. Arch Gen Psychiatry 2003; 60: 1187.
[3] Polderman TJC, Benyamin B, de Leeuw CA, Sullivan PF, van Bochoven A, Visscher PM et al. Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat Genet 2015; 47: 702–709.
[4] Stepniak B, Papiol S, Hammer C, Ramin A, Everts S, Hennig L et al. Accumulated environmental risk determining age at schizophrenia onset: a deep phenotyping-based study. The Lancet Psychiatry 2014; 1: 444–453.
[5] Ripke S, Neale BM, Corvin A, Walters JTR, Farh K-H, Holmans PA et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature 2014; 511: 421–427.
[6] Bhatia G, Gusev A, Loh P, Vilhjálmsson BJ, Ripke S, PGC et al. Haplotypes of common SNPs can explain missing heritability of complex diseases. 2015 http://dx.doi.org/10.1101/022418.
[7] Ehrenreich H, Mitjans M, Van der Auwera S, Centeno TP, Begemann M, Grabe HJ, Bonn S, Nave K-A. OTTO: a new strategy to extract mental disease-relevant combinations of GWAS hits from individuals. Mol Psychiatry 2017. doi:10.1038/mp.2016.208.
[8] Swanson CL, Gur RC, Bilker W, Petty RG, Gur RE. Premorbid educational attainment in schizophrenia: association with symptoms, functioning, and neurobehavioral measures. Biol Psychiatry 1998; 44: 739–747.
[9] Kahn RS, Keefe RSE. Schizophrenia is a cognitive illness. JAMA Psychiatry 2013; 70: 1107.
[10] Kraepelin E. Psychiatrie: Ein Lehrbuch für Studierende und Ärzte. 4th ed. Verlag von Johann Ambrosius Barth: Leipzig, Germany, 1893.
[11] Stefansson H, Meyer-Lindenberg A, Steinberg S, Magnusdottir B, Morgen K, Arnarsdottir S et al. CNVs conferring risk of autism or schizophrenia affect cognition in controls. Nature 2013; 505: 361–366.
[12] Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, Day FR, Consortium R et al. An atlas of genetic correlations across human diseases and traits. Nat Genet 2015; 47: 1236–1241.
[13] Okbay A, Beauchamp JP, Fontana MA, Lee JJ, Pers TH, Rietveld CA et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature 2016; 533: 539–542.
[14] Ribbe K, Friedrichs H, Begemann M, Grube S, Papiol S, Kästner A et al. The cross-sectional GRAS sample: A comprehensive phenotypical data collection of schizophrenic patients. BMC Psychiatry 2010; 10: 91.
[15] Rietveld CA, Esko TT, Davies G, Pers TH, Turley PA, Benyamin B et al. Common genetic variants associated with cognitive performance identified using the proxy-phenotype method. Proc Natl Acad Sci U S A 2014; 111: 13790–13794.
[16] Okbay A, Baselmans BML, Neve J-E De, Turley P, Nivard MG, Fontana MA et al. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nat Genet doi:10.1038/ng.3552.
[17] Goes FS, McGrath J, Avramopoulos D, Wolyniec P, Pirooznia M, Ruczinski I et al. Genome-wide association study of schizophrenia in Ashkenazi Jews. Am J Med Genet Part B Neuropsychiatr Genet 2015; 168: 649–659.
[18] Nieuwboer HA, Pool R, Dolan CV, Boomsma DI, Nivard MG. GWIS: Genome-wide inferred statistics for functions of multiple phenotypes. Am J Hum Genet 2016; 99: 917–927.
[19] Hugh-Jones D, Verweij KJH, St. Pourcain B, Abdellaoui A. Assortative mating on educational attainment leads to genetic spousal resemblance for polygenic scores. Intelligence 2016; 59: 103–108.
[20] Nordsletten AE, Larsson H, Crowley JJ, Almqvist C, Lichtenstein P, Mataix-Cols D et al. Patterns of nonrandom mating within and across 11 major psychiatric disorders. JAMA Psychiatry 2016; 73: 354.
[21] Pardiñas AF, Holmans P, Pocklington AJ, Escott-Price V, Ripke S, Carrera N et al. Common schizophrenia alleles are enriched in mutation-intolerant genes and maintained by background selection. 2016 http://dx.doi.org/10.1101/068593.
[22] Le Hellard S, Wang Y, Witoelar A, Zuber V, Bettella F, Hugdahl K et al. Identification of gene loci that overlap between schizophrenia and educational attainment. Schizophr Bull 2016. doi:10.1093/schbul/sbw085.
[23] Wooldridge JM. Multiple Regression Analysis: Estimation. In: Introductory Econometrics: A Modern Approach. Cengage Learning, 2013, pp 70–76.
[24] So H-C, Fong PY, Chen RYL, Hui TCK, Ng MYM, Cherny SS et al. Identification of neuroglycan C and interacting partners as potential susceptibility genes for schizophrenia in a Southern Chinese population. Am J Med Genet Part B Neuropsychiatr Genet 2010; 153B: 103–113.
[25] Arion D, Horváth S, Lewis DA, Mirnics K. Infragranular gene expression disturbances in the prefrontal cortex in schizophrenia: signature of altered neural development? Neurobiol Dis 2010; 37: 738–46.
[26] Salih DAM, Rashid AJ, Colas D, de la Torre-Ubieta L, Zhu RP, Morgan AA et al. FoxO6 regulates memory consolidation and synaptic function. Genes Dev 2012; 26: 2780–2801.
[27] Maiese K. FoxO Proteins in the Nervous System. Anal Cell Pathol 2015; 2015: 1–15.
[28] Aruga J, Yokota N, Mikoshiba K. Human SLITRK family genes: genomic organization and expression profiling in normal brain and brain tumor tissue. Gene 2003; 315: 87–94.
[29] Beaubien F, Raja R, Kennedy TE, Fournier AE, Cloutier J-F, Ichtchenko K et al. Slitrk1 is localized to excitatory synapses and promotes their development. Sci Rep 2016; 6: 27343.
[30] Proenca CC, Gao KP, Shmelkov S V, Rafii S, Lee FS, Aruga J et al. Slitrks as emerging candidate genes involved in neuropsychiatric disorders. Trends Neurosci 2011; 34: 143–53.
[31] Huttenlocher, R. P. Synapse elimination and plasticity in developing human cerebral cortex. Am J Ment Defic 1984; 88: 488–496.
[32] Purves D, Lichtman J. Elimination of synapses in the developing nervous system. Science 1980; 210: 153–157.
[33] Yazici E, Bursalioglu FS, Aydin N, Yazici AB. Menarche, puberty and psychiatric disorders. Gynecol Endocrinol 2013; 29: 1055–1058.
[34] Saugstad LF. Age at puberty and mental illness. Towards a neurodevelopmental aetiology of Kraepelin’s endogenous psychoses. Br J Psychiatry 1989; 155: 536–544.
[35] Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 2015; 4: 7.
[36] Sheskin D. The binomial sign test for a single sample. In: Handbook of Parametric and Nonparametric Statistical Procedures. Taylor & Francis Group: Boca Raton, 2007, pp 289–311.
[37] Bulik-Sullivan BK, Loh P-R, Finucane HK, Ripke S, Yang J, Patterson N et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 2015; 47: 291–295.
[38] Cross-Disorder Group of the Psychiatric Genomics Consortium. Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. Lancet 2013; 381: 1371–1379.
[39] Pers TH, Karjalainen JM, Chan Y, Westra H-J, Wood AR, Yang J et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat Commun 2015; 6: 5890.
[40] Frey BJ, Dueck D. Clustering by passing messages between data points. Science 2007; 315.

Posted March 08, 2017.
