Abstract
Autism is a heterogeneous collection of disorders with a complex molecular underpinning. Evidence from post mortem studies using adult brains has identified atypical co-expression of genes which may occur during the prenatal period. Recent studies using induced pluripotent stem cells (iPSCs) generated from autistic individuals have suggested that prenatal development is a critical period for the emergence of pathophysiology associated with this condition. However, how early during development these differences emerge and whether such alterations can be seen across the development of multiple brain regions is unclear. In this study we investigated whether early prenatal stages of neurodevelopment differ between iPSCs generated from typically developing and autistic individuals. We specifically selected autistic individuals unrelated to each other with a heterogeneous genetic background and no known comorbidities, to probe for common molecular phenotypes. Differentiation of iPSCs towards a cortical lineage revealed abnormal cell fate acquisition from an early stage of development. Interestingly, abnormal differentiation occurred in the absence of alteration in cell proliferation during cortical differentiation, differing from previous studies. Moreover, these effects appeared specific for the acquisition of a cortical fate, as differentiation of iPSCs towards a midbrain lineage was not accompanied by differences in neurogenesis between typically developing and autism iPSC lines. RNA-sequencing on a subset of our cohort further revealed autism-specific signatures during cortical differentiation similar to those observed in post mortem studies, indicating a potential common biological mechanism. Together, these data suggest that unique developmental differences associated with autism may be established at an early prenatal stage.
Introduction
Autism spectrum conditions (henceforth autism) are a genetically heterogeneous spectrum of neurodevelopmental conditions1–3 diagnosed on the basis of impaired social-communication, alongside unusually narrow and repetitive interests and activities4. Autism is known to affect several brain regions such as the primary sensory cortex, association and frontal cortex, and parietal-occipital circuits5, 6, as well as the medial prefrontal cortex, superior temporal sulcus, temporoparietal junction, amygdala, and fusiform gyrus7, 8. Based on clinical criteria, autism is classified into syndromic and non-syndromic forms. Individuals carrying single gene mutations, copy number variations and/or chromosomal abnormalities, in addition to an autism diagnosis, are usually classified as ‘syndromic’9. Non-syndromic autism is characterized by individuals with a primary diagnosis of autism that is not associated with a single mutation in a high risk genetic variant9. The symptoms of autism cannot be detected until twelve to eighteen months of age10, and apart from elevated steroidogenic activity in amniotic fluid previously reported by our group11, 12 there is a paucity of prenatal markers reliably associated with an autism diagnosis. This has resulted in the understanding that the autistic phenotype might start to appear after birth. However, an increasing number of pre-clinical studies indicate that perturbation during critical periods of development may be key for the emergence of autism13. Moreover, autism post mortem studies using adult brains have identified putative prenatal gene expression pathways14, further suggesting that abnormal molecular and cellular events during gestation may contribute to autism.
Recent methodological advances in the field of induced pluripotent stem cell (iPSC) technology15–20 have made it possible to study prenatal cellular behaviour in autism in detail, something that was not possible using post mortem brains. Like embryonic stem cells, iPSCs are able to generate any tissue of the body, and they can be differentiated into neurons of various lineages15–17, 19, 20. As these neurons contain the same genetic information as the individual from whom they were derived, whether typically developing or autistic, their cellular behaviour is influenced by that individual's genetic background. Using these methods, studies have shown significant anomalies in cellular/molecular behaviour during prenatal-equivalent periods of development in autistic individuals21–23. One such study generated iPSCs from autistic individuals who were comorbid for macrocephaly, and demonstrated: (1) atypical cortical differentiation and increased cell proliferation of cortical neural precursor cells (NPCs) from iPSCs, and (2) an imbalance in excitatory (glutamate-producing) and inhibitory (GABA-producing) receptor activity21. More recently, using the same collection of iPSCs, an acceleration in neuronal maturation was found to be dependent on early cortical NPC development, as circumventing this stage of cortical development by direct conversion of iPSCs into maturing neurons did not recapitulate these effects23. In another study, 3D neural cultures were derived from iPSCs generated from autistic participants also with macrocephaly; here an overproduction of GABAergic neurons was observed22. From these studies it is becoming clear that cellular and molecular phenotypes associated with autism likely start before birth, and possibly at a very early stage of brain development23, and that some phenotypes may be associated with macrocephaly.
The main phenotypes that appeared in these studies suggested an acceleration in neuronal maturation as well as anomalies in excitatory and inhibitory neuron populations and function. This latter phenotype has also been independently reported using magnetic resonance spectroscopy (MRS) of autistic individuals, where abnormalities in levels of excitatory glutamate and inhibitory GABA metabolites were found24, 25. However, it is of note that in these iPSC studies atypical neural differentiation in autism was associated with higher cell proliferation21–23. As autistic participants in these studies also had macrocephaly, it is yet to be determined whether the observed abnormal development is due to this comorbidity. Nevertheless, altered gene expression network dynamics were also identified at early stages of development in these iPSCs23. However, as macrocephaly is present only in a subset of autistic individuals26, it is not known if abnormal development can be generalised to atypically developing individuals without macrocephaly. Moreover, whether the alterations in gene expression networks at an early stage are also accompanied by a difference in precursor population composition is unknown. Finally, as these studies have predominantly focused on the development of forebrain/cortical neurons, it is yet to be tested whether abnormal development can also be observed in neural precursors fated towards a different lineage.
In this study, we have selected a cohort of autistic individuals without macrocephaly and with heterogeneous genetic backgrounds to represent the wider autism population. In this collection of iPSCs, we have observed atypical cortical differentiation in autism lines, including a divergent cellular identity starting from early stages of development (equivalent to neural tube formation at ∼4 weeks gestation). Interestingly autism iPSCs did not display any abnormalities in cell proliferation. Furthermore, iPSCs were not compromised in their ability to generate precursors fated to become midbrain neural cells. Finally, we investigated whether atypical cortical differentiation was autism-specific by comparing it with observations made in post mortem studies, and further checked that the phenotypes we observed were not a result of non-specific effects.
Study participants, Materials and Methods
Induced pluripotent stem cells
iPSCs were produced from keratinocytes from plucked hair follicles using a previously established protocol27 (Supplementary Figure S1). Nine autistic individuals and six typically developing individuals were chosen (Supplementary Table S1, S2). The autistic group comprised six non-syndromic autistic individuals (ASDM1, 004ASM, 010ASM, 026ASM, 132ASM, 289ASM), one individual with 3p deletion syndrome (245ASM) and two autistic individuals with a mutation in the NRXN1 gene (109NXM, 092NXF). All autistic individuals had a primary diagnosis of autism. For clinical diagnosis of autistic individuals included in our RNA-sequencing based gene network analyses, see Table S1. Participants were recruited for this study under approval by NHS Research Ethics Committee (REC No 13/LO/1218); informed consent and methods were carried out in accordance with REC No 13/LO/1218.
Neuronal differentiation
We differentiated iPSC lines into cortical neurons using a well-established method based on dual SMAD inhibition; this results in the recapitulation of key hallmarks of corticogenesis and the generation of cortical neurons17, 28. iPSCs were differentiated till early neuron stage – day 35 (Figure 1A) (see extended experimental procedures). iPSCs were also differentiated to midbrain floorplate precursors using established protocols19, 20.
Immunocytochemistry
Cortical differentiation of all aforementioned autism and control iPSCs was characterised using immunocytochemistry. iPSCs were differentiated towards a cortical lineage till day 9, day 21 and day 35 and labelled with antibodies against markers appropriate to each developmental stage (see extended experimental procedures). iPSCs were differentiated towards a midbrain floor plate lineage till day 11. Nuclei were stained using DAPI, and imaging was performed using a 40× objective on a Leica SP5 confocal microscope (Figure 1B). High throughput confocal imaging was performed at day 9, day 21 and day 35 of cortical differentiation and at day 11 of midbrain differentiation on the Opera Phenix High-Content Screening System (Perkin Elmer), and cell type analysis was performed using the Harmony High Content Imaging and Analysis Software (Perkin Elmer).
EdU labelling
iPSCs at day 0, day 9, day 21 and day 35 stages of differentiation were labelled with EdU (5-ethynyl-2’-deoxyuridine) using the Click-iT EdU Assay (Invitrogen). Cells were incubated with EdU for 4 hours at 37°C, then for an additional 4 hours in EdU-free media. After incubation, labelled cells were fixed and prepared for detection using the Click-iT reaction cocktail. Nuclei were labelled using Hoechst 33342. EdU was detected with Alexa Fluor 488. The number of EdU-labelled cells was recorded as a percentage of the total number of live nuclei. Imaging and analysis were done using the Opera Phenix High-Content Screening System (Perkin Elmer) and the Harmony High Content Imaging and Analysis Software (Perkin Elmer) respectively.
Statistics
Quantification of cell types was performed using the Harmony High Content Imaging and Analysis Software (Perkin Elmer). The percentage of cells positive for the desired marker (probed for Pax6, Tuj1, EMX1 or Gad67) versus the total number of live cells (probed using DAPI) was calculated. To take into consideration the variability associated with iPSC differentiation, 8 independent experimental replicates of 2 clones per individual were used at every stage. Care was taken to ensure that fluorescence intensity was calculated only from the known intracellular location of each marker (Pax6, EMX1: nucleus; Tuj1, Gad67: cytoplasm). An appropriate threshold of fluorescence intensity was selected and maintained throughout the experiment for each probe (Supplementary Table S3). An independent two-group t-test was used to test for significant differences between the autism and control groups, with significance set at p ≤ 0.05. One-way ANOVA was performed to investigate in-group variance. All statistical analysis was performed in R statistical software.
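The quantification and group comparison described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study. The function names and example values are hypothetical, and the use of Welch's unequal-variance form of the t-test is an assumption, since the text does not state which form was used. The p-value would then be read from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def percent_positive(marker_positive, live_nuclei):
    """Percentage of live (DAPI-labelled) nuclei that are marker-positive."""
    if live_nuclei <= 0:
        raise ValueError("live_nuclei must be positive")
    return 100.0 * marker_positive / live_nuclei


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two groups."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-participant averages (percent marker-positive cells);
# these are NOT values from the study.
control = [92.0, 88.0, 91.0, 89.0, 90.0, 90.0]
autism = [30.0, 41.0, 28.0, 36.0, 33.0, 35.0, 38.0, 31.0, 34.0]
t_stat, dof = welch_t(control, autism)
```

Averaging replicates within each participant before the test, as in the example values, keeps the participant as the unit of analysis.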
RNA-sequencing
RNA-sequencing was performed on a subset of our cohort, on 2 clones from each participant (ASDM1, 004ASM, 010ASM, CTRM1, CTRM2, CTRM3), and each clone had 2 technical replicates. Starting with 500ng of total RNA, poly(A) containing mRNA was purified and libraries were prepared using TruSeq Stranded mRNA kit (Illumina). Unstranded libraries with a mean fragment size of 150bp (range 100-300bp) were constructed, and underwent 50bp single-end sequencing on an Illumina HiSeq 2500 machine. Reads were mapped to the human genome GRCh37.75 (UCSC version hg19) using STAR: RNA-seq aligner29. Quality control was performed using Picard tools (Broad Institute) and QoRTs30. Gene expression levels were quantified using a union exon model with HTSeq31.
Differential gene expression (DGE)
DGE analysis was performed using R statistical packages32 with gene expression levels adjusted for gene length, library size, and G+C content (henceforth referred to as “Normalized FPKM”). A linear mixed effects model framework was used to assess differential expression in log2(Normalized FPKM). Autism diagnosis was treated as a fixed effect, while also using technical covariates accounting for RNA quality, library preparation, and batch effects as fixed effects in this model.
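The length and library-size parts of this normalisation can be illustrated with the standard FPKM formula. The sketch below is a Python illustration only, not the R code used in the study. It omits the G+C content adjustment entirely, and the function names and the pseudocount added before taking the logarithm are assumptions.

```python
from math import log2


def fpkm(count, gene_length_bp, library_size):
    """Fragments per kilobase of gene per million mapped fragments."""
    if gene_length_bp <= 0 or library_size <= 0:
        raise ValueError("gene length and library size must be positive")
    return count * 1e9 / (gene_length_bp * library_size)


def log2_fpkm(count, gene_length_bp, library_size, pseudocount=1.0):
    """log2 of FPKM; the pseudocount (an assumption) avoids log2(0)."""
    return log2(fpkm(count, gene_length_bp, library_size) + pseudocount)


# Hypothetical gene: 100 fragments, 1 kb long, library of one million fragments.
example = log2_fpkm(100, 1000, 1_000_000)
```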
Weighted gene coexpression network analysis
The R package weighted gene coexpression network analysis (WGCNA)33 was used to construct coexpression networks as previously shown14. Biweight midcorrelation was used to assess correlations between log2(Normalized FPKM). For module-trait analysis, 1st principal component of each module (eigengene) was related to an autism diagnosis in a linear mixed effects framework as above, replacing the expression values of each gene with the eigengene.
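Biweight midcorrelation is a median-based alternative to Pearson correlation that down-weights outlying samples. The sketch below implements the standard definition in plain Python for two expression vectors. It is an illustration only: the study used the WGCNA R package, and the helper names here are hypothetical.

```python
from math import sqrt
from statistics import median


def _weighted_deviations(values):
    """Deviations from the median, scaled by Tukey biweight weights."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        raise ValueError("median absolute deviation is zero")
    terms = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        # Observations with |u| >= 1 are treated as outliers and get weight 0.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        terms.append((v - med) * w)
    return terms


def bicor(x, y):
    """Biweight midcorrelation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    a, b = _weighted_deviations(x), _weighted_deviations(y)
    numerator = sum(p * q for p, q in zip(a, b))
    denominator = sqrt(sum(p * p for p in a)) * sqrt(sum(q * q for q in b))
    return numerator / denominator
```

Because samples far from the median receive zero weight, a single aberrant sample has less influence on the resulting network than it would under Pearson correlation.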
Gene sets
A SFARI autism associated gene set was compiled using the online SFARI gene database, AutDB, using “Gene Score” as shown previously14. We obtained dev_asdM2, dev_asdM3, dev_asdM13, dev_asdM16 and dev_asdM17 modules from an independent transcriptome analysis study using RNA-sequencing data from post mortem early developing brains14. Modules asdM12 and asdM16 were obtained from an autism post mortem gene expression study34. We obtained another three autism-associated modules: ACP_asdM5, dev_asdM13, ACP_asdM14 from an independent gene expression study profiling dysregulated cortical patterning genes in autism post mortem brain35. These gene sets were used to undertake enrichment analysis. Gene modules from schizophrenia post mortem studies were used as non-autism neuropsychiatric gene modules36, 37. All previously published gene modules were established using WGCNA. The Cancer Gene Census (CGC) was used to establish non-neuropsychiatric gene enrichment.
Gene set overrepresentation analysis
Enrichment analyses were performed with logistic regression. All GO enrichment analysis to characterize gene modules was performed using GO Elite38 with 10,000 permutations. Molecular function and biological process terms were used for display purposes.
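For intuition about the odds ratios reported for gene set enrichment: when a logistic regression contains a single binary predictor and no covariates, its fitted odds ratio equals the cross-product ratio of the 2×2 table of module membership against gene set membership. The Python sketch below computes that ratio. It is an illustration under that simplifying assumption, not the study's model, which may include covariates; the function name and gene identifiers are hypothetical.

```python
def enrichment_odds_ratio(module, gene_set, background):
    """Cross-product odds ratio of gene-set membership inside vs outside a module."""
    background = set(background)
    module = set(module) & background      # restrict to the tested universe
    gene_set = set(gene_set) & background
    a = len(module & gene_set)                 # in module, in gene set
    b = len(module - gene_set)                 # in module, not in gene set
    c = len(gene_set - module)                 # not in module, in gene set
    d = len(background - module - gene_set)    # in neither
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: empty off-diagonal cell")
    return (a * d) / (b * c)


# Hypothetical identifiers, for illustration only.
universe = {f"g{i}" for i in range(20)}
module_genes = {"g0", "g1", "g2", "g3", "g4"}
set_genes = {"g0", "g1", "g2", "g5", "g6"}
odds_ratio = enrichment_odds_ratio(module_genes, set_genes, universe)
```

An odds ratio above 1 indicates that the gene set is over-represented in the module relative to the background.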
Code availability
Computer code used in our analyses is available from the authors upon request.
Results
Participant overview
Of the nine autistic participants in this study, eight were male and one was female (Supplementary Table S1). Six participants were diagnosed as having non-syndromic autism, while three participants were diagnosed with syndromic autism. All the autistic individuals had genetically heterogeneous backgrounds and were also genetically unrelated. Two non-syndromic participants had deletion-type CNVs in the 1p21.3 and 8q21.12 regions respectively, with DPYD and PTBP2 being autism-associated genes affected in the former, while the latter also had a deletion in the AXL gene (Supplementary Table S2). We also detected a stop-gain mutation in the SHANK3 gene of another non-syndromic participant from exome analysis (Supplementary Table S2). Of the three syndromic participants, two had deletion-type CNVs in the 2p16.3 region, affecting the NRXN1 gene, a well-established autism-associated gene, while the third carried a deletion in the 3p chromosomal region, which includes the CONTACTIN4 gene (CNTN4)39 (Supplementary Table S2). iPSCs were generated from follicular keratinocytes from each participant and reprogrammed using established methods18.
Autism iPSCs diverge from typical development at precursor cell stages during cortical differentiation
Multiple studies have implicated anomalies in cortical regions associated with autism5–8. Adult post-mortem brain studies have indicated altered cortical development14, 34, 35. Studies using iPSCs from autistic individuals with macrocephaly have suggested atypical cortical differentiation in autism associated with altered cell proliferation21, 22, 40. As the cerebral cortex is the most strongly identified region of the brain affected in autism, we differentiated our iPSCs towards a cortical lineage using a previously established protocol17, 28. This differentiation protocol recapitulates formation of neuroectoderm and directs neural precursors to form primarily dorsal forebrain cells17, 28. As we were also interested in determining divergence between control and autism iPSCs, we took advantage of this protocol to focus on three distinct developmental stages (Figure 1A): (1) Day 9: early neural precursor stage, when stem cells form new precursor cells which self-organise into neural tube-like structures known as neural rosettes with a directional apical-basal arrangement; (2) Day 21: late neural precursor stage, a period during which neural precursor cells begin forming layers from the apical surface and are primed for differentiation into neurons as they move outwards; and (3) Day 35: immature cortical neurons, a stage at which precursors become post-mitotic and adopt a deep layer neuronal identity (Figure 1B).
Although there is evidence of divergence between control and autism iPSCs during development, it has been suggested that this divergence was a consequence of altered cell proliferation associated with macrocephaly21. In order to determine if these divergences still hold true in the wider autism spectrum, we selected a cohort with a heterogeneous genetic background and without macrocephaly. First, we asked whether iPSCs from control and autistic participants displayed differences in the expression of the neuronal differentiation markers Pax6 and Tuj1 in early and late neural precursor cell stages. Pax6 is a robust marker for neural precursors of cortical lineage41, while Tuj1 is a robust pan-neuronal and neural precursor marker42. It is to be noted that the cortical differentiation protocol used in this study produces variability in the expression of neuronal markers. This phenomenon has been associated with stochastic fluctuations in activation of key transcription factors during cortical differentiation43. To account for this, we generated 8 independent experimental replicates from 2 clones per participant at every stage and recorded the average effect for each participant. In control precursor cells on day 9, Pax6 (Pax6 Control: 90%) and Tuj1 (Tuj1 Control: 59%) were expressed in the majority of cells. On day 21, both markers remained highly expressed (Pax6; Control: 91%, Tuj1; Control: 78%), which was consistent with cortical differentiation from control iPSCs (Figure 1C-E, Table 1). In contrast, examination of autism precursor cells on day 9 revealed significantly lower Pax6 and Tuj1 expression compared to controls (Pax6 Autism: 34%; Tuj1 Autism: 20%; p≤0.05) (Figure 1C-E, Table 1). Moreover, at day 21, Pax6 and Tuj1 expression in autism precursor cells was still significantly lower than in controls; however, the extent to which these cells differed from control precursors was not as pronounced as at day 9 (Pax6; Control: 91%, Autism: 72%; p≤0.05. Tuj1; Control: 78%, Autism: 64%; p≤0.05) (Figure 1C-E, Table 1). As expected, variability was observed throughout the differentiation protocol between experimental replicates. However, this variability was more pronounced in the autism iPSCs. ANOVA revealed a greater overall spread of data points and higher F-values in a majority of observed parameters during autism iPSC differentiation (Supplementary Figure S2B, S2C). Taken together, these data suggest that autism and control iPSCs show divergent neural cell identity during cortical differentiation, which manifests during early stages but remains significant even at later stages of differentiation.
Control and autism iPSCs show similar proliferative capacity during cortical differentiation and negligible differences while generating midbrain floorplate precursor cells
Altered proliferation of neural precursors associated with macrocephaly in autistic individuals has been linked to divergent cortical differentiation in autism iPSCs21, 22, 40. To investigate if the differences we observed were associated with atypical cell proliferation, we labelled44 developing control and autism iPSCs with EdU (5-ethynyl-2’-deoxyuridine; a nucleoside analogue of thymidine) and examined the rate of proliferation during cortical differentiation. Specifically, we examined proliferation in iPSCs (day 0), early (day 9) and late (day 21) precursors as well as in day 35 immature neurons (Figure 2A). Control iPSCs displayed an overall reduction in cell proliferation as their mitotic capacity was reduced during differentiation, a phenomenon predicted to occur during the process of cellular differentiation from stem cells45 (Figure 2A). Intriguingly, the proliferative capacity of autism iPSCs was similarly reduced at each stage of cortical differentiation examined (Figure 2A). These data indicate that the observed differences in the expression of Pax6 and Tuj1 were not due to altered levels of proliferation between control and autism iPSCs.
Next, we wanted to rule out the possibility that the cellular identity differences observed during cortical differentiation were due to an inherent abnormality associated with our autism iPSCs. We reasoned that no significant differences would be observed between control and autism iPSCs when differentiated towards a midbrain neuronal lineage, a region of the brain not known to be directly linked to the cognitive deficits associated with autism46, 47. Therefore, we differentiated control and autism iPSCs towards a midbrain floorplate lineage using established protocols19, 20. Here we differentiated iPSCs for 11 days, a stage at which the majority of cells take on a midbrain floorplate precursor (mFPP) identity. Interestingly, nearly 100% of mFPPs from both control and autism iPSCs were positive for LMX1A, an essential transcription factor required for defining a midbrain identity48 (Figure 2B, C). Similarly, expression of the transcription factor FOXA2, which positively regulates neurogenic factors in dopaminergic precursor cells49, also did not differ significantly between control and autism mFPPs. Taken together, these data indicated that autism iPSCs did not show midbrain developmental differences, and that their atypical behaviour was specific to cortical differentiation.
As our data indicate that autism iPSCs show an abnormality in neural identity specifically of a cortical lineage, we wanted to know if there was a difference in the capacity of these iPSCs to generate neurons. We differentiated the iPSCs up until day 35, when immature cortical neurons start to appear. Cells at this stage primarily express TBR1, a transcription factor expressed in early born neurons17, 50. In our study, we found similar levels of TBR1 positive cells in both control and autism groups (Figure 2D, E). In addition, cells from the autism group did not demonstrate greater variability in TBR1 expression compared to the controls (F-values: control = 53.68, autism = 30.08; p<0.05). These data suggest that whilst autism precursor cells displayed differences in cortical-fate proteins such as Pax6 and Tuj1, this did not translate into significant differences in their capacity to differentiate into immature neurons.
Altered development of forebrain precursor lineages in autism iPSCs
Early differences in cortical fate protein expression did not appear to affect cellular identity of immature neurons from autism iPSCs. Thus, while cortical differentiation generated precursors of the cortical lineage from both groups, we wanted to find out whether the precursors had identical neuronal identities or if they generated different cortical subtypes. We were particularly interested in investigating the development of the two major neuronal subtypes, namely dorsal forebrain and ventral forebrain neurons, which are known to give rise to excitatory and inhibitory neurons respectively in the mature state51, 52. Specifically, dorsal forebrain precursors are known to give rise to cortical excitatory, glutamatergic neurons whereas ventral forebrain precursors give rise to inhibitory GABAergic neurons51, 52. The appearance of excitatory-inhibitory imbalance is a widely reported phenotype associated with autism24, 25. Studies by both Mariani et al., (2015) and Marchetto et al., (2017) have also demonstrated an imbalance in the ratio of excitatory and inhibitory neural precursors and neurons in 2D and 3D cultures differentiated from iPSCs from autistic individuals. However, in their studies the autistic participants also had macrocephaly and this imbalance was attributed to alterations in proliferative capacity associated with macrocephaly. As we did not observe an alteration in the proliferation of neural precursors from our autism iPSCs, we hypothesised that the appearance of excitatory-inhibitory imbalance could also be a result of an early divergence in precursor development. We looked at Emx1 and Gad67 expression in neural precursors at day 9 and day 21, and immature neurons at day 35 of differentiation (Figure 3, Table 1). Emx1 is expressed in dorsal forebrain precursors52–54, while Gad67 is the rate limiting enzyme in the GABA synthesis pathway and is expressed in ventral forebrain precursors55, 56.
At day 9, Emx1 expression was significantly higher in control compared to autism neural precursors, even though it was highly expressed in both controls as well as autism precursors (Emx1; Control: 95%, Autism: 80%; p≤0.05) (Figure 3B, C). At day 21, Emx1 expression in both groups appeared to remain stable, with only a slight reduction in control precursors compared to day 9. At this stage in the control group expression of Emx1 was still significantly higher (Emx1; Control: 90%, Autism: 81%; p≤0.05) (Figure 3B, C). In day 35 immature neurons, Emx1 expression in both control and autism neurons was reduced compared to day 9 and day 21 precursors; however the reduction was significantly more acute in the autism group (Emx1; Control: 72%, Autism: 50%; p≤0.05) (Figure 3B, C). Gad67 expression in autism and controls followed a different trajectory. At day 9, the expression of Gad67 was significantly higher in the control precursors, while there was negligible expression in the autism precursors (Gad67; Control: 35%, Autism: 4%; p≤0.05) (Figure 3B, C). At day 21, there was reduction of Gad67 expression in the control group, but significant increase of Gad67 in the autism group (Gad67; Control: 26%, Autism: 27%; p>0.05) (Figure 3B, C). At this stage, both control and autism precursors had similar Gad67 expression with no statistically significant differences between groups. By day 35, Gad67 expression in autism precursors overtook Gad67 expression in controls, and its expression in control precursors was further reduced (Gad67; Control: 17%, Autism: 48%; p>0.05) (Figure 3B, C). Similar to what we observed with Pax6 and Tuj1 expressing cells, Emx1 and Gad67 expression also showed conspicuous variability. Again, ANOVA revealed greater variability in majority of the parameters in the autism group (Supplementary Figure S2A, S2C). 
When syndromic and non-syndromic autistic individuals were analysed separately, both groups still appeared to follow similar trends when compared to the control group (Supplementary Figure S3). Taken together, these data showed significant differences in the determination of neuronal subtype identity of cortical lineage between control and autism iPSCs. It was evident that autism precursor cells had a greater propensity to differentiate into neural cells of ventral forebrain lineage compared to controls, and that this effect was not a result of atypical cell proliferation. Overall, these data also showed increased variability in the expression of key cortical differentiation genes in autism iPSCs. This phenomenon was considerably reduced during midbrain differentiation. However, the variability did not appreciably affect the mean group differences between control and autism (Figure 1E, 3C).
Neurodevelopmental gene expression signatures in autism-iPSC-derived neurons
Based on our cellular analyses, we found two distinct phenotypes associated with cortical differentiation of autism iPSCs: (1) they show atypical neural differentiation of cortical lineage cells, (2) atypical differentiation into cortical precursor subtypes of dorsal forebrain and ventral forebrain fates. To assess whether the observed atypical neural differentiation was specific to autism, we undertook RNA-sequencing on day 35 cortically differentiated immature neurons derived from a subset of our autism- and control-iPSC cohort. Our aim was to identify global gene expression patterns and compare them with gene expression patterns reported in post-mortem brains. We undertook RNA-sequencing, then identified gene co-expression pathways which were then used to estimate preservation in gene co-expression pathways in post-mortem brains. This gave us a reliable estimate of autism-specific gene expression in our iPSC-derived neurons and showed recapitulation of gene expression patterns in iPSC-derived neural cells. To our knowledge, a similar comparison had not been conducted before. We ensured that our analysis pipeline was consistent with those used to investigate transcriptomic changes in autism post-mortem brain studies (Figure 4A, Figure S4). Using these analysis methods, we established: (i) networks of genes dysregulated in autism iPSC-immature neurons and, (ii) the degree to which gene networks were dysregulated in post-mortem samples. Differential expression from RNA-sequencing data revealed control and autism samples clustering hierarchically into two separate groups based on gene expression patterns and principal component analysis (Figure 4B, Supplementary Figure S5A, S5B). To identify co-expressed genes, signed weighted gene coexpression network analysis (WGCNA) was performed. We identified 11 co-expression modules significantly associated with autism (labelled according to R-assigned colours, e.g., salmon, Figure 4C). 
We ranked the modules according to their module eigengene values (ME; the first principal component of the module) (Figure 4C, D, Supplementary Figure S6A). Of the 11 modules, 5 modules were positively correlated, and 6 modules were negatively correlated in autism-iPSC immature neurons. The modules were assigned consensus functions based on gene ontology (GO) terms (Supplementary Figure S7A). The top 3 positively correlated modules having higher MEs in autism were ‘steelblue’ (Cellular Metabolic Processes), ‘lightgreen’ (Neural Development) and ‘white’ (Immune Activation), and the top 3 negatively correlated modules having lower MEs in autism were ‘grey60’ (Chromosome Organisation), ‘salmon’ (Epigenetic Regulation) and ‘sienna3’ (Gene Regulation) (Figure 4E, F). The ‘steelblue’ module was enriched for GO terms for metabolic functions associated with atypical cell proliferation (Figure 4G). The ‘lightgreen’ module was enriched for neural development functions including regulation of cell-cell adhesion, cognition, calcium mediated signalling and regulation of dendrite maturation. The ‘white’ module was enriched for non-neuronal functions such as cytokine binding, regulation of DNA damage response, positive regulation of apoptosis and negative regulation of neuronal death (Figure 4I). The ‘salmon’ module was enriched for epigenetic gene regulatory functions such as RNA methyltransferase activity and S-adenosylmethionine-dependent methyltransferase activity (Figure 4J). Finally, the ‘sienna3’ module was enriched for cellular gene regulatory functions such as nucleic acid binding, regulation of RNA metabolic process and regulation of gene expression (Figure 4K), while the ‘grey60’ module was enriched for chromosome organisation functions such as regulation of histone H3-K4 methylation, DNA binding and chromosome organisation (Figure 4L).
Gene co-expression modules in autism-iPSC immature neurons show significant correlation to co-expression modules from autism post-mortem brains
We used autism post-mortem gene modules from three studies to test autism gene network preservation in iPSC neurons14, 34, 35. Additionally, we used a set of 155 autism-associated candidate genes, drawn from the Simons Foundation Autism Research Initiative (SFARI) database in a previous study14, to identify the effect of high-impact autism-associated genes in our modules. The SFARI list of autism genes is a database of genes collated according to the type of genetic variation from whole genome sequencing studies, rare genetic mutations and mutations causing syndromic forms of autism. It was first published in 2009 57 and an up-to-date reference for all known associated genes can be found at: https://gene.sfari.org/autdb/HG_Home.do. We mapped the SFARI autism-associated genes to our gene modules and found them to be enriched in the negatively correlated ‘salmon’ (epigenetic gene regulation) module (p = 0.002; odds ratio [OR] = 1.5) (Figure 5A). High-impact SFARI genes such as cytoskeletal ANK2 (gene score: 1) and regulatory ARID1B (gene score: 1; syndromic) were negatively enriched in this module and could be potential loss-of-function risk genes in our study. We then mapped 5 autism-associated developmental gene modules dysregulated in post mortem brains (APMB), from Parikshak et al., (2013) (dev_asdM2, dev_asdM3, dev_asdM13, dev_asdM16, dev_asdM17)14, shown in Figure 5A. Of these 5 sets, dev_asdM2 and dev_asdM3 were downregulated in autism and represent DNA-binding and transcriptional regulation, while dev_asdM13, dev_asdM16 and dev_asdM17 were upregulated in autism and represent later phase neuronal functions and development of synaptic structure. The downregulated dev_asdM2 set was enriched in the top downregulated genes (‘Top –ve DE’, p = 2×10⁻⁴; OR = 1.8), as well as in ‘grey60’ (p = 0.004; OR = 1.6) and ‘sienna3’ (p = 10⁻⁵; OR = 2), which were downregulated modules.
Similarly, the dev_asdM3 set was enriched in the top downregulated genes (‘Top –ve DE’, p = 0.008; OR = 1.5), and in the ‘grey60’ (p = 3×10⁻¹⁴; OR = 2.5) and ‘sienna3’ (p = 4×10⁻⁴; OR = 1.7) modules. The upregulated dev_asdM13 set was enriched in the top upregulated genes (‘Top +ve DE’, p = 10⁻⁶; OR = 2.1), and in ‘lightgreen’ (p = 3×10⁻⁹; OR = 3.1) and ‘white’ (p = 10⁻⁶; OR = 2.3), which were upregulated modules. Similarly, the dev_asdM16 set was enriched in the ‘lightgreen’ module (p = 10⁻⁴; OR = 2.6), and the dev_asdM17 set was enriched in the top upregulated genes (‘Top +ve DE’, p = 0.002; OR = 1.7) and the ‘lightgreen’ module (p = 0.002; OR = 1.9). We then mapped two gene modules upregulated in the temporal and frontal cortex of the adult autism brain – APMB_asdM12 (a synaptic function module) and APMB_asdM16 (an immune module) from Voineagu et al., (2011) 34 (Figure 5A). The APMB_asdM12 module was enriched in the upregulated ‘white’ module (p = 0.04; OR = 1.8), while the APMB_asdM16 module was enriched in the top upregulated genes (‘Top +ve DE’, p = 5×10⁻⁶; OR = 2.6) and the ‘white’ module (p = 6×10⁻⁵; OR = 2.7) (Figure 5A). Gene sets associated with attenuated cortical patterning (ACP) 35 were also mapped (Figure 5A) and suggested greater prediction of ACP in autism iPSC neural cells. One additional aspect of this analysis was that it indicated the specificity of our module enrichment. The downregulated modules were enriched only in downregulated post-mortem modules, while the upregulated modules were enriched only in the upregulated post-mortem modules. This mutually exclusive clustering of enrichment suggested recapitulation of autism-associated gene expression pathways in our cohort (Figure 5A). We found further evidence of the specificity of our gene expression modules: when they were correlated to schizophrenia and cancer gene modules, enrichment was poor and non-specific (Figure 5B).
A module preservation analysis also revealed moderate to high preservation of our gene modules in similarly labelled gene modules from autism iPSC studies21, 22 (Supplementary Figure S7B). However, we did not find significant correlation of differential gene expression with single nucleotide polymorphism (SNP) load in the exomes of the individual participants (Supplementary Figure S8A, S8B).
Discussion
The overarching aim of this study was to identify cellular identities during early stages of cortical development from control and autism iPSCs, and to investigate the specificity of these phenotypes to autism. We wanted to study this in a heterogeneous cohort of autistic individuals who did not have macrocephaly. To this end, we selected a cohort comprising 6 autistic individuals with uncharacterised genetic background and 3 autistic individuals with known CNVs in high autism risk loci. First, we found that during cortical differentiation control and autism iPSCs showed significant differences in the development of early neural precursor cells. This effect persisted at a late precursor cell stage, although to a lesser degree. Interestingly, we did not find proliferative capacities between control and autism to be significantly altered. This observation was contrary to previous reports that atypical proliferation was the primary cause for altered neurogenesis from autism iPSCs21–23. This also suggested that a potentially different mechanism was involved in producing the divergent precursor populations. When looking at cortical neuron subtypes, we found a divergence in the development of dorsal forebrain or excitatory precursors and ventral forebrain or inhibitory precursors from an early stage of development. On comparing transcriptomes between control and autism cortical neurons, we found that transcriptomic signatures in autism iPSC-derived neurons were highly correlated with those identified in post mortem brain studies. This showed that autism iPSCs, when cortically differentiated, were significantly different from control iPSCs, and that based on gene expression signatures, these differences were specific to autism.
Landmark autism iPSC studies have demonstrated an imbalance in excitatory versus inhibitory neural signalling in neurons21 and cerebral organoids22. More recent work using these autism lines has highlighted early neurodevelopment as being key for the convergence of molecular pathologies in autism23. Building on this body of evidence, we examined whether there were differences in cell identity during early neural differentiation, starting from a stage equivalent to neural tube formation. When iPSCs were differentiated towards a cortical fate, we found significant differences in the number of cells expressing Pax6 and Tuj1 in autism compared to controls, with both Pax6 and Tuj1 levels being lower in autism neural precursors. As Pax6 is an essential transcription factor that determines the cortical identity of neural precursors58, fewer Pax6-expressing cells from autism iPSCs may signify divergent differentiating behaviour. Fewer Tuj1-expressing cells were also observed during autism cortical differentiation, and as Tuj1 is expressed in all neurons and precursors this was consistent with the Pax6 expression. The differences in Pax6 and Tuj1 expression were more pronounced at the early precursor stage (day 9) than at the later precursor stage (day 21), when Pax6 and Tuj1 levels in autism cells appeared to be ‘catching up’. Interestingly, at the day 35 immature neuron stage, there were negligible differences between control and autism neurons expressing TBR1, a transcription factor expressed in early-born post-mitotic neurons17, 59. This implies that despite differences in the expression of Pax6 and Tuj1 in early and late precursor cells, there appears to be no alteration in the number of cortical neurons being generated during differentiation.
The iPSC studies that showed altered neurodevelopment to be associated with autism selected autistic participants based on a comorbid diagnosis of macrocephaly21–23, 40. According to these studies, increased cell proliferation associated with macrocephaly was the cause of dysregulated neurodevelopment and GABAergic versus glutamatergic neuron balance21, 22. However, not all autistic individuals have macrocephaly, and we wanted to study a more heterogeneous cohort of autistic individuals to find out if the assertions made in previous studies still held true in the wider autistic population. As a result, we did not use macrocephaly as a criterion for recruitment. As expected, we did not find any differences in proliferative capacities between control and autism iPSCs in our study. This suggested that divergent cortical differentiation could be a result of altered cell fate rather than altered proliferation of certain cell types in autism.
Because Pax6 and Tuj1 expression in differentiating autism cells was ‘catching up’ at day 21, and differences in TBR1 expression at day 35 were negligible, we wanted to investigate whether early differences in cortical differentiation had any effect on the neural subtype identities of cortical precursors. To this end, we looked at the appearance of precursors expressing markers for two major types of cortical neurons, EMX1 (expressed in precursors of the dorsal forebrain, and adult excitatory neurons) and Gad67 (expressed in precursors of the ventral forebrain, and adult inhibitory neurons). We found significant differences between control and autism in cells expressing EMX1 and Gad67. We also found that cells expressing Gad67 decreased over time in controls while increasing in autism. The higher levels of Gad67 cells in autism day 35 immature neurons in our study were similar to the higher levels of Gad1/Gad67 cells observed in autism iPSC-derived organoids22. The atypical alteration in EMX1- and Gad67-expressing cells could have consequences for the ratio of dorsal and ventral forebrain neurons at adulthood, when they have an excitatory and inhibitory function respectively, and could be a possible prenatal origin of the excitatory/inhibitory imbalance phenotype of autism21, 22, 24, 25. As we did not observe any differences in proliferative capacity between control and autism iPSCs, it is possible that the physiological differences associated with altered excitatory-inhibitory cell types were independent of the proliferative capacity of iPSCs, and instead a result of divergent cell fate at early precursor stages.
An important question that arose in our study was whether atypical maturation of autism iPSCs was specific to cortically differentiating cells. To investigate if this was a more general, widespread aberration impacting the development of multiple brain regions, we differentiated control and autism iPSCs into midbrain precursors. The different neuronal lineage of midbrain floor plate precursors (mFPPs) meant iPSCs were exposed to a different set of patterning factors19, 20. This also allowed us to determine whether there was any inherent abnormality in the way our autism iPSCs were responding to patterning factors used in neural differentiation protocols. Surprisingly, differentiating iPSCs into mFPPs revealed negligible differences between control and autism iPSCs. This was consistent with the fact that cortical brain regions such as the prefrontal cortex are primarily affected in autism, causing deficits in cognitive functions such as social skills and communication, and stereotyped and repetitive behaviour, the phenotypes which form the basis for its diagnosis47. Lower brain regions, though partially associated with autism spectrum conditions, have roles that are not as well defined46, 47.
Through our studies we also tried to account for a major caveat of working with iPSCs as a research model, i.e., variability. Indeed, it is important to acknowledge the reported variance in cellular behaviour which is characteristic of cortical differentiation of iPSCs43. Such variance could mask the observed phenotypes or be indicative of an issue with the iPSCs. To account for these possibilities, we took a number of technical steps. First, we used 2 iPSC clones per participant, generated from 9 atypically developing autistic individuals, in all cellular assays to account for both intra- and inter-donor variability. Second, we undertook 8 experimental replicates from each iPSC clone to account for experimental variance due to batch differences. We analysed the variance seen within, in addition to across, all lines, in order to assess whether the cellular phenotypes observed were robust for each participant. We found that overall variance was greater in the autism group than in controls for the majority of cellular parameters measured. As this variance was within the expected variability seen during cortical differentiation, this suggested that the observed cellular phenotypes were associated with autism and not due to variability in the protocol. Importantly, differentiation of the same iPSCs into mFPPs was not accompanied by the increased variability seen during cortical differentiation. To understand the relevance of this variability, this phenomenon will need to be studied further. Interestingly, it has been reported that during cortical differentiation precursor cells pass through a multitude of ‘microstates’, which might be a result of stochastic fluctuations in the activation of key transcription factors43. It would be interesting to determine whether the activity of these transcription factors is altered in autistic iPSCs, introducing abnormalities during the transition between microstates.
The aberrations we observed in our heterogeneous pool of autism iPSCs appeared to suggest a common set of prenatal cellular mechanisms associated with the condition. We then investigated signatures in autism iPSC immature neurons that were specific to autism, and quantified these signatures with the help of those reported in post mortem studies14, 34, 35, which were performed on large cohorts having heterogeneous genetic backgrounds. In our study, we selected a sub-cohort of autistic and control participants to differentiate into neural cells and undertook a gene expression analysis designed similarly to those of the post mortem studies. We first identified both positively and negatively correlated gene modules, then ran enrichment analyses against the positively and negatively correlated gene modules reported in the post-mortem studies. We found positive modules in our study to be enriched in positive modules in post-mortem brains, and negative modules to be enriched in negative modules in post-mortem brains. This meant that gene networks that were dysregulated in autism iPSC-derived immature neurons were also dysregulated in autism post-mortem brains. We found further evidence that our gene expression modules were autism-specific: enrichment analyses with post mortem schizophrenia gene modules and the cancer gene census did not yield disease-specific enrichment with our gene modules.
To conclude, the present study demonstrates significant differences between cortical differentiation of iPSCs from autistic individuals and typical controls. Not only did we find differences in development of precursor cells fated towards cortical neurons, we also found differences in development of dorsal and ventral forebrain precursors and immature neurons. However, no significant differences were found in development of midbrain precursors, indicating that autism associated atypical neural identity was limited to the cortical lineage. In addition, we did not find altered cellular proliferation in autism iPSCs, suggesting atypical neural identity was due to cell fate mechanisms rather than proliferation. To our knowledge, this is the first study to show aberrations specifically in cortical differentiation occurring at a developmental stage as early as neural tube formation (equivalent to ∼4 weeks of gestation). We also observed high variability within experimental replicates during cortical differentiation but not midbrain differentiation, with variability increasing in cortically differentiating autism iPSCs. RNA-sequencing revealed cortical differentiation from autism iPSCs had autism-specific gene expression signatures. Future studies will reveal if the phenotypes we observed are a result of dysregulation of pathways responsible for early cortical development, such as Wnt and Notch signalling21, and will reveal if stabilising these pathways at an early stage of differentiation can help recover autism-associated dorsal/ventral forebrain precursor fates. If precursor fates are responsible for later imbalances in excitatory and inhibitory cell types in the brain, early interventions to stabilise signalling pathways might be able to ameliorate this effect.
Ethics, consent and permissions
Informed consent was obtained from participants before recruitment to the ‘Patient iPSCs for Neurodevelopmental Disorders (PiNDs) study’ (REC No 13/LO/1218).
Consent to publish
We have obtained consent from the participants to publish individual patient data.
Availability of data and materials
Sequence data have been uploaded on synapse.org. Synapse ID: syn8118403, DOI: doi:10.7303/syn8118403
Authors’ contribution
DA, JP, JC, DPS and SBC conceived the study and wrote the first draft. VS and DHG conceived and developed the bioinformatics analysis framework and analysis. DA, RN, LD, PN, CS and KJ were responsible for sample preparation and data analysis. GM was responsible for the ethics application. GM, MAM, JH, IL, DS, EL, DH, FAF and DM were responsible for recruiting participants and collecting hair samples from individuals with autism and controls. DPS and SBC oversaw the running of the project. All co-authors contributed to study concept, design, and writing of the manuscript. All authors read and approved the final manuscript.
Supplementary Info
Supplementary methods
Study participants and neuronal differentiation
Keratinocytes were collected from autistic participants and typical controls without an autism diagnosis (Ethics approved, 13/LO/1218) as part of larger European studies (EU-AIMS, STEMBANCC). All participants were Caucasian, and controls were selected if they did not have a diagnosis of any psychiatric condition. The keratinocytes were reprogrammed into iPSCs using previously described methods 18, 27. iPSCs were cultured in E8 medium (Life Technologies) with E8 supplement (Life Technologies). Cell type quantification and proliferation assays were set up on 96-well plates. 2 clones were selected from each participant, and each clone had 8 technical replicates. RNA-sequencing was performed on 2 clones from each participant, and each clone had 2 technical replicates. This design was maintained at all stages of neural differentiation recorded. Induction of neurons of cortical lineage was established using a modified dual SMADi protocol17. Once the cell culture reached 95% confluence, neural induction was initiated by changing the culture medium to one supporting neural induction, neurogenesis and neuronal differentiation. A combination of N2- and B27-containing media with additives was used, henceforth called ‘neuralising medium’. N2 medium consisted of DMEM/F12 (Sigma) and N2 (Gibco). B27 medium consisted of Neurobasal (Invitrogen) and B27 (Gibco). Neuralising medium was supplemented with ‘dual SMADi’: 1 μM Dorsomorphin (Tocris) and 500 ng/ml human Noggin-CF chimera (R&D Systems) – inhibitors of WNT pathway, BMPs and SMAD, and 10 μM SB431542 (Tocris) – inhibitor of TGFβ signaling. Noggin and dorsomorphin suppress embryonic development, thereby inducing neural differentiation pathways, while SB431542 mediates loss of pluripotency. Midbrain floorplate precursors were differentiated from all participants until day 11 using previously established protocols19, 20.
Immunocytochemistry
Cultures were fixed in 4% formaldehyde followed by ice-cold 100% methanol and processed for immunofluorescence staining, confocal microscopy and high throughput imaging. Secondary antibodies used for primary antibody detection were species-specific Alexa-dye conjugates (Invitrogen). We used primary antibodies against the following: Ki67 (Thermo Fisher PA5-16785), Nestin (Millipore MAB5326), Pax6 (BioLegend 901301), TBR1 (Abcam ab31940), MAP2 (Abcam ab92434), Emx1 (Thermo Fisher PA5-35373), Gad67 (Abcam ab26116), Tuj1 (BioLegend 801201), CD44 (R&D Systems MAB7045), LMX1A (Abcam ab139726), FOXA2 (Invitrogen 701698). Quantification was performed on the Perkin Elmer Harmony Software v4.9, which is based on the CellProfiler high throughput image analysis system60. Cell nuclei were first identified based on DAPI staining. For nuclear proteins, only the nuclear area was selected. For cytoplasmic proteins, the area around the nucleus was selected. Thresholds of fluorescence intensity were selected after background subtraction. The threshold for each probe remained unchanged for every sample imaged. Antibodies, dilutions used and fluorescence threshold information are listed in Supplementary Table S3.
RNA isolation and sequencing
RNA was extracted using TRIzol (Thermo Fisher) and 1-bromo-3-chloropropane (BCP; Sigma). To remove genomic DNA during processing, TURBO DNase (Thermo Fisher) was used. RNA concentration was quantified using the Ribogreen assay (Invitrogen).
Starting with 500 ng of total RNA, poly(A)-containing mRNA was purified and libraries were prepared using the TruSeq Stranded mRNA kit (Illumina). Unstranded libraries were constructed and underwent 50 bp single-end sequencing on an Illumina HiSeq 2500 machine. To analyse iPSC mRNA-seq data, the raw reads were mapped to the human genome GRCh37.75 (UCSC version hg19) using the STAR RNA-seq aligner29. Aligned reads were sorted using samtools61, while biases were removed using Picard tools (Broad Institute). Quality control was performed using Picard tools (Broad Institute) and QoRTs 30 (Supplementary Figure S9). Gene expression levels were quantified using a union exon model with HTSeq 31, which uses uniquely aligned reads. Only genes with >10 reads and expressed in 80% of the samples were kept. The resulting read counts were log2 transformed and normalised for GC content, gene length and library size using the cqn package 62 in R.
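The expression filter described above can be illustrated with a short Python sketch. This is not the pipeline used in the study (which relied on HTSeq and the cqn package in R). The function name, the toy count matrix and the pseudocount of 1 are assumptions made for illustration, and the filter is interpreted here as "more than 10 reads in at least 80% of samples". The GC-content, gene-length and library-size normalisation performed by cqn is not reproduced.

```python
import math

def filter_and_log2(counts, min_reads=10, min_fraction=0.8, pseudocount=1.0):
    """Keep genes with more than `min_reads` reads in at least `min_fraction`
    of samples, then log2-transform the retained counts.

    counts: dict mapping gene name -> list of raw read counts (one per sample).
    The pseudocount is an assumption added to avoid log2(0).
    """
    kept = {}
    for gene, values in counts.items():
        n_pass = sum(1 for v in values if v > min_reads)
        if n_pass / len(values) >= min_fraction:
            kept[gene] = [math.log2(v + pseudocount) for v in values]
    return kept

# Hypothetical counts for three genes across five samples.
toy_counts = {
    "GENE_A": [120, 95, 300, 50, 80],  # above threshold in 5/5 samples
    "GENE_B": [0, 3, 12, 1, 0],        # above threshold in 1/5 samples
    "GENE_C": [15, 20, 11, 9, 30],     # above threshold in 4/5 samples
}
filtered = filter_and_log2(toy_counts)
print(sorted(filtered))
```

In this toy example GENE_B is dropped, while GENE_C is retained because it sits exactly on the 80% boundary.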
mRNA weighted co-expression network analysis
Co-expression network analysis was performed using the R library WGCNA33. We wanted to investigate autism-specific co-expressed genes (or modules) in iPSC-neuronal cultures. Biweight midcorrelations were calculated for all pairs of genes, then a signed similarity matrix was created. In the signed network, the similarity between genes reflects the sign of the correlation of their expression profiles. The signed similarity matrix was then raised to the power β to emphasize strong correlations on an exponential scale. The resulting matrix (known as the adjacency matrix) was then transformed into a topological overlap matrix. Since we were primarily interested in exploring co-expressed genes conserved across our cohort, we created consensus networks correlated to autism as previously published 14, 35. After scaling for each individual network (consensus scaling quantile = 0.2), a soft thresholding power of 14 was chosen (as it was the smallest threshold that resulted in a scale-free R² fit of 0.8) (Supplementary Figure S10). The consensus network was created by using a topological overlap matrix (TOM) to calculate the component-wise minimum values for topological overlap. Using dissTOM = 1 – TOM as the distance measure, genes were hierarchically clustered. Modules were then assigned using a dynamic tree-cutting algorithm (cutreeHybrid, using default parameters except deepSplit = 4, cutHeight = 0.999, minModulesize = 100, dthresh=0.1 and pamStage = FALSE).
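The chain from correlation to signed adjacency to topological overlap can be sketched in Python with NumPy. This is an illustrative sketch only, not the WGCNA implementation used in the study: Pearson correlation is substituted for the biweight midcorrelation, the random expression matrix is a placeholder, and consensus scaling, clustering and tree cutting are not shown. The function names are hypothetical.

```python
import numpy as np

def signed_adjacency(expr, beta=14):
    """expr: samples x genes matrix. Returns the signed adjacency matrix.

    Assumption: Pearson correlation stands in for biweight midcorrelation.
    """
    cor = np.corrcoef(expr, rowvar=False)
    similarity = (1.0 + cor) / 2.0   # signed similarity, in [0, 1]
    return similarity ** beta        # soft thresholding with power beta

def topological_overlap(adj):
    """TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    a = adj.copy()
    np.fill_diagonal(a, 0.0)
    shared = a @ a                   # l_ij: connection strength via shared neighbours
    k = a.sum(axis=1)                # connectivity of each gene
    k_min = np.minimum.outer(k, k)
    tom = (shared + a) / (k_min + 1.0 - a)
    np.fill_diagonal(tom, 1.0)
    return tom

rng = np.random.default_rng(0)
expr = rng.normal(size=(12, 5))      # placeholder data: 12 samples, 5 genes
adj = signed_adjacency(expr, beta=14)
tom = topological_overlap(adj)
diss_tom = 1.0 - tom                 # distance used for hierarchical clustering
```

A consensus across several networks would, under the same assumptions, take the element-wise minimum of their TOMs before computing the dissimilarity.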
Resulting modules of co-expressed genes were used to calculate module eigengenes (MEs; or 1st principal component of the module). MEs were correlated to biological traits, in this case autism, to find disease-specific modules. Module hubs were defined by calculating module membership (kME) values which are the Pearson correlation between each gene and corresponding ME, and genes with kME < 0.7 were removed from the module. Network visualisation was done using iGraph package in R 63.
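As a rough illustration of the two quantities defined above, the following Python sketch computes a module eigengene as the first principal component of the standardised module expression matrix, and kME as the Pearson correlation of each gene with that eigengene. The toy data, function names and sign convention (orienting the eigengene along mean module expression) are assumptions for illustration and do not reproduce the WGCNA code used in the study.

```python
import numpy as np

def module_eigengene(expr):
    """expr: samples x genes for one module. Returns the ME (one value per sample)."""
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    me = u[:, 0]                     # first principal component across samples
    # Orient the eigengene so that it tracks average module expression.
    if np.corrcoef(me, z.mean(axis=1))[0, 1] < 0:
        me = -me
    return me

def module_membership(expr, me):
    """kME: Pearson correlation between each gene and the module eigengene."""
    return np.array([np.corrcoef(expr[:, g], me)[0, 1]
                     for g in range(expr.shape[1])])

# Toy module: four genes share one expression pattern, the fifth does not.
signal = np.linspace(-1.0, 1.0, 10)
expr = np.column_stack([signal, 2 * signal, 0.5 * signal, 3 * signal, signal ** 2])

me = module_eigengene(expr)
kme = module_membership(expr, me)
keep = kme >= 0.7                    # genes with kME < 0.7 are removed
```

Here the four co-varying genes are retained and the fifth is removed, mirroring the kME < 0.7 exclusion rule.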
Module preservation analysis
Module preservation analysis was performed to validate co-expression in a previous autism-iPSC study 22. Module values from autism-iPSC network analysis were used as reference, to calculate the Zsummary statistic for each module. This measure combines module density and intramodular connectivity metrics to give a composite statistic where Z > 2 suggests moderate preservation and Z > 10 suggests high preservation 64.
Transcription factor binding site enrichment
The top 200 genes in each module (ranked kME) were used for transcription factor binding site (TFBS) enrichment analysis (Supplementary Figure S7C).
Enrichment analysis for gene sets
Two types of gene set enrichments were performed. For autism-correlated module enrichment, logistic regression was performed using already published gene modules 14, 34, 35 to control for gene length and gene expression level. A two-sided Fisher exact test with 95% confidence interval was performed for cell-type enrichment analysis using published human brain dataset65.
Module genes were characterised using GO Elite (version 1.2.5) 38 using total expressed genes as background. GO Elite uses a Z-score approximation of the hypergeometric distribution to assess term enrichment, and removes redundant GO or KEGG terms to give a concise output. 10,000 permutations were used, and at least 10 genes were required to be enriched in a given pathway at a Z-score of at least 2. Only biological process and molecular function categories are reported.
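A normal (Z-score) approximation to the hypergeometric distribution can be written down from the hypergeometric mean and variance. The Python sketch below shows this generic approximation together with the reporting filter described above. It is an assumption that this matches GO Elite's internal calculation in every detail, and the function and argument names are hypothetical.

```python
from math import sqrt

def enrichment_z(total_genes, module_genes, term_genes, overlap):
    """Z-score for `overlap` module genes annotated to a term, given
    total_genes in the background, module_genes in the module and
    term_genes annotated to the term."""
    fraction = module_genes / total_genes
    expected = term_genes * fraction
    variance = (term_genes * fraction * (1 - fraction)
                * (1 - (term_genes - 1) / (total_genes - 1)))
    return (overlap - expected) / sqrt(variance)

def passes_filter(total_genes, module_genes, term_genes, overlap,
                  min_genes=10, min_z=2.0):
    """Reporting rule: at least 10 enriched genes and a Z-score of at least 2."""
    z = enrichment_z(total_genes, module_genes, term_genes, overlap)
    return overlap >= min_genes and z >= min_z

# Hypothetical numbers: 1000 expressed genes, a 100-gene module,
# a 50-gene term, 15 of which fall in the module (5 expected by chance).
z = enrichment_z(1000, 100, 50, 15)
print(round(z, 2), passes_filter(1000, 100, 50, 15))
```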
Quantitative real time polymerase chain reaction (QPCR)
cDNA was isolated, and QPCR was performed on RNA samples used for RNA-sequencing to show relative quantification of EMX1 (F: CCCTCTCCATTTCTACC, R: ACGTAGTGGTTCTTCTC), Gad67 (F: GTTACCGAGGAGCTAAA, R: CAATGACTCTGCTACTATTT) and GABRA4 (F: CATCTACTGACTTCTTTCTC, R: CATTCACTCATCCATTCC) (Supplementary Figure S6C). PowerUp SYBR Green Master Mix (Thermo Scientific) was used to run QPCR, on QuantStudio 7. ΔΔCt method was used to analyse differential expression.
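The ΔΔCt calculation can be sketched as follows. The Ct values are invented for illustration, the reference (housekeeping) gene is not named in the text and is therefore left generic, and the formula assumes roughly 100% amplification efficiency for both target and reference.

```python
def relative_expression(ct_target_case, ct_ref_case,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in case vs control by the delta-delta-Ct method.

    delta_ct       = Ct(target) - Ct(reference), per group
    delta_delta_ct = delta_ct(case) - delta_ct(control)
    fold change    = 2 ** (-delta_delta_ct)
    """
    delta_ct_case = ct_target_case - ct_ref_case
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_case - delta_ct_control
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: the target amplifies two cycles earlier in the case
# sample relative to the reference gene, i.e. a four-fold higher expression.
fold = relative_expression(ct_target_case=24.0, ct_ref_case=18.0,
                           ct_target_control=26.0, ct_ref_control=18.0)
print(fold)
```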
Genomic DNA isolation and exome sequencing
Genomic DNA from autism and control iPSCs was isolated using Promega ReliaPrep™ gDNA Tissue Miniprep System (Promega). RNase A treatment was performed to digest contaminating RNA, and proteinase K to digest proteins. DNA concentration was quantified using Picogreen assay (Invitrogen).
10 ng/µl of genomic DNA was used for library preparation and exome enrichment using Nextera Rapid Capture Exomes (Illumina). Paired-end libraries were constructed and sequenced using an Illumina HiSeq 2500 machine. Sample data were aligned to human genome GRCh37.75 (UCSC version hg19) using the Burrows-Wheeler Aligner (BWA) 66. Aligned reads were sorted according to chromosome number. Duplicate reads usually created during sequencing were identified and removed using Picard tools (Broad Institute). Quality scores were assigned to individual bases, then adjusted to reduce systematic errors using the Genome Analysis Toolkit (GATK, Broad Institute). SNPs (single nucleotide polymorphisms) and indels (insertions-deletions) were identified using GATK, by local de-novo assembly of haplotypes (haploid genotype). The SNPs and indels were then recalibrated to check that they were true genetic variants and not artefacts. Next, these variants were evaluated for the ratio of transition mutations to transversion mutations (Ti/Tv), heterozygous/homozygous (het:hom) ratio, and insertion/deletion (indel) ratio. SNPs and indels identified by GATK were annotated using ANNOVAR 67 and Variant Effect Predictor (VEP) 68. Important annotations include minor allele frequency from the 1000G project, SIFT score 69, PolyPhen 2 score 70, base change, and exonic function. Genotype concordance was performed using GATK to assess validity of SNPs from exome analysis (Supplementary Figure S5D).
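One of the variant quality metrics above, the Ti/Tv ratio, can be illustrated with a few lines of Python. Transitions are purine-to-purine or pyrimidine-to-pyrimidine changes (A↔G, C↔T); all other single-base substitutions are transversions. The input format and the example variants are assumptions for illustration and are not GATK output.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def ti_tv_ratio(snps):
    """snps: iterable of (reference_base, alternate_base) tuples.
    Returns the number of transitions divided by the number of transversions."""
    transitions = transversions = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt:
            continue                 # not a substitution, skip
        if (ref, alt) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return transitions / transversions

# Hypothetical SNP calls: four transitions and two transversions.
example_snps = [("A", "G"), ("C", "T"), ("A", "C"),
                ("G", "T"), ("G", "A"), ("T", "C")]
print(ti_tv_ratio(example_snps))
```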
Supplementary figure legends
Acknowledgments
We gratefully acknowledge the participants in this study. This study was supported by grants from the European Autism Interventions (EU-AIMS) and AIMS-2-TRIALS; the Wellcome Trust ISSF Grant (No. 097819) and the King’s Health Partners Research and Development Challenge Fund – a fund administered on behalf of King’s Health Partners by Guy’s and St Thomas’ Charity (Grant R130587) awarded to DPS; an Independent Investigator’s Award from the Brain and Behavior Foundation (formerly National Alliance for Research on Schizophrenia and Depression (NARSAD); Grant No. 25957), and Seed funding from Medical Research Council, UK (MR/N026063/1) awarded to DPS; the Innovative Medicines Initiative Joint Undertaking under grant agreement no. 115300, resources of which are composed of financial contribution from the European Union’s Seventh Framework Programme (FP7/2007-2013) and EFPIA companies’ in kind contribution (JP, SBC, DPS, DM, GM); the European Union’s Seventh Framework Programme (FP7-HEALTH-603016) (DPS, JP); the Mortimer D Sackler Foundation; the Autism Research Trust, the Chinese University of Hong Kong, and a doctoral fellowship from the Jawaharlal Nehru Memorial Trust awarded to D.A. SBC was funded by the Autism Research Trust, the Wellcome Trust, the Templeton World Charitable Foundation, and the NIHR Biomedical Research Centre in Cambridge, during the period of this work. The funding organizations had no role in the design and conduct of the study, in the collection, management, analysis and interpretation of the data, or in the preparation, review or approval of the manuscript. We are grateful to Debbie Spain and Suzanne Coghlan for participant recruitment, to Rosy Watkins, Hema Pramod, Rupert Faraway, Pooja Raval, Kate Sellers, Michael Deans and Rodrigo Rafagnin for assistance during the study, and to Aicha Massrali, Arkoprovo Paul, Bhismadev Chakrabarti, Michael Lombardo, Rick Livesey and Mark Kotter for valuable discussions.
We thank the Wohl Cellular Imaging Centre (WCIC) at the IoPPN, King’s College London, for help with microscopy.