Abstract
Background: Autism spectrum disorder (ASD) is a neurodevelopmental disorder that affects more than 1% of children in the United States. ASD risk is thought to arise from a combination of genetic and environmental factors, with the perinatal period as a critical window. Understanding early transcriptional changes in ASD would assist in clarifying disease pathogenesis and identifying biomarkers and treatments. However, little is known about umbilical cord blood gene expression profiles in babies later diagnosed with ASD compared to non-typically developing (Non-TD) or neurotypical children.
Methods: Genome-wide transcript levels were measured by Affymetrix Human Gene 2.0 array in RNA from umbilical cord blood samples from both the Markers of Autism Risk in Babies--Learning Early Signs (MARBLES) and the Early Autism Risk Longitudinal Investigation (EARLI) high-risk pregnancy cohorts that enroll younger siblings of a child previously diagnosed with ASD. An algorithm-based diagnosis from 36-month assessments categorized the younger sibling as either ASD, typically developing (TD), or not ASD but non-typically developing (Non-TD). In total, 59 ASD, 92 Non-TD, and 120 TD subjects were included, and differences were identified in ASD versus TD subjects, with Non-TD versus TD as a specificity control. Meta-analysis was used to combine the results from both studies. Functional enrichments of differentially-expressed genes were examined across diagnostic groups.
Results: While cord blood gene expression differences comparing either ASD or Non-TD to TD did not reach genome-wide significance when adjusting for multiple comparisons, 172 genes were nominally differentially-expressed between ASD and TD cord blood (absolute log2(fold change) > 0.1, p < 0.01). These genes were significantly enriched for toxic substance response and xenobiotic metabolism functions, and gene sets involved in chromatin regulation and systemic lupus erythematosus were significantly upregulated (FDR q < 0.05). In contrast, 66 genes were nominally differentially-expressed between Non-TD and TD cord blood, including only 8 genes that were also differentially-expressed in ASD.
Conclusions: This is the first study to identify perinatal gene expression differences in umbilical cord blood specific to ASD. The results of this meta-analysis across two prospective ASD cohorts support involvement of environmental, immune, and epigenetic mechanisms in ASD etiology.
Background
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by impaired social interaction and restricted and repetitive behaviors. Heritability of ASD risk has been well established with twin and family studies and is estimated at 52% [1–3]. While rare variants with large effects explain a relatively small proportion of all ASD cases, heritable common variants with individually minor effects contribute substantially to ASD risk [4]. Accumulating lines of evidence suggest that ASD arises from complex interactions between heterogeneous genetic and environmental risk factors. Gene expression levels are influenced by both genetic and environmental factors and determine the functional responses of cells and tissues. Postmortem brain gene expression studies have guided understanding of ASD pathophysiology and show evidence of changes in gene co-expression and enrichment in immune response and neuronal activity functions [5, 6]. Peripheral blood gene expression studies in children and adults using whole blood and specific cell types (natural killer (NK) cells and lymphocytes) observed enrichment of immune and inflammatory processes in differential gene expression associated with ASD [7, 8]. Recent efforts have focused on identifying how genetic risk factors converge into one or more unifying pathways and pathophysiological mechanisms [9, 10]. Yet, the majority of this work to date relies on post-mortem or post-symptom timing of sample collection, rather than prospective assessment of gene expression.
Converging evidence suggests that most of the changes in the brain associated with ASD are initiated during prenatal brain development [11, 12], but the complete nature of these changes remains unknown. Umbilical cord blood captures fetal blood as well as the exchanges across the feto-placental unit and provides a distinct insight into prenatal development. A unique cell mixture is represented in umbilical cord blood, including hematopoietic stem cells, B cells, NK cells, T cells, monocytes, granulocytes and nucleated red blood cells [13]. Cord blood gene expression would reflect the immune response as well as endocrine and cellular communication essential for fetal development near the time of birth.
While several studies have previously examined child blood gene expression differences in ASD [8, 14–20], this is the first study to take advantage of cord blood samples collected from two prospective studies (Markers of Autism Risk in Babies--Learning Early Signs (MARBLES) and the Early Autism Risk Longitudinal Investigation (EARLI)) in order to assess the perinatal transcriptional changes that precede ASD diagnosis in high-risk children [21, 22]. The subjects in this study are all siblings of children with ASD and thus have a 13-fold increased risk for ASD compared to the general population [23]. They are also at a higher risk for non-typical neurodevelopment, including deficits in attention and behavior. We measured cord blood gene expression levels using the Affymetrix Human Gene 2.0 array and compared the gene-level differential expression and gene set enrichment across ASD, non-typically developing (Non-TD) and neurotypical children (Fig. S1). Study-level results were then combined in a meta-analysis to investigate cord blood transcriptional dysregulation in ASD.
Methods
Sample population and biosample collection
MARBLES
The MARBLES study recruits Northern California mothers from lists of children receiving services through the California Department of Developmental Services who have a child with confirmed ASD and are planning a pregnancy or are pregnant with another child. Inclusion criteria for the study were: 1) mother or father has one or more biological child(ren) with ASD; 2) mother is 18 years or older; 3) mother is pregnant; 4) mother speaks, reads, and understands English sufficiently to complete the protocol and the younger sibling will be taught to speak English; 5) mother lives within 2.5 hours of the Davis/Sacramento region at time of enrollment. As described in more detail elsewhere [21], demographic, diet, lifestyle, environmental, and medical information were prospectively collected through telephone-assisted interviews and mailed questionnaires throughout pregnancy and the postnatal period. Mothers were provided with sampling kits for cord blood collection prior to delivery. MARBLES research staff made arrangements with obstetricians/midwives and birth hospital labor and delivery staff to assure proper sample collection and temporary storage. Infants received standardized neurodevelopmental assessments beginning at 6 months, as described below, and concluding at 3 years of age. For this study, all children actively enrolled by March 1, 2017 (n = 347) with umbilical cord blood collected in a PAXgene Blood RNA tube (n = 262, 76%) were included.
EARLI
The EARLI study is a high-risk pregnancy cohort that recruited and followed pregnant mothers who had an older child diagnosed with ASD through pregnancy, birth, and the first three years of life. EARLI families were recruited at four EARLI Network sites (Drexel/Children’s Hospital of Philadelphia; Johns Hopkins/Kennedy Krieger Institute; University of California (UC) Davis; and Kaiser Permanente Northern California) in three distinct US regions (Southeast Pennsylvania, Northeast Maryland, and Northern California). In addition to having a biological child with ASD confirmed by EARLI study clinicians, to be eligible mothers also had to communicate in English or Spanish and, at recruitment, meet the following criteria: be 18 years or older; live within two hours of a study site; and be less than 29 weeks pregnant. The design of the EARLI study is described in more detail in Newschaffer et al. [22]. EARLI research staff made arrangements with obstetricians/midwives and birth hospital labor and delivery staff to ensure proper cord blood sample collection and temporary storage. The development of children born into the cohort was closely followed through age three years. For this study, 212 infants born into EARLI as a singleton birth and followed to one year of age were considered for inclusion. Of the 212 infants, 97 were excluded because they were either missing umbilical cord blood samples or outcome measures at 36 months, leaving a final sample of 115.
Diagnostic outcomes
In both studies, development was assessed by trained, reliable examiners. Diagnostic assessments at three years included the gold standard Autism Diagnostic Observation Schedule (ADOS) [24, 25], the Autism Diagnostic Interview-Revised (ADI-R) [26] conducted with parents, and the Mullen Scales of Early Learning (MSEL) [27], a test of cognitive, language, and motor development. Participants were classified into one of three outcome groups: ASD, typically developing (TD), or Non-TD, based on a previously published algorithm that uses ADOS and MSEL scores [28, 29]. Children with ASD outcomes had scores over the ADOS cutoff and met DSM-5 criteria for ASD. The Non-TD group was defined as children with low MSEL scores (i.e., two or more MSEL subscales that were more than 1.5 standard deviations (SD) below average or at least one MSEL subscale that was more than 2 SD below average), elevated ADOS scores (i.e., within 3 points of the ASD cutoff), or both. Children with TD outcomes had all MSEL scores within 2 SD of the normative mean, no more than one MSEL subscale more than 1.5 SD below the normative mean, and ADOS scores at least three points below the ASD cutoff.
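The classification rules above can be sketched in code. This is a minimal illustration of the rules as stated in this section, not the published algorithm [28, 29] itself. The function name, the argument names, and the representation of MSEL subscales as SD units from the normative mean are all assumptions made for the example, as is the handling of scores that fall exactly on a threshold.

```python
from typing import Sequence


def classify_outcome(ados_score: float, ados_cutoff: float,
                     meets_dsm5: bool, msel_z: Sequence[float]) -> str:
    """Return "ASD", "Non-TD", or "TD" for one child.

    msel_z holds MSEL subscale scores in SD units from the normative mean
    (hypothetical representation; -1.6 means 1.6 SD below the mean).
    """
    # ASD: over the ADOS cutoff and meeting DSM-5 criteria.
    # Assumption: a score equal to the cutoff is treated as over it.
    if ados_score >= ados_cutoff and meets_dsm5:
        return "ASD"

    # Low MSEL: two or more subscales more than 1.5 SD below average,
    # or at least one subscale more than 2 SD below average.
    n_below_1_5 = sum(z < -1.5 for z in msel_z)
    n_below_2 = sum(z < -2.0 for z in msel_z)
    low_msel = n_below_1_5 >= 2 or n_below_2 >= 1

    # Elevated ADOS: within 3 points of the ASD cutoff.
    # Assumption: a child over the cutoff who does not meet DSM-5 criteria
    # also lands here and is therefore labeled Non-TD.
    elevated_ados = ados_score > ados_cutoff - 3

    return "Non-TD" if (low_msel or elevated_ados) else "TD"
```

Under these assumptions, a child whose ADOS score is exactly three points below the cutoff and who has at most one MSEL subscale between 1.5 and 2 SD below the mean is classified as TD.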
RNA isolation and expression assessment
In both EARLI and MARBLES, umbilical cord blood was collected at the time of birth in PAXgene Blood RNA Tubes with the RNA stabilization reagent (BD Biosciences) and stored at −80°C. RNA isolation was performed with the PAXgene Blood RNA Kit (Qiagen) following the manufacturer’s protocol. RNA from 236 (90%) of the 262 MARBLES PAXgene blood samples and all of the EARLI PAXgene blood samples met quality control standards (RIN ≥ 7.0 and concentration ≥ 35 ng/µL) and volume requirements. Total RNA was converted to cDNA and in vitro transcribed to biotin-labeled cRNA, which was hybridized to Human Gene 2.0 Affymetrix microarray chips by the Johns Hopkins Sequencing and Microarray core. EARLI and MARBLES samples were measured separately and in multiple batches within each study. The manufacturer’s protocol was followed for all washing, staining and scanning procedures. The raw fluorescence data (in Affymetrix CEL file format) with one perfect match and one mismatched probe in each set were analyzed using the oligo package in R.
Data preprocessing
Within each study, signal distribution was first assessed in perfect-match probe intensity and Robust Multi-Chip Average (RMA) normalized data [30]. During the quality control step, we identified outliers using the arrayQualityMetrics and oligo R packages [31, 32]. Outliers were excluded based on loading in principal component 1, the Kolmogorov-Smirnov test, median normalized unscaled standard error, and the sum of the distances to all other arrays. For the MARBLES study, 3 outlier samples were identified and excluded, and another 71 children had not yet received a diagnosis by April 12, 2018 so were excluded; 162 samples were normalized using RMA. For the EARLI study, 6 outliers were identified and excluded, then 109 samples were normalized using RMA. Probes were annotated at the transcript level using the pd.hugene.2.0.st R package [33], and those assigned to a gene (36,459 probes) were used in subsequent analyses.
Differential gene expression
We used surrogate variable analysis (SVA), implemented in the sva R package [34], to estimate and adjust for unmeasured environmental, demographic, cell type proportion, and technical factors that may have substantial effects on gene expression. SVA detected 21 and 11 surrogate variables in normalized expression data from the MARBLES and EARLI studies, respectively. Specific factors associated with surrogate variables in this study included sex and array batch. Differential expression was determined using the limma package in R with diagnosis and surrogate variables included in the linear model [35] (Fig. S2, S3). ASD versus TD and Non-TD versus TD differential expression results were extracted from a single model with three levels for diagnosis. Fold change and standard error from each study were input into the METAL command-line tool for meta-analysis [36]. Using the meta-analyzed data, differential probes were then identified as those with a nominal p-value less than 0.01 and an average absolute log2(fold change) greater than 0.1.
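To make the meta-analysis step concrete, the sketch below combines per-study log2(fold change) estimates by inverse-variance weighting and then applies the two thresholds. It assumes METAL was run with its standard-error-based scheme, which the use of fold change and standard error as inputs suggests but the text does not state. The function names are hypothetical, and applying the fold change threshold to the pooled estimate is a simplification of the "average absolute log2(fold change)" criterion.

```python
import math


def inverse_variance_meta(effects, std_errors):
    """Fixed-effects pooling of per-study effects (e.g. log2 fold changes)."""
    weights = [1.0 / se ** 2 for se in std_errors]   # precision of each study
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided normal p-value
    return pooled, pooled_se, p_value


def is_differential(log2_fc, p_value, fc_min=0.1, p_max=0.01):
    """Nominal thresholds from the text, applied here to the pooled estimate."""
    return abs(log2_fc) > fc_min and p_value < p_max


# Illustrative numbers only, not data from either cohort.
pooled, pooled_se, p_value = inverse_variance_meta([0.2, 0.4], [0.1, 0.1])
print(round(pooled, 3), round(pooled_se, 3), is_differential(pooled, p_value))
```

With equal standard errors the pooled estimate is simply the mean of the two study estimates; when one study is more precise, its estimate dominates.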
Gene overlap analysis
Gene overlap analysis by Fisher’s exact test was performed using the GeneOverlap R package [37]. Gene symbols annotated to differentially-expressed probes were compared to autism-related or blood cell-type associated gene lists [38] for overlap relative to all genes annotated to probes on the array. Genes with variation previously associated with autism were obtained from the Simons Foundation Autism Research Initiative (SFARI) Gene database and a recent genome-wide association study meta-analysis [39, 40], while genes with expression previously associated with autism were obtained from multiple previous reports [6, 8, 41, 42]. Significant overlaps were those with a false discovery rate (FDR) q-value < 0.05.
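The overlap test can be illustrated with a small standard-library sketch. It computes the one-sided (enrichment) Fisher's exact p-value as a hypergeometric tail probability, together with the sample odds ratio from the 2×2 table. The one-sided alternative and the function name are assumptions, and both gene lists are assumed to be subsets of the array background; this is not the GeneOverlap implementation.

```python
from math import comb


def overlap_test(list_a, list_b, universe_size):
    """One-sided Fisher's exact test for the overlap of two gene lists."""
    a, b = set(list_a), set(list_b)
    k = len(a & b)                       # genes in both lists
    n_a, n_b = len(a), len(b)

    # P(overlap >= k) when n_a genes are drawn at random from the universe.
    tail = sum(comb(n_b, i) * comb(universe_size - n_b, n_a - i)
               for i in range(k, min(n_a, n_b) + 1))
    p_value = tail / comb(universe_size, n_a)

    # Sample odds ratio from the 2x2 contingency table.
    only_a, only_b = n_a - k, n_b - k
    neither = universe_size - n_a - n_b + k
    if only_a * only_b == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (k * neither) / (only_a * only_b)
    return k, odds_ratio, p_value


# Toy example with a 10-gene universe (illustrative only).
print(overlap_test({"A", "B", "C"}, {"A", "B", "D"}, 10))
```

The resulting p-values across all tested gene lists would then be adjusted to FDR q-values before applying the 0.05 cutoff.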
Overrepresentation enrichment analysis
Differential probes identified during meta-analysis were converted to Entrez gene IDs using the biomaRt R package [43]. Functional enrichment of only differential probes by hypergeometric test was relative to all probes on the array and was performed using the WebGestalt online tool with default parameters for the overrepresentation enrichment analysis method [44]. Enrichment databases included WebGestalt defaults and also a custom database of recently evolved genes obtained from [45]. WebGestalt default databases queried included Gene Ontology, KEGG, WikiPathways, Reactome, PANTHER, MSigDB, Human Phenotype Ontology, DisGeNET, OMIM, PharmGKB, and DrugBank. Significant enrichments were those with an FDR q-value < 0.05.
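Two quantities reported for these enrichments, the fold enrichment and the FDR q-value, can be sketched as follows. Fold enrichment is taken here as the observed number of differential genes in a set divided by the number expected from the background, and q-values are computed with the Benjamini-Hochberg procedure. Both choices are assumptions about the tool's defaults rather than details given in the text, and the function names are hypothetical.

```python
def fold_enrichment(n_overlap, n_differential, set_size, universe_size):
    """Observed overlap divided by the overlap expected by chance."""
    expected = n_differential * set_size / universe_size
    return n_overlap / expected


def bh_qvalues(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Illustrative numbers only.
print(fold_enrichment(5, 100, 50, 10000))
print(bh_qvalues([0.01, 0.04, 0.03, 0.005]))
```

A gene set would be called significant under the criterion above when its q-value falls below 0.05.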
Gene set enrichment analysis (GSEA)
All probes included in the analysis were ranked using meta-analysis log2(fold change) and input into the WebGestalt online tool using default parameters for the GSEA method [44]. GSEA assesses whether genes in biologically-predefined sets occur toward the top or bottom of a ranked list of all examined genes more than expected by chance [46]. GSEA calculates an enrichment score normalized to the set size to estimate the extent of non-random distribution of the predefined gene set, and it then tests the significance of the enrichment with a permutation test. Enrichment databases included WebGestalt defaults (see above) and also a custom database of blood cell-type associated genes [38]. Significant gene sets were called as those with an FDR q-value < 0.05.
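The running-sum statistic behind GSEA can be sketched in a few lines. The example below walks down a list ranked by log2(fold change), stepping up at genes in the set (weighted by the absolute ranking score) and down otherwise, and takes the largest deviation from zero as the enrichment score. Significance and the normalized enrichment score (NES) come from random gene sets of the same size. This is a simplified illustration assuming gene-label permutation and a weight exponent of 1; it is not the WebGestalt implementation, and the function names are hypothetical.

```python
import random


def enrichment_score(ranked, gene_set, weight=1.0):
    """ranked: list of (gene, score) pairs sorted by score, descending."""
    gene_set = set(gene_set)
    hit_total = sum(abs(s) ** weight for g, s in ranked if g in gene_set)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running = best = 0.0
    for gene, score in ranked:
        if gene in gene_set:
            running += abs(score) ** weight / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):      # keep the largest deviation from zero
            best = running
    return best


def permutation_test(ranked, gene_set, n_perm=1000, seed=0):
    """Return (ES, NES, p) using random gene sets of matching size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked, gene_set)
    genes = [g for g, _ in ranked]
    size = sum(g in set(gene_set) for g in genes)
    null = [enrichment_score(ranked, rng.sample(genes, size))
            for _ in range(n_perm)]
    # Compare only against null scores with the same sign as the observed one.
    same_sign = [abs(x) for x in null if (x >= 0) == (observed >= 0)]
    mean_null = sum(same_sign) / len(same_sign) if same_sign else 0.0
    if mean_null == 0:
        return observed, float("nan"), 1.0
    nes = observed / mean_null
    p_value = (sum(x >= abs(observed) for x in same_sign) + 1) / (len(same_sign) + 1)
    return observed, nes, p_value
```

A set whose genes cluster at the top of the ranking yields a positive score (upregulated), while one clustered at the bottom yields a negative score (downregulated), matching the sign convention of the NES values reported in the Results.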
Results
Study Sample Characteristics
MARBLES subjects in the final analysis included 77 TD (40 male, 37 female), 41 ASD (30 male, 11 female), and 44 Non-TD subjects (27 male, 17 female). Paternal age and gestational age were nominally associated with diagnostic group in MARBLES, with slightly increased paternal age and gestational age for the ASD subjects (paternal age p = 0.02, gestational age p = 0.04, Table 1). EARLI subjects in the final analysis included 43 TD (19 male, 24 female), 18 ASD (13 male, 5 female), and 48 Non-TD subjects (23 male, 25 female). Child race and ethnicity and home ownership were nominally associated with diagnostic group in EARLI (race and ethnicity p = 0.02, home ownership p = 0.01, Table 2). Specifically, the ASD group included a lower proportion of white subjects and a lower rate of home ownership. In the meta-analysis, which combined both the MARBLES and EARLI studies, gene expression was analyzed in 271 subjects, including 120 TD, 59 ASD, and 92 Non-TD subjects.
ASD-Associated Differential Gene Expression in Cord Blood
We examined differential expression of single genes in association with ASD diagnosis status at 36 months. In the meta-analysis, no transcripts were differentially expressed at a conservative FDR q-value < 0.05. Under the thresholds of absolute log2(fold change) > 0.1 and nominal p-value < 0.01, 172 transcripts were differentially expressed between ASD and TD cord blood (ASD n = 59, TD n = 120, Fig. 1a, Table S1). Among these, 87 were upregulated and 85 were downregulated. The differential transcript with the greatest fold change was TUBB2A (log2(fold change) = 0.35, Fig. 1b, Table 3). Additionally, the estimated fold changes for differentially-expressed genes were correlated between the two studies (Pearson’s r = 0.80, p < 2.2E-16). Many of the differentially-expressed genes were noncoding or uncharacterized transcripts; however, the median expression of differentially-expressed genes was not lower than that of non-differentially-expressed genes on the array (MARBLES: differential = 4.70, non-differential = 4.64, p = 0.74; EARLI: differential = 4.34, non-differential = 4.19, p = 0.52; Fig. S4).
Several differentially-expressed genes in cord blood overlapped with genes previously associated with ASD in genetic or gene expression studies, including SLC7A3, VSIG4, MIR1226, SNORD104, OR2AG2, and DHX30, although the overlap was not statistically significant (q > 0.05, Fig. S5, Table S2). Two known ASD genes in the SFARI Gene database, SLC7A3 and VSIG4, were upregulated in cord blood from ASD subjects (SLC7A3: log2(fold change) = 0.16, p = 2.6E-4; VSIG4: log2(fold change) = 0.12, p = 2.2E-3) [39]. Additionally, MIR1226, previously associated with ASD in a large genome-wide association study meta-analysis, was downregulated in ASD subject cord blood (log2(fold change) = −0.14, p = 8.6E-4) [40]. SNORD104 was upregulated and OR2AG2 was downregulated in cord blood, and both were differentially expressed in the same direction in a previous ASD expression meta-analysis in blood (SNORD104: log2(fold change) = 0.20, p = 9.2E-3; OR2AG2: log2(fold change) = −0.14, p = 2.8E-4) [8]. DHX30 was downregulated in cord blood and also in a previous autism expression study in post-mortem cortex (log2(fold change) = −0.14, p = 8.6E-4) [41]. GFI1, GPR171, KIR2DL4, PTGIR, and TRPM6 were differentially expressed in ASD cord blood and are also differentially expressed in specific blood cell types, including natural killer cells and T cells, although a significant enrichment was not observed (q > 0.05, Fig. S6, Table S3) [38].
Overrepresentation enrichment analysis, which looks for overlap of only differentially-expressed genes with biologically-predefined gene lists, revealed that ASD differential transcripts were significantly enriched for functions in the response to toxic substances (fold enrichment = 9.5, q = 0.027) and ultraviolet radiation (fold enrichment = 7.6, q = 0.037, Fig. 1c, Table S4). Genes associated with both of these annotations included CYP1A1, FOS, and GCH1. Additionally, downregulated transcripts were enriched for functioning in blood coagulation (GNG12, MAFF, PF4, and PLG, fold enrichment = 12.5, q = 0.009) and xenobiotic metabolism (CDO1, CYP1A1, GCH1, and PLG, fold enrichment = 8.6, q = 0.019).
Using fold changes to rank all transcripts for gene set enrichment analysis (GSEA), we observed significant enrichment for upregulation of gene sets involved in chromatin regulation (q < 0.05, Fig. 2, Table S5). In other words, genes associated with chromatin regulation tended to be ranked toward the top of the distribution of fold change in ASD cord blood. Chromatin gene sets upregulated in ASD included DNA methylation (normalized enrichment score (NES) = 2.16, q = 0.009), condensation of prophase chromosomes (NES = 2.11, q = 0.007), nucleosome assembly (NES = 1.96, q = 0.026), histone deacetylase (HDAC)-mediated deacetylation (NES = 1.90, q = 0.040), and polycomb-repressive complex 2 (PRC2)-mediated methylation (NES = 1.89, q = 0.002). Additionally, the gene set for the autoimmune disease systemic lupus erythematosus was significantly upregulated (NES = 2.13, q = 0.003). Most of the genes associated with these sets compose a cluster of histone genes located at the 6p22.2 locus, which was also enriched (NES = 2.15, q = 0.007). The above findings of differential expression across two prospective cohorts suggest that transcriptional dysregulation in environmentally-responsive genes is present at birth in cord blood of high-risk subjects later diagnosed with ASD.
Non-TD-Associated Differential Gene Expression in Cord Blood
To assess the specificity of ASD-associated transcriptional changes in cord blood, we also examined differential expression between Non-TD and TD cord blood samples. Meta-analysis results showed no transcripts differentially expressed at a conservative FDR q-value < 0.05. Under the thresholds of absolute log2(fold change) > 0.1 and nominal p-value < 0.01, 66 transcripts were differential, with 38 upregulated and 28 downregulated (Non-TD n = 92, TD n = 120, Fig. 3a, Table S6). The gene with the greatest fold change between Non-TD and TD subjects was TAS2R46 (log2(fold change) = 0.37, Fig. 3b, Table 4). Further, the estimated fold changes of Non-TD-associated differentially-expressed genes were correlated between the individual studies (Pearson’s r = 0.80, p = 3.9E-16). Although many of the differential transcripts have limited known functions in cord blood, median expression of these genes was not different from other genes on the array (MARBLES: differential = 4.48, non-differential = 4.64, p = 0.65; EARLI: differential = 4.15, non-differential = 4.20, p = 0.90; Fig. S7).
Several of the 66 nominally differentially-expressed genes between Non-TD and TD cord blood samples, including DHCR24, GNAO1, MIR4269, and TYMS, have been previously associated with genetic variation or gene expression in ASD, although the overlap was not statistically significant (q > 0.05, Fig. S5, Table S2). MIR4269 was upregulated in Non-TD cord blood (log2(fold change) = 0.12, p = 7.3E-3), and genetic deficiencies in MIR4269 have been previously associated with a reduced risk for ASD [40]. When comparing Non-TD differentially-expressed genes in this study to previous expression studies of ASD, TYMS and DHCR24 were downregulated in Non-TD cord blood and also in ASD blood and brain, respectively (TYMS: log2(fold change) = −0.13, p = 7.6E-4; DHCR24: log2(fold change) = −0.11, p = 2.4E-3) [8, 41]. Overlapping differentially-expressed genes in both ASD and Non-TD comparisons with TD likely reflect genes that function in general neurodevelopment.
Because genes recently evolved in primates have been hypothesized to play a role in human neurodevelopment, differentially-expressed genes in Non-TD cord blood were assessed for enrichment in recently evolved genes by vertebrate lineages, ranging from tetrapods to homininae using overrepresentation enrichment analysis [45]. Non-TD-associated genes were significantly enriched for genes recently evolved in mammalia, theria, eutheria, boreoeutheria, euarchontoglires, primates, catarrhini, and hominidae, with the greatest enrichment in primate-specific genes (fold enrichment = 7.5, q = 2.1E-5, Fig. 3c, Table S7). Of genes that evolved in primates, SLC52A1, SPANXN5, and TRIM49B were upregulated in Non-TD cord blood, while FAM86C1, RASA4, RASA4B, and TRIM73 were downregulated (Fig. 3d). In contrast, ASD differentially-expressed genes were not significantly enriched in recently evolved genes from any of the vertebrate lineages (q > 0.05).
After GSEA with all probes ranked by fold change in Non-TD compared to TD subjects, we observed significant enrichment for upregulation of sensory perception gene sets (q < 0.05, Fig. 4a, Table S8). Taste receptor activity (NES = 2.30, q < 1E-4), metabotropic glutamate receptors (NES = 2.21, q = 4.9E-3), and olfactory receptor activity (NES = 1.96, q = 0.018) gene sets were all upregulated in cord blood from Non-TD subjects. Additionally, gene sets that interact with the compounds quinine (NES = 2.30, q = 2.0E-3) and citric acid (NES = 2.17, q = 2.5E-3) were significantly upregulated, while those interacting with indomethacin (NES = −2.02, q = 0.037) and H2-receptor antagonists (NES = −2.03, q = 0.047) were downregulated. Taste receptor genes included in these gene sets and the top Non-TD-upregulated gene, TAS2R46, are located at the 12p13.2 locus, which was also enriched (NES = 2.11, q = 8.3E-3). GSEA was also used to compare Non-TD fold change rankings to genes with cell-type specific expression [38]. Genes upregulated in memory B cells were upregulated in cord blood from Non-TD subjects (NES = 1.69, q = 0.025), while genes upregulated in NK cells and mast cells were downregulated (q < 0.05, Fig. 4b, Table S8). This suggests a shift in the proportions of these cell types in cord blood of Non-TD subjects.
Comparison of ASD and Non-TD Differentially-Expressed Genes
Eight genes were differentially expressed in both ASD and Non-TD compared to TD subjects, significantly more than expected by chance (odds ratio = 28.3, p = 1.67E-9, Fig. 5a). Specifically, IGLV1-40, LRRC37A4P, MIR1299, PMCHL2, and TRBV11-2 were upregulated, while RNU4ATAC11P, TPTE2P5, and TRIM74 were downregulated in both ASD and Non-TD subjects (Fig. 5b). Further, LRRC37A4P, MIR1299, PMCHL2, and TRBV11-2 were among the top upregulated genes in Non-TD subjects (Fig. 3b). These findings suggest that even though some ASD-associated transcriptional alterations in cord blood are also present in Non-TD subjects, the majority of changes are specific to ASD subjects.
Discussion
Perinatal transcriptional alterations in ASD
In this study, we identified gene expression differences in cord blood between high-risk children who went on to develop ASD, were Non-TD, or were TD at 36 months. The results are based on meta-analysis across two high-risk pregnancy cohorts of similar design. Several of these differentially-expressed genes have been previously associated with neurodevelopmental disorders such as ASD, via genetic variants or previous post-diagnosis expression studies, although overlap with prior findings was not statistically significant. Included in the top upregulated genes in ASD was TUBB2A, a component of microtubules. TUBB2A is expressed in fetal brain and is mutated in complex cortical dysplasia, a neurodevelopmental disorder that involves seizures [47]. SLC7A3 and VSIG4 were also upregulated in ASD and have previously been identified as putative ASD risk genes [39]. Rare missense variants have been found in ASD subjects at SLC7A3, which is a cationic amino acid transporter selectively expressed in brain [48]. Nonsense and splice site variants have been found in ASD subjects at VSIG4, a complement C3 receptor that functions in pathogen recognition and inhibition of helper T-cell activation [49–51]. Overlap of both genetic and transcriptional association with ASD suggests that dysregulation in these genes may contribute to ASD risk.
Environmental factors are also thought to contribute to ASD risk, especially during the perinatal period, a sensitive window for neurodevelopment [52]. Genes differentially expressed in cord blood from ASD subjects were significantly enriched for functions in xenobiotic metabolism and response to toxic substances. Further, few genes with prior genetic associations to ASD showed differential cord blood expression, which helps elucidate the importance of environmental factors in ASD. Notably, CYP1A1 was downregulated in ASD cord blood and has been previously found to be transcriptionally regulated in blood by toxicants associated with neurodevelopment, including polychlorinated biphenyls [53–55]. GCH1--which is the rate-limiting enzyme in the synthesis of tetrahydrobiopterin, a precursor to folate, dopamine, and serotonin [56]--was also downregulated in cord blood from ASD subjects. GCH1 expression increases in response to valproic acid, an anti-epileptic drug associated with behavioral deficits in mice and increased risk of autism in humans after in utero exposure [57, 58]. Interestingly, GCH1 is genetically associated with ASD subphenotypes, is downregulated in peripheral blood from ASD children, and its product tetrahydrobiopterin is decreased in cerebrospinal fluid from ASD subjects [17, 59, 60].
Epigenetic modifications, such as those to DNA and histone protein subunits, are affected by both genetic and environmental factors and are thought to play a role in mediating ASD risk [61, 62]. Immune dysregulation has also been found in children with ASD, and immune cells rely on epigenetic regulation for lineage commitment and cell activation in response to infection [63, 64]. A cluster of histone genes at 6p22.2 was enriched for upregulated genes in ASD cord blood. One of the genes in this cluster was HIST1H1E, which encodes H1.4. C-terminal domain truncating mutations in HIST1H1E have been associated with autism and Rahman syndrome, a rare disorder that includes intellectual disability and somatic overgrowth [65, 66]. Genes associated with the autoimmune disease systemic lupus erythematosus (SLE) were also upregulated, including histone-encoding, complement pathway, and antigen presentation genes. Epigenetic dysregulation is a feature of SLE, including global DNA hypomethylation, histone H3 and H4 hypoacetylation, and H3K9 hypomethylation in CD4+ T cells [67–69]. Notably, maternal SLE increases risk for ASD in offspring, suggesting an association between these two disorders [70]. Together, this implicates both epigenetic and immune mechanisms in ASD pathobiology.
Perinatal transcriptional alterations in Non-TD
To assess the specificity of cord blood gene expression changes in ASD compared to other neurodevelopmental disorders, this analysis examined transcriptional differences in Non-TD compared to TD subjects across the two studies. The top upregulated gene in Non-TD cord blood was TAS2R46, encoding a member of the taste 2 receptor (TAS2R) family, while the top upregulated gene set was taste receptor activity. Upregulated genes in this gene set were primarily other TAS2Rs located at the 12p13.2 locus. TAS2Rs are G-protein coupled receptors (GPCRs) expressed in taste receptor cells and are associated with the perception of bitter taste [71]. Interestingly, individuals with attention-deficit/hyperactivity disorder and epilepsy have previously been found to perceive bitter tastes more intensely than healthy controls [72, 73]. TAS2Rs are also expressed in blood leukocytes, including NK cells, B cells and T cells, where they may function in chemosensation of foodborne flavor compounds and response to food uptake [74]. Further, TAS2Rs are upregulated in leukocytes from asthma patients and levels of lipopolysaccharide-induced pro-inflammatory cytokines are decreased by TAS2R agonists [75]. Taste receptor upregulation may reflect altered chemosensation in the immune and nervous systems in Non-TD subjects.
Differentially-expressed genes in cord blood from Non-TD subjects were also enriched for genes recently evolved in mammals, and especially for primate-specific genes. These young genes originated around the same evolutionary time that the neocortex expanded, and their expression is biased toward the fetal neocortex in humans [45]. SLC52A1 and RASA4 were among the fetal-biased primate-specific genes that were differentially expressed in Non-TD subjects. SLC52A1, which was among the top upregulated genes in Non-TD cord blood, is a transporter involved in the placental uptake of riboflavin, a vitamin important for oxidation-reduction reactions [76]. RASA4 is a GTPase-activating protein in the Ras signaling pathway that functions in the activation of T cells, mast cells, and macrophages and was also one of the top downregulated genes in cord blood from Non-TD subjects [77–79]. Differential expression of fetal-biased primate-specific genes in Non-TD cord blood may reflect genetic changes that alter expression in multiple tissues.
Non-TD subjects also showed downregulation of GNAO1, encoding a G-protein alpha subunit important for neurodevelopment and synaptic signaling. Mutations in GNAO1 are associated with epileptic encephalopathy, involuntary movements, and intellectual disability [80, 81]. Additionally, missense mutations and downregulation of GNAO1 in lymphocytes occur in individuals with schizophrenia [82, 83]. In individuals with ASD, GNAO1 is upregulated in post-mortem cortex [6]. In murine astrocytes, Gnao1 and other G-protein effectors are downregulated and intracellular calcium is increased after activation by inflammatory mediators [84]. Further, GNAO1 is required in mast cells for toll-like receptor 2-mediated pro-inflammatory cytokine release, suggesting GNAO1 functions in cell signaling in both the nervous and immune systems [85]. Interestingly, we also observed downregulation of genes highly expressed in mast cells in cord blood from Non-TD subjects. Further investigation of the Non-TD diagnostic group in high-risk families may therefore provide insight into protective gene expression patterns in children resilient to an ASD diagnosis.
Cord blood as a window into transcriptional alterations specific to ASD
Umbilical cord blood gene expression offers a unique snapshot of molecular differences in perinatal life, a critical window for neurodevelopment [86]. Hematopoietic cells in the blood are actively differentiating and responding to environmental cues, such as pathogens and xenobiotics [87, 88]. Epigenetic marks written during this period, which reflect short-term transcriptional activity, have the potential to have long-term effects on gene regulation and cell function [89, 90]. Signals from the immune system cross the blood-brain barrier during gestation and influence the development of the brain [91]. Toxicant exposure during gestation can also impact brain development [92, 93]. In this study, genes involved in toxic substance response, xenobiotic metabolism, and chromatin regulation were altered in cord blood from subjects diagnosed with ASD at 36 months. Transcriptional differences in cord blood from ASD and Non-TD subjects compared to TD subjects were largely independent, with only 8 genes in common. Enriched gene sets associated with Non-TD expression included sensory perception and primate-specific genes and did not overlap with ASD expression gene sets. Further, genes associated with ASD in previous studies of genetic variation and gene expression had few overlaps with ASD-associated genes in cord blood [6, 8, 39–41]. Instead, cord blood ASD genes likely represent tissue-specific environmentally-responsive genes that may reflect in utero exposures and long-term altered neurodevelopmental outcomes.
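As a rough illustration of how limited the overlap described above is, the following Python sketch summarizes it from the gene counts reported in this study (172 ASD-associated genes, 66 Non-TD-associated genes, 8 shared). The function name `overlap_summary` and the use of the Jaccard index are illustrative assumptions and were not part of the original analysis.

```python
# Counts reported in this study (nominal differential expression versus TD).
n_asd = 172     # genes differentially expressed in ASD versus TD
n_non_td = 66   # genes differentially expressed in Non-TD versus TD
n_shared = 8    # genes present in both lists


def overlap_summary(n_a, n_b, n_both):
    """Return the shared fraction of each list and the Jaccard index.

    Hypothetical helper for illustration only.
    """
    union = n_a + n_b - n_both  # number of distinct genes across both lists
    return {
        "fraction_of_asd": n_both / n_a,
        "fraction_of_non_td": n_both / n_b,
        "jaccard": n_both / union,
    }


summary = overlap_summary(n_asd, n_non_td, n_shared)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these counts, the shared genes make up about 5% of the ASD list and about 12% of the Non-TD list. This sketch only describes the size of the overlap; testing whether the overlap differs from chance would additionally require the number of genes assessed on the array, which is not restated here.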
Conclusions
In the first study to investigate gene expression in cord blood from high-risk newborns later diagnosed with ASD, we identified nominally statistically significant transcriptional alterations specific to ASD, which were enriched for toxic substance response and epigenetic regulation functions. Differentially-expressed genes in ASD had few overlaps with those identified in cord blood from newborns with other non-typical neurodevelopmental outcomes in this high-risk population. Instead, Non-TD-associated genes were enriched for sensory perception functions and primate-specific genes. A strength of this high-risk prospective study design was the observation of gene expression at birth, prior to the onset of symptoms, diagnosis, and treatment. Perinatal life is a critical period for neurodevelopment, when environmental stressors could have long-term impact. Additionally, ASD-associated differential expression was meta-analyzed across two independent studies with harmonized methods and compared with expression changes in other subjects with non-typical neurodevelopment. Finally, cord blood is an accessible tissue that reflects the perinatal environment, and ASD-associated gene expression changes in cord blood may have potential as predictive biomarkers. The limitations of this study include the transcript coverage and measurement precision of the expression microarray platform. Transcripts not covered by a probe on the array were not analyzed in this study, and targeted quantitative analysis in the general population would be needed to validate specific transcriptional changes as ASD risk biomarkers. Additionally, genetic, epigenetic, and environmental variation is known to impact both gene expression and ASD risk, but this variation was not investigated in this study. Future studies that integrate cord blood gene expression with genetic, epigenetic, and environmental factors will be important to improve understanding of ASD etiology.
Declarations
Ethics approval and consent to participate
The UC Davis Institutional Review Board and the State of California Committee for the Protection of Human Subjects approved this study and the MARBLES Study protocols. Human Subjects Institutional Review Boards at each of the four sites in the EARLI Study approved this study and the EARLI Study protocols. Neither data nor specimens were collected until written informed consent was obtained from the parents.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Funding
This work was supported by NIH grants: P01ES011269, R24ES028533, R01ES028089, R01ES020392, R01ES025574, R01ES025531, R01ES017646, U54HD079125, and K12HD051958; EPA STAR grant RD-83329201; and the MIND Institute.
Authors’ contributions
CEM and BYP were the lead authors and contributed substantially to data analysis, data interpretation, and drafting the manuscript. KMB contributed critical advice on data analysis methods and data interpretation. JIF and CL contributed to data analysis and interpretation. LAC, CJN, HEV, SO, and IH contributed to study design, as well as subject acquisition, diagnosis, and characterization. JML, RJS, and MDF conceived of the study and contributed substantially to data interpretation and manuscript revision. All authors read and approved the final manuscript.
Acknowledgements
Not applicable.
Footnotes
cemordaunt{at}ucdavis.edu, bopark{at}fullerton.edu, bakulski{at}umich.edu, jfeinbe2{at}jhu.edu, Lisa.A.Croen{at}kp.org, claddaco{at}jhsph.edu, cjn32{at}drexel.edu, hvolk1{at}jhu.edu, sozonoff{at}ucdavis.edu, iher{at}ucdavis.edu, jmlasalle{at}ucdavis.edu, rjschmidt{at}ucdavis.edu, dfallin{at}jhu.edu
Author information updated
List of abbreviations
- ADI-R: Autism Diagnostic Interview-Revised
- ADOS: Autism Diagnostic Observation Schedule
- ASD: autism spectrum disorder
- EARLI: Early Autism Risk Longitudinal Investigation
- FDR: false discovery rate
- GSEA: gene set enrichment analysis
- GPCRs: G-protein coupled receptors
- HDAC: histone deacetylase
- MARBLES: Markers of Autism Risk in Babies - Learning Early Signs
- MSEL: Mullen Scales of Early Learning
- NK cells: natural killer cells
- Non-TD: non-typically developing
- NES: normalized enrichment score
- PRC2: polycomb-repressive complex 2
- RMA: robust multi-chip average
- SFARI: Simons Foundation Autism Research Initiative
- SVA: surrogate variable analysis
- SLE: systemic lupus erythematosus
- TAS2R: taste 2 receptor
- TD: typically developing
- UC: University of California