Abstract
Genetic contributions of Neandertals to the modern human genome have been evidenced by comparative analyses of present day human genomes and paleogenomes. Neandertal introgression differs among European, East Asian and African lines of descent, being higher in Asians and Europeans and lower in Africans. Neandertal signatures in extant human genomes are attributed to intercrosses between Neandertals and ancient Homo sapiens lineages, or Anatomically Modern Humans (AMH), that migrated from Africa into the Middle East and Europe in the last 50,000 years. It has been proposed, however, that there is no contribution of Neandertal mitochondrial DNA to contemporary human genomes. Here we show that the modern human mitochondrial genome contains 75 Neandertal signatures, of which 11 are associated with diseases such as cyclic vomiting syndrome and depression and 3 are associated with intelligence quotient. Principal component analysis and bootscan tests suggest rare recombination events. Also, contrary to what is observed in the nuclear genome, African mitochondrial haplogroups have more Neandertal signatures than Asian and European haplogroups. Our results suggest that most intercrosses occurred between Neandertal males and AMH females, whereas crosses between AMH males and Neandertal females were extremely rare and were accompanied by equally rare recombination events, thus leaving few marks (75 out of 16,565bp) in present day mitochondrial genomes of human populations.
Neandertal genetic contributions to the modern makeup of the human genome have been evidenced by comparative analyses of present day human genomes and paleogenomes (1–4). These contributions differ among European, East Asian and African lines of descent, with a higher frequency of Neandertal segments in Asians and Europeans and lower frequencies in Africans (1). The presence of Neandertal signatures in extant human genomes is attributed to intercrosses between Neandertals and ancient Homo sapiens lineages, or Anatomically Modern Humans (AMH), that migrated from Africa into the Middle East and Europe in the last 50,000 years (2, 3). The spatio-temporal overlap of Neandertals and AMH is estimated to be approximately 22,000 years, since the first AMH arrived in Europe around 50,000 years ago and the last Neandertal remains (in Spain) date back to 28,000 years (5, 6). It has been proposed, however, that there is no contribution of Neandertal mitochondrial DNA to contemporary human genomes (7). Because mtDNA is inherited matrilineally, this implies that all intercrosses occurred between Neandertal males and AMH females. Another possibility is that crosses between AMH males and Neandertal females were either extremely rare or produced such unfavorable traits, via mitonuclear incompatibility (8), that none of their descendants left marks in present day human populations.
To investigate the proposed absence of Neandertal mitochondrial contribution in extant humans we compared mitochondrial genomes of extant human mitochondrial haplogroups, ancient AMH Homo sapiens and Neandertals. The whole mitochondrial genome alignment dataset revealed 918 polymorphic positions within the 16,565bp mtDNA. Within these 918 positions, 75 contained variants that were identical between present day humans and Neandertals to the exclusion of ancient AMH (Fig. 1). There are 175 positions in which 1 or more Neandertals are different from all other sequences. In 4 positions the modern humans are identical to Neandertals although there are 1 or 2 Ancient H. sapiens sequences identical to Neandertals. There are 11 positions in which 1 or more Ancient H. sapiens are different from all other sequences. There are 10 positions in which 1 to 3 Ancient H. sapiens are identical to several modern humans.
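The position classes tallied above can be illustrated with a minimal sketch. It assumes a toy alignment already split into three groups (modern, ancient, Neandertal); the function names `classify_site` and `tally_sites`, the class labels and the toy sequences are hypothetical, and the sketch is not the Geneious workflow used for the actual analysis. A site is counted as a Neandertal signature when at least one allele is carried by both modern humans and Neandertals but by none of the ancient H. sapiens sequences.

```python
from collections import Counter

def classify_site(modern, ancient, neandertal):
    """Classify one alignment column from the bases seen in each group."""
    m, a, n = set(modern), set(ancient), set(neandertal)
    if len(m | a | n) == 1:
        return "invariant"
    if (m & n) - a:
        # an allele shared by modern humans and Neandertals, absent from ancient H. sapiens
        return "neandertal_signature"
    if n - (m | a):
        return "neandertal_only"
    if a - (m | n):
        return "ancient_only"
    return "other_polymorphic"

def tally_sites(modern, ancient, neandertal):
    """Count site classes over equal-length aligned sequences."""
    counts = Counter()
    for i in range(len(modern[0])):
        label = classify_site(
            [s[i] for s in modern],
            [s[i] for s in ancient],
            [s[i] for s in neandertal],
        )
        counts[label] += 1
    return counts

# Toy data (hypothetical): one column of each class, in order.
modern = ["ACGTA", "ATGTG"]
ancient = ["ACGTA", "ACGCA"]
neandertal = ["ATATA", "ATATA"]
counts = tally_sites(modern, ancient, neandertal)
print(dict(counts))
```

Gap handling and ambiguous bases are deliberately left out; a real tally would need an explicit rule for both.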
Single position cladograms of five representative polymorphic positions depict this pattern, which represents 10% of all polymorphic positions in the human mitochondrial genome. Here we show the pattern of position 2,706 (16S rRNA) (Fig. 1). The patterns of positions 1,018 (12S rRNA gene), 3,010 (16S rRNA gene), 3,594 (NADH dehydrogenase 1 - ND1), 5,460 (NADH dehydrogenase 2 - ND2) and 16,519 in the D-loop are in the Supplementary data (figs. S1-S5). It has been reported that mtDNA positions 3,010 and 5,460 contain a G to A transition associated with Cyclic Vomiting Syndrome (9–11) and Alzheimer’s & Parkinson’s Diseases (12,13) respectively (Table 1). It can be observed that the distribution of the clusters of modern human haplogroups with Neandertals varies among human haplogroups (Fig. 1).
"Haplogroup L3 is more related to Eurasian haplogroups than to the most divergent African clusters L1 and L2" (14). L3 is the haplogroup from which all modern humans outside of Africa derive. The distribution of Neandertal specific signatures along the mitochondrial genome is compiled in Fig. 2. Columns represent each individual gene and rows correspond to mitochondrial haplogroups. Intergenic regions, with the exception of the Control Region, or D-loop, are not depicted because they are virtually absent in the human mitochondrial genome. Apart from the D-loop there is only one intergenic region, a 24bp segment between the end of COX2 and the start of tRNA-Lys. This analysis reveals that the Neandertal specific signatures are most frequent in the African haplogroups (L0, L1, L2, L3, L4, L5 and L6), then in South East Asians, Native Australians (M), Native Americans (C) and lastly in Europeans (U, X, W, K and H). Unexpectedly, this pattern is the opposite of that observed in the nuclear genomes. The Neandertal signatures are more frequent in the highly divergent D-loop and, among coding regions, in NADH dehydrogenase subunit 4 (ND4) and Cytochrome B (CYTB) (Fig. 2). The 3’ half of the genome contains significantly more Neandertal signatures than the 5’ half. The central region of the mtDNA corresponds to a breakage-repair point where deletions occur (15). Thus, the Neandertal signature 2706G is shared with modern European haplogroups U5a7a2, H1a1, H3, H15 and H2a2a1 (rCRS) but not with any of the Ancient H. sapiens sequences or the other modern haplogroups analyzed here.
Principal component analysis (PCA) of the whole mitochondrial genome shows four clusters: (1) the modern haplogroups including the ancient H. sapiens (table S2), (2) the L haplogroup cluster, (3) the Neandertal Altai-Mezmaiskaya L-like cluster (table S3) and (4) the Neandertal group (Fig. 3). However, the PCA of the segment corresponding to the ribosomal RNA gene proximal half produces a pattern that brings the L haplogroup cluster close to the Neandertal Altai-Mezmaiskaya L-like cluster, suggesting the introgression point. The PCA of the ribosomal RNA gene distal half suggests an opposite pattern, with the L cluster closer to Neandertal although not as close as shown in Fig. 3B.
Several Neandertal signatures are associated with disease, in particular the 15,043 G>A transition, which is associated with major depression, a trait associated with Neandertal introgression in modern humans (16) (Table 1). Five Neandertal signatures correspond to SNPs associated with cyclic vomiting syndrome with migraine, a condition known for its maternal inheritance (17) although never associated with Neandertal introgression (2) (Table 1). Also, among 21 SNPs belonging to the MspI mutation associated with Caucasians (18), three are Neandertal signatures (table S4). These mitochondrial genome SNPs, at positions 16,189, 16,278 and 16,298, have been associated with variation in intelligence quotient (IQ) and are the same SNPs found in Neandertals (table S4).
It can be argued that the Neandertal signatures are in fact character states conserved since the last common ancestor of Neandertals and present day Homo sapiens (e.g. Homo erectus), but this would not be consistent with the absence of these signatures in ancient H. sapiens mitochondrial genomes (Fig. 1). Alternatively, the Neandertal signatures described here could be a consequence of random events. Because the random chance of the same nucleotide in a given position is 0.25 and there are 918 polymorphic positions of which 75 are Neandertal specific, the observed fraction would roughly fit the 0.08 chance expectation. However, the chance that the 75 positions are simultaneously identical by chance alone would be roughly 0.25^75, which is far less than observed here.
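The two figures in this argument can be checked directly. The sketch below only reproduces the arithmetic stated above: 75 of 918 polymorphic positions, and a per-site match probability of 0.25 raised to the power 75, which treats the sites as independent (the simplifying assumption of the text). Variable names are illustrative.

```python
n_polymorphic = 918   # polymorphic positions in the alignment
n_signatures = 75     # positions carrying Neandertal signatures
p_match = 0.25        # chance of a given nucleotide at one position, uniform model

# Fraction of polymorphic positions that are Neandertal signatures.
fraction_signatures = n_signatures / n_polymorphic

# Joint probability that all 75 positions match by chance (independence assumed).
joint_probability = p_match ** n_signatures

print(f"fraction of polymorphic positions: {fraction_signatures:.3f}")
print(f"0.25^75 = {joint_probability:.2e}")
```

The fraction comes out near 0.08, and the joint probability is on the order of 10^-46.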
A back to Africa hypothesis has been proposed in which humans from Eurasia returned to Africa and impacted a wide range of sub-Saharan populations (19). Our data shows that Neandertal signatures are present in all major African haplogroups thus confirming that the Back to Africa contribution to the modern mitochondrial African pool was extensive.
Our observations suggest that crosses between AMH males and Neandertal females left significantly fewer descendants than the reverse crosses (Neandertal males and AMH females), which seem to be the dominant pattern. Although it is generally accepted that recombination does not occur in the human mitochondrial genome, there is a controversy over reported evidence of mitochondrial recombination (20, 21). A scenario with complete absence of recombination makes it difficult to explain how the human mitochondrial genome would escape Muller's ratchet and therefore avoid its predicted “genetic meltdown” (22). It has been shown that even minimal recombination is sufficient to allow escape from Muller's ratchet (23) and this could be the case for the human mitochondrial genome. We tested potential recombination in our dataset using bootscan (fig. S6). The bootscan analysis indicates that there are potential recombination points. Upon deeper analysis we observed that bootscan considers the Neandertal specific signatures, such as those in L haplogroups, as recombination points. Although the bootscan putative recombination segments are above the bootstrap threshold, we do not consider this definitive evidence of recombination since the segments between the Neandertal signatures are almost identical. Bootscan analysis did exclude Human-Neandertal recombination in the rCRS sequence (fig. S7). Sensitivity of bootscan to substitution models and alignment methods was assessed by comparing the same set of query and parental sequences with different parameters (fig. S8), revealing minor profile alterations. The alignment parameters are not so critical in this case because the sequences are extremely conserved (918 polymorphic positions in 16,565bp). Although indels are present in the alignments, 99% are located near the H promoter in the D-loop region. These are automatically excluded in phylogeny inference algorithms and therefore have no weight in bootscan results.
The "positional homology" is therefore solid, particularly in coding domains and regions without repeats in non-coding domains. The Neandertal signatures are in unambiguously aligned segments.
Our data is compatible with a scenario in which the AMH-Neandertal crosses occurred in European, East Asian and African lines of descent. However, in the African haplogroups the crosses between AMH males and Neandertal females would have a higher frequency than in European lines of descent, where the reverse crosses would be predominant. Based on the comparison of Neandertal signatures in nuclear and mitochondrial genome haplogroups we hypothesize that the African lines of descent would have a higher female Neandertal contribution whereas European lines of descent would have a higher male Neandertal contribution. The fact that AMH and Neandertals crossed and produced fertile descendants is evidence that they belong to the same species (4) and thus indicates that Homo sapiens emerged independently in Africa, Europe and Asia (24). The intercrosses of these three Homo sapiens subgroups, and other even deeper ancestors such as Denisovans, in their different proportions and specific signatures, produced the extant human genomes.
Analyses presented here suggest that Neandertal genomic signatures might have been a product of rare mtDNA recombination events. Although there is evidence supporting mtDNA recombination, its weight in phylogenies remains controversial. Some authors contend that, because of the high mutation rate of mtDNA, reverse compensatory mutations can be confounded with recombination. Our data supports a mtDNA recombination scenario in which recombination events are extremely rare, thus producing a small number of Neandertal signatures.
Materials and Methods
Comparative analysis of the mitochondrial DNA from present day humans, ancient Homo sapiens and Neandertals. Fifty-two sequences of modern human mtDNA, representing all major mitochondrial haplogroups (table S1), were selected from the PhyloTREE database (25) and downloaded from GenBank. Six ancient H. sapiens mtDNA and eight Neandertal mtDNA sequences were downloaded from GenBank (tables S1-S3). The Ust-Ishim sequence was assembled from reads downloaded from Study PRJEB6622 at the European Nucleotide Archive (EMBL-EBI) using the CLC Genomics Workbench 7 program (https://www.qiagenbioinformatics.com). To maintain the reference numbering, sequences were aligned to the revised Cambridge Reference Sequence (rCRS; GenBank accession number NC_012920) (26), for a total of 68 sequences, using the map to reference option implemented in the Geneious 10 program (27). Variants were called using the Geneious 10 program. A total of 918 polymorphic positions were found. Of these, 75 were present in both Neandertals and present day humans and 4 in Neandertals, ancient and modern humans. 175 polymorphic positions were exclusive to Neandertal sequences and 11 were present only in ancient human sequences. The remaining changes were present only in modern humans. Variants present in either Neandertals and modern humans or Neandertals, ancient and modern humans (79 positions) were screened for disease associations at MitoMap (http://www.mitomap.org/MITOMAP) (28).
Position specific similarities between modern haplogroups and Neandertals were depicted by cladograms for each of the 75 single variant positions present only in Neandertals and modern humans and excluding Ancient H. sapiens. The cladograms were generated using the parsimony heuristic search implemented in PAUP v4.1a152 with the default parameters (29). Proximity of mitochondrial haplogroups in Ancient H. sapiens and Neandertals was inferred using Haplogrep 2.1.0 (30).
Potential recombination between Neandertals and ancient H. sapiens sequences was inferred by a phylogeny-based method implemented by manual bootscan in the Recombination Detection Program (RDP) v.4.87. Parameters for bootscan analysis were: window size = 200; step size = 20; bootstrap replicates = 1,000; cutoff percentage = 70; use neighbor joining trees; calculate binomial p-value; model option = Kimura 1980 (31). For each analysis, a single alignment was created which included the modern haplogroup, all 9 Neandertal and all 6 Ancient H. sapiens sequences. When rCRS was used as query, two sets of possible parental sequences were selected: either Neandertals Mezmaiskaya and Altai and ancient H. sapiens Fumane and Ust-Ishim, or only Neandertals Feldhofer1, Mezmaiskaya and Vindija 33.16. For haplogroups L0d1a and L3d3b possible parental sequences were Neandertals Feldhofer1, Mezmaiskaya and Vindija 33.16 and Ancient H. sapiens Kostenki 14, Fumane, Dolni Vestonice 14 and Tianyuan. For haplogroups M29a and R0a possible parental sequences were Neandertals Mezmaiskaya and Altai and Ancient H. sapiens Kostenki 14 and Dolni Vestonice 14. For haplogroup N1b1a3 possible parental sequences were Neandertals Feldhofer1 and Vindija 33.16 and Ancient H. sapiens Kostenki 14 and Dolni Vestonice 14.
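To make the window parameters concrete, the sketch below enumerates sliding windows of 200 positions advanced in steps of 20 over an alignment of 16,565 columns. It is an illustration only: the helper `sliding_windows` is hypothetical, the alignment is treated as linear rather than circular, and the sketch does not reproduce the tree building or bootstrap steps that RDP performs within each window.

```python
def sliding_windows(alignment_length, window_size=200, step_size=20):
    """Return (start, end) pairs, 0-based and end-exclusive, for full-length windows."""
    last_start = alignment_length - window_size
    return [(s, s + window_size) for s in range(0, last_start + 1, step_size)]

# Alignment length taken from the text (16,565 columns).
windows = sliding_windows(16565)
print(len(windows), windows[0], windows[-1])
```

Under these assumptions each window would receive its own bootstrapped neighbor joining tree, and the 70% cutoff would be applied to the bootstrap support obtained per window.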
Variant calling and Principal Component Analysis of mitogenomes. Three different datasets were used for variant calling: (1) the whole mitogenome from the 68 sequence alignment; (2) the 128 to 315bp fragment; and (3) the 6,950 to 7,660bp fragment of the same alignment. All fasta alignments were processed using the MSA2VCF software to generate the VCF files (32). The options used with msa2vcf were: --haploid --output. To convert the VCF files to Plink format we used the vcftools package (33). The whole mitogenome alignment with 68 sequences had 785 SNPs (positions containing gaps in at least one sequence were excluded from the analysis). The 128-315 and 6,950-7,660 fragments each had 24 SNPs.
Principal component analysis was performed using the PLINK software v1.90b4 (34). PCA figure plotting was made using Genesis PCA and admixture plot viewer (http://www.bioinf.wits.ac.za/software/genesis/). The first two principal components were chosen for the Neandertal - H. sapiens comparison.
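For readers unfamiliar with the computation, the projection of sequences onto the first two principal components can be sketched in plain Python as below. This is not the PLINK implementation used here and it omits PLINK's variance standardization; it uses power iteration with deflation on the sample-by-sample matrix only to stay dependency-free. The function names and the toy matrix (rows are sequences, columns are SNPs coded 0 for the reference allele and 1 for the alternative) are assumptions made for illustration.

```python
def center_columns(matrix):
    """Subtract each SNP column's mean."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]

def gram_matrix(x):
    """Sample-by-sample matrix of dot products (X X^T)."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in x] for r1 in x]

def top_eigenpair(g, iterations=500):
    """Leading eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    n = len(g)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0:
            break
        v = [c / norm for c in w]
    eigenvalue = sum(v[i] * sum(g[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v

def pca_first_two(genotypes):
    """Return one (PC1, PC2) coordinate pair per sample."""
    g = gram_matrix(center_columns(genotypes))
    n = len(g)
    scores = []
    for _ in range(2):
        lam, v = top_eigenpair(g)
        lam = max(lam, 0.0)
        scores.append([lam ** 0.5 * c for c in v])  # sample scores on this component
        # Deflate: remove the component just found before extracting the next one.
        g = [[g[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return list(zip(*scores))

# Toy genotypes (hypothetical): samples 0-1 and samples 2-3 differ at four SNPs.
toy = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0],
]
coords = pca_first_two(toy)
for sample, (pc1, pc2) in enumerate(coords):
    print(sample, round(pc1, 3), round(pc2, 3))
```

In the toy matrix the first component separates the two pairs of samples, which is the kind of clustering read from the plots; the sign of each component is arbitrary.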
Acknowledgements
The authors thank Prof. Dajiang Liu (Dept. Public Health Sciences, Penn State College of Medicine) and Prof. Rongling Wu (Center for Statistical Genetics, Penn State University) for critical reading of the manuscript. This work was supported by grants to M.R.S.B. from FAPESP, Brazil (2013/07838-0 and 2014/25602-6) and CNPq, Brazil (303905/2013-1). R.C.F received a CNPq postdoctoral fellowship (206445/2014-8) and C.R.R. a CAPES (Brazil) MSc fellowship.