Abstract
Farming was established in Central Europe by the Linearbandkeramik culture (LBK), a well-investigated archaeological horizon, which emerged in the Carpathian Basin, in today’s Hungary. However, the genetic background of the LBK genesis has not been revealed yet. Here we present 9 Y chromosomal and 84 mitochondrial DNA profiles from Mesolithic, Neolithic Starčevo and LBK sites (7th/6th millennium BC) from the Carpathian Basin and south-eastern Europe. We detect genetic continuity of both maternal and paternal elements during the initial spread of agriculture, and confirm the substantial genetic impact of early farming south-eastern European and Carpathian Basin cultures on Central European populations of the 6th-4th millennium BC. Our comprehensive Y chromosomal and mitochondrial DNA population genetic analyses demonstrate a clear affinity of the early farmers to the modern Near East and Caucasus, tracing the expansion from that region through south-eastern Europe and the Carpathian Basin into Central Europe. Our results also reveal contrasting patterns for male and female genetic diversity in the European Neolithic, suggesting a patrilineal descent system and patrilocal residential rules among the early farmers.
Author Summary
We report an exceptionally large Neolithic DNA dataset from the Carpathian Basin, which was the cradle of the first Central European farming culture, the so-called Linearbandkeramik culture. We generated 9 Y chromosomal and 84 mitochondrial DNA profiles from Mesolithic and Neolithic specimens from western Hungary and Croatia, attributed to the hunter-gatherers, Starčevo and LBK cultures (7th/6th millennium BC). We observe genetic discontinuity between Mesolithic foragers and early farmers, and genetic continuity between farming populations of the 6th-4th millennium BC across a vast territory of south-eastern and Central Europe. Nine novel Y chromosome DNA profiles offer first insights into the Y chromosome diversity of the earliest European farmers, and further support the migration (demic diffusion) from the Near East into Central Europe along the Continental route of Neolithisation. The joint analyses of the two uniparental genetic systems let us conclude that men and women had similar roles in the Early Neolithic migration process, but that their dispersal patterns were determined by sex-specific rules.
Introduction
Agriculture was first established in the Near Eastern Fertile Crescent after 10,000 BC and expanded from the Levant and Anatolia to south-eastern Europe [1]. Archaeological research has described the subsequent spread of Neolithic farming into (and throughout) Central and south-western Europe along two major and largely contemporaneous routes. On the Continental route, the Carpathian Basin connected south-eastern Europe to the Central European loess plains, while the Mediterranean route bridged the eastern and western Mediterranean coasts, introducing farming to the Iberian Peninsula in the far West [2–6].
On the Continental route, the Early Neolithic Starčevo culture (STA) played a major role in the Neolithisation of south-eastern Europe. The STA expanded from present-day Serbia to the western part of the Carpathian Basin, encompassing the regions of today’s northern Croatia and south-western Hungary (ca. 6,000-5,400 BC) [7,8] (Figure 1), and resulting in the formation of the Linearbandkeramik culture (LBK) [9]. The earliest LBK emerged in the mid-6th millennium BC in Transdanubia (called “LBK in Transdanubia”, or LBKT) [9], marking the beginning of sedentary life in western Hungary and, via this area, Central Europe. The earliest LBKT coexisted with the STA in Transdanubia for about 100-150 years [10]. Archaeological research described an interaction zone between indigenous hunter-gatherer groups and farmers at the northernmost extent of the STA in Transdanubia, which might have led to the genesis of the LBKT [10,11]. After its formative phase in western Hungary, the LBK spread rapidly to Central Europe, reaching central Germany around 5,500 BC [2,12]. In the following 500 years, the LBK continually expanded, eventually covering a vast geographic area from the Paris Basin to Ukraine in its latest phase [2,13], and persisted in Transdanubia until ∼4,900 BC (Figure 1).
Despite the well-established archaeological relations between the STA and LBKT in the Carpathian Basin and the LBK in Central Europe, their genetic relationship has hitherto been unknown. Traditionally, scholars have explained the Neolithic transition either as an expansion of early farmers from the Near East, who brought new ideas as well as new genes (demic diffusion) [14–17], or as an adoption of farming technologies by indigenous hunter-gatherer populations with little or no genetic influence (cultural diffusion) [18–21]. These two contrasting models have been merged into complex integrationist approaches, considering small-scale population movements on regional levels [1,2,10,22].
Inferences drawn from genetic studies based on present-day data have yielded contradictory results about the Neolithic impact on the genetic diversity of modern Europeans, showing a disparity between mitochondrial DNA (mtDNA) and Y chromosomal patterns. Several Y chromosome studies supported the Neolithic demic diffusion model [17,23,24], while most mtDNA and some Y chromosomal studies have proposed a continuity of Upper Palaeolithic lineages [20,21,25,26]. The contrasting mtDNA and Y chromosomal evidence has been explained by differences in evolutionary scenarios, such as sex-biased migration [27].
Recent ancient DNA (aDNA) studies have provided direct insights into the mtDNA and autosomal diversity of hunter-gatherers in Europe [28–33] and the Central European LBK [33–37], describing a clear genetic discontinuity between local foragers and early farmers [28,31,36]. Comparative analyses with present-day populations have revealed Near Eastern affinities of the mitochondrial LBK ancestry, supporting the demic diffusion model and population replacement at the beginning of the Neolithic period [36,37]. Data on Y chromosomal diversity in Neolithic Europe are still scarce. Besides the recently described first Mesolithic and Neolithic hunter-gatherer Y chromosomal data [33,38], Y chromosome data have been reported from a few LBK samples [36], the Tyrolean Iceman [39], the southwest European Neolithic [40,41], and Late Neolithic Central Germany [42,43].
The postulated Near Eastern origin of Central Europe’s LBK farmers has so far only been inferred from modern-day population data. The first ancient mtDNA data from early Near Eastern farmers have been reported recently [44]; however, the genetic diversity in the vast territory from the Fertile Crescent to Central Europe has been largely unexplored. Consequently, our aims were to i) study the genetic diversity of the early farming Carpathian Basin cultures from both the mtDNA and Y chromosome perspectives, ii) examine whether men and women had different demographic histories, iii) investigate the contribution of the STA to the genetic variability of the LBKT and LBK, iv) reveal the potential genetic origins of the first farmers in Eurasia, and v) assess the role of the Continental route in the European Neolithic dispersal.
In this study we present 84 mtDNA and 9 Y chromosomal DNA profiles from Mesolithic (6,200-6,000 BC) and Neolithic specimens of the STA and LBKT from western Hungary and Croatia, spanning ∼900 years (ca. 5,800-4,900 BC) of the Neolithic period. The population genetic analysis allowed detailed insight into the role of archaeological cultures from the Carpathian Basin in the spread of farming from the Near East.
Results
Mitochondrial DNA
Using well-established aDNA methods (Material and Methods), we genotyped mtDNA variability by sequencing the hypervariable segments I and II (HVS-I/II) and 22 single nucleotide polymorphisms (SNPs) in the coding region of the mitochondrial genome [36]. Overall, we investigated 109 skeletons from one Mesolithic, six STA and eight LBKT sites from western Hungary and Croatia (Figure 1, Dataset S1-S2). We successfully genotyped endogenous HVS-I sequences of 84 individuals (hunter-gatherer=1, STA=44, and LBKT=39) yielding a success rate of 76% (Dataset S3). We also sequenced parts of HVS-II from 25 individuals with consistent and identical HVS-I motifs in order to increase the phylogenetic resolution and to detect potential intra-site maternal kinship. The analysis of haplogroup-defining coding region SNPs provided reproducible profiles for 96 individuals, with a success rate of 86% (Dataset S3-S4).
The haplotype of the Mesolithic skeleton from the Croatian island of Korčula belongs to the mtDNA haplogroup U5b2a5 (Dataset S3). The sub-haplogroup U5b has been shown to be frequent in pre-Neolithic hunter-gatherer communities across Europe [28–30,32,33,45,46]. Contrary to the low mtDNA diversity reported from hunter-gatherers of Central/North Europe [28–30], we identify substantially higher variability in early farming communities of the Carpathian Basin, including the haplogroups N1a, T1, T2, J, K, H, HV, V, W, X, U2, U3, U4, and U5a (Table 1). Previous studies have shown that haplogroups N1a, T2, J, K, HV, V, W and X are most characteristic of the Central European LBK and have described these haplogroups as the mitochondrial ‘Neolithic package’ that had reached Central Europe in the 6th millennium BC [36,37]. Interestingly, most of these haplogroups show comparable frequencies between the STA, LBKT and LBK, comprising the majority of mtDNA variation in each culture (STA=86.36%, LBKT=61.54%, LBK=79.63%). In contrast, hunter-gatherer haplogroups are rare in the STA and both LBK groups (Table 1). Besides similar haplogroup compositions, we also found comparable haplotype diversity values for each culture (STA=0.97674, LBKT=0.95277, LBK=0.95483).
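To make the haplotype diversity values above easier to interpret, the following is a minimal sketch of how such a value can be computed. It assumes the commonly used unbiased estimator H = n/(n-1) · (1 - Σp_i²); the text does not state which estimator was applied, and the function name and the haplotype labels are hypothetical placeholders rather than data from this study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies).

    `haplotypes` is a list with one haplotype label per sampled individual.
    Assumption: this standard estimator stands in for whatever software
    setting was used in the study.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: five individuals, one haplotype observed twice.
sample = ["h1", "h1", "h2", "h3", "h4"]
print(round(haplotype_diversity(sample), 4))
```

Values close to 1, as reported for the STA, LBKT and LBK, mean that two randomly drawn individuals almost always carry different haplotypes.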
In order to evaluate whether the haplogroup and haplotype composition of the STA, LBKT, LBK [34–37] and hunter-gatherers from Central/North Europe [28–30] differ significantly from each other, we performed a haplogroup-based Fisher’s exact test and a sequence-based genetic distance analysis. In addition, we used the test of population continuity (TPC) [37] to elucidate whether the observed differences can be best explained by genetic drift or by other factors such as migration. These analyses reveal that the mtDNA composition of the Early Neolithic cultures is significantly different from that of the hunter-gatherers, both on the haplogroup (p=0.0001) and haplotype level (Fst=0.17989-0.18810, p=0.0000) (Table 1), indicating genetic discontinuity of maternal elements at the advent of farming in the Carpathian Basin, as has been reported previously from Central Europe [28,36,37]. The TPC shows that, independent of the tested effective population size, the transition from foraging to farming cannot be explained by genetic drift alone (p<0.000001, Dataset S11). More importantly, non-significant differences between the haplogroup (p=0.06829-0.5574) and haplotype composition (Fst=-0.00518-0.01343, p=0.21072-0.60608) of the STA and the LBK groups from Transdanubia and Central Europe (Table 1) support a rather homogeneous mtDNA signature of early farming communities from both regions. The TPC also supports the scenario of population continuity during the Neolithic period, showing no significant p values among the pairwise compared Neolithic cultures (p>0.177 with all tested effective population sizes, Dataset S11).
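The logic of the haplogroup-level comparison can be illustrated with a small sketch. Note that this is not Fisher’s exact test as used in the study: it is a Monte Carlo permutation test on a chi-square statistic, swapped in here because it handles tables with many haplogroups using only the standard library. All function names, parameters and sample labels are hypothetical.

```python
import random
from collections import Counter


def chi_square_stat(group_a, group_b):
    """Pearson chi-square statistic for the haplogroup labels of two groups."""
    ca, cb = Counter(group_a), Counter(group_b)
    na, nb = len(group_a), len(group_b)
    total = na + nb
    stat = 0.0
    for hg in set(ca) | set(cb):
        row = ca[hg] + cb[hg]
        for observed, size in ((ca[hg], na), (cb[hg], nb)):
            expected = row * size / total
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Share of random relabellings at least as extreme as the observed split.

    Assumption: a permutation test is used as a stand-in for an exact test.
    """
    rng = random.Random(seed)
    observed = chi_square_stat(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    na = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point ties.
        if chi_square_stat(pooled[:na], pooled[na:]) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical labels: a forager-like and a farmer-like sample.
foragers = ["U5"] * 18 + ["U4"] * 2
farmers = ["K"] * 8 + ["T2"] * 6 + ["N1a"] * 4 + ["U5"] * 2
print(permutation_p_value(foragers, farmers))
```

A small p value indicates that the two groups differ more in haplogroup composition than random reassignment of individuals would produce, which is the sense in which the hunter-gatherer and early farmer samples are described as significantly different.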
We combined our Neolithic samples from the Carpathian Basin with 487 published mtDNA data from Upper Palaeolithic and Mesolithic [28–30,32,45,46], Neolithic [34–37,40,42,43,45–47] and Early Bronze Age [37] sites across Europe (Dataset S6) and conducted principal component analysis (PCA), multidimensional scaling (MDS), analysis of molecular variance (AMOVA) and shared haplotype analysis to compare the mtDNA variability of the STA and LBKT in a broader geographical and chronological context (Material and Methods).
PCA and MDS show that the mtDNA makeup of the STA and LBKT is strikingly similar to the LBK [34–37] and to subsequent cultures of the 5th/4th millennium BC in Central Europe [37] (Figure 2-S1, Dataset S7-S8). This is predominantly based on a high number of ‘Neolithic package’ lineages and low frequencies of haplogroups attributed to hunter-gatherers, which clearly distinguish this cluster from hunter-gatherers of Central/North [28–30] and southwest Europe [32,45,46], but also from Neolithic Iberian populations and Central European cultures of the 3rd/2nd millennium BC (Figure 2). In order to exclude biases induced by potential maternal kinship within the prehistoric datasets, we also performed PCA and MDS with a reduced dataset (*), in which redundant haplotypes with identical HVS-I and II sequences from the same site were omitted. The reduced datasets have similar locations on the plots to the complete datasets, indicating that the effect of maternal kinship is negligible.
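The construction of the reduced dataset can be sketched as a simple de-duplication step. This is an illustration under stated assumptions: the record fields (`site`, `hvs1`, `hvs2`) and the function name are hypothetical, and a missing HVS-II sequence is treated like any other value, which may differ from how the study handled individuals without HVS-II data.

```python
def reduce_dataset(records):
    """Keep one individual per (site, HVS-I, HVS-II) combination.

    `records` is a list of dicts with hypothetical keys 'site', 'hvs1', 'hvs2'.
    Identical haplotypes from different sites are retained, since only
    within-site redundancy is taken as a sign of possible maternal kinship.
    """
    seen = set()
    reduced = []
    for rec in records:
        key = (rec["site"], rec["hvs1"], rec["hvs2"])
        if key not in seen:
            seen.add(key)
            reduced.append(rec)
    return reduced


# Hypothetical records, not data from this study.
records = [
    {"id": "A1", "site": "SiteA", "hvs1": "16224C 16311C", "hvs2": "73G"},
    {"id": "A2", "site": "SiteA", "hvs1": "16224C 16311C", "hvs2": "73G"},
    {"id": "B1", "site": "SiteB", "hvs1": "16224C 16311C", "hvs2": "73G"},
]
print([r["id"] for r in reduce_dataset(records)])
```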
We used AMOVA to evaluate whether the observed affinities of STA and LBKT with the LBK and 5th/4th millennium BC cultures from Central Europe are the result of a shared population structure. We pooled HVS-I sequences from the STA and LBKT and nine archaeological cultures from Central Europe ranging from the LBK to the Early Bronze Age [34–37,42,43] into different groups, and tested 82 different arrangements to identify the constellation with the highest among-group variance and simultaneously with low variation within the groups (Dataset S9). The highest among-group variance was observed when STA and LBKT were arranged in one group with the Central European LBK and with all 5th/4th millennium BC cultures, while the 3rd/2nd millennium BC cultures were separated in a second group (among-group variation=3.50%, Fst=0.03501, p=0.00396; within-group variation=0.20%, Fst=0.00203, p=0.31139, Dataset S9). These results suggest a common genetic structure of the 6th-4th millennium BC cultures.
We used shared haplotype analysis [48] and modified this approach by accounting for the temporal succession of cultures (ancestral shared haplotype analysis -ASHA). This enabled us to ascribe mtDNA lineages to particular cultures or time periods according to their first appearance in the dataset in chronological order (Figure 3, Dataset S10), and to estimate the amount of ancestral lineages in each culture, potentially derived from hunter-gatherers, STA, LBKT, LBK or other subsequent cultures. The ASHA shows that ancestral hunter-gatherer lineages were rare in the STA (2.27%), LBKT (0%) and LBK (1.85%) as well as in 5th/4th millennium BC cultures (0%) and became more common in Central Europe during the 3rd/2nd millennium BC (2.86-11.76%) [37]. In contrast, we identified a high degree of ancestral STA lineages in all subsequent cultures (LBKT=61.54%, LBK=55.56%, 5th/4th millennium BC=36.84-63.64%, 3rd/2nd millennium BC=36.17-43.18%). The subsequent LBKT reveals a smaller distinctive influence on its successors, since only 12.96% of the LBK, 0-10.53% of the 5th/4th millennium BC, and 0-3.19% of the 3rd/2nd millennium BC cultures can be traced back to ancestral lineages first observed in the LBKT. The number of new ‘ancestral’ lineages is even lower in the LBK of Central Europe, with no effect on the 3rd/2nd millennium BC cultures.
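The bookkeeping behind the ASHA can be sketched as follows. This is a simplified illustration, assuming that each haplotype is credited to the earliest culture in which it appears and that percentages are taken over all individuals of a culture; the function name and the toy haplotypes are hypothetical, and the published analysis may include further rules not reproduced here.

```python
def ancestral_shared_haplotypes(cultures):
    """Attribute each lineage to the culture where it is first observed.

    `cultures` is a list of (name, haplotypes) pairs in chronological order,
    with one haplotype label per individual. Returns, for every culture, the
    percentage of its individuals whose haplotype first appeared in each
    earlier (or the same) culture.
    """
    first_seen = {}
    result = {}
    for name, haplotypes in cultures:
        if not haplotypes:
            raise ValueError(f"culture {name!r} has no haplotypes")
        # Register lineages that are new at this point in the sequence.
        for h in haplotypes:
            first_seen.setdefault(h, name)
        shares = {}
        for h in haplotypes:
            origin = first_seen[h]
            shares[origin] = shares.get(origin, 0) + 1
        n = len(haplotypes)
        result[name] = {origin: 100.0 * c / n for origin, c in shares.items()}
    return result


# Hypothetical toy data in chronological order.
toy = [
    ("HG", ["u5a", "u5b"]),
    ("STA", ["k1", "t2", "u5a", "k1"]),
    ("LBK", ["k1", "t2", "n1", "k1", "j1"]),
]
for culture, shares in ancestral_shared_haplotypes(toy).items():
    print(culture, shares)
```

In this toy example, three of the five LBK individuals carry lineages first seen in the STA, which corresponds to the kind of percentage reported above for ancestral STA lineages in later cultures.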
In order to identify affinities of our Neolithic datasets with present-day populations, we collated 67,996 published HVS-I sequences from Eurasian populations and conducted PCA and genetic distance mapping (Material and Methods).
The PCA shows that the frequencies of N1a, T1, T2, K, J and HV, and the absence of Asian and African lineages in the Carpathian Basin cultures, cause a clustering of the STA with populations of the Near East and the Caucasus, while the LBKT falls between the latter and populations from South and southeast Europe (Greeks, Bulgarians and Italians), which is caused by a higher frequency of haplogroup H in the LBKT (Figure S2, Dataset S12). However, the dominant frequencies of haplogroups N1a, T2, and K in the STA and LBKT result in a differentiation from all present-day populations along the third component.
Sequence-based genetic distance maps are largely consistent with PCA and reveal the greatest similarities of the STA to populations of the Near East (Iraq, Syria) and the Caucasus (Azerbaijan, Georgia, Armenia), as well as some European populations, such as Italy, Austria, Romania, and Macedonia (Figure 3a, Dataset S13). The distance map of the LBKT displays affinities that are overall similar to the STA, which includes populations from Azerbaijan, Syria, and Iraq. We also observe similarities to present-day Europeans, such as the populations of Great Britain, Portugal, Romania, Crete, and Russia (Figure 3b, Dataset S13). These similarity peaks are likely explained by elevated frequencies of shared lineages due to shared genetic drift in modern-day populations.
Y chromosomal DNA
We also analysed the non-recombining part of the Y chromosome (NRY) in the investigated samples, using multiplex [36] and singleplex approaches, targeting 33 haplogroup-defining SNPs. We successfully generated unambiguous NRY SNP profiles for nine male individuals (STA=7, LBKT=2) (Dataset S3, S5). Three STA individuals belong to the NRY haplogroup F* (M89), two can be assigned to haplogroup G2a2b (S126), and one each to G2a (P15) and I2a1 (P37.2) (Dataset S3, S5). The two investigated LBKT samples carry haplogroups G2a2b (S126) and I1 (M253). Furthermore, the incomplete SNP profiles of eight specimens potentially belong to the same haplogroups; STA: three G2a2b (S126), two G2a (P15), and one I (M170); LBKT: one G2a2b (S126) and one F* (M89) (Dataset S5).
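How a haplogroup label follows from a SNP profile can be illustrated with a minimal sketch restricted to the six markers named above. The marker-to-haplogroup pairs are taken from the text; the parent links form a deliberately simplified tree (intermediate nodes are omitted), and the function names are hypothetical. It does not represent the 33-SNP typing scheme of the study.

```python
# marker -> (haplogroup label, parent marker).
# Assumption: simplified tree; intermediate branches are left out.
TREE = {
    "M89": ("F*", None),
    "P15": ("G2a", "M89"),
    "S126": ("G2a2b", "P15"),
    "M170": ("I", "M89"),
    "P37.2": ("I2a1", "M170"),
    "M253": ("I1", "M170"),
}


def depth(marker):
    """Number of steps from a marker up to the root of the simplified tree."""
    d = 0
    while TREE[marker][1] is not None:
        marker = TREE[marker][1]
        d += 1
    return d


def assign_haplogroup(derived):
    """Return the most downstream haplogroup among the derived markers.

    `derived` is a set of marker names typed as derived. Untyped upstream
    markers are tolerated (incomplete profiles); derived markers on
    different branches raise an error.
    """
    known = [m for m in derived if m in TREE]
    if not known:
        return None
    deepest = max(known, key=depth)
    path = set()
    m = deepest
    while m is not None:
        path.add(m)
        m = TREE[m][1]
    if any(k not in path for k in known):
        raise ValueError("conflicting derived markers")
    return TREE[deepest][0]


print(assign_haplogroup({"M89", "P15", "S126"}))
print(assign_haplogroup({"M89"}))
```

In this sketch, a profile with only M89 derived is reported as F*, reflecting that none of the typed downstream markers is derived.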
G2a2b and F* are rare in present-day Europe. The frequency of haplogroup G and its subgroups increases slightly towards the Near East and is highest in populations of the south and northwest Caucasus [49,50], while haplogroup F* shows a diffuse distribution pattern in Eurasia, which reflects the insufficient sub-haplogroup resolution of most population genetic studies. Haplogroups I1 and I2a1 are most frequent in present-day populations of Europe, with the highest frequencies in Scandinavian [51–53] and southeast European populations, respectively [51].
We used PCA and genetic distance maps to identify affinities of the Carpathian Basin samples with 49,516 NRY SNP profiles from present-day Eurasian and African populations (Material and Methods). Due to the similarities in Y chromosome composition and the small number of samples, we pooled STA and both LBK groups.
The elevated haplogroup G frequency in populations of the west Caucasus results in a clustering with the STA-LBK group on the second principal component. However, the predominant frequencies of haplogroups G and F* lead to a clear separation of the STA-LBK group from all present-day populations along the third principal component (Figure S4, Dataset S14).
Similarly, the Y chromosome distance map discloses the greatest similarities to populations of the west and south Caucasus, such as Adyghe, Kabardin, Balkarians, Abkhazians, Azerbaijanis and Georgians, as well as to the Sardinians (Figure 4, Dataset S15), which can be explained by the high frequency of haplogroup G/G2a [50,54] in these populations. This might reflect genetic drift, caused by isolation and small effective population size after direct gene flow from the Near East, which led to a fixation of this haplogroup [49]. Intriguingly, populations of the northeast Caucasus show greater distances to the STA-LBK samples due to a lower abundance of haplogroup G/G2a [50]. Recently, the genomic data of an LBK individual from Stuttgart have been shown to be similar to modern-day Sardinians [33], a result that can be explained by the isolation of the Sardinians, leading to the conservation of the Neolithic genetic signature. Nevertheless, our mtDNA population genetic analyses did not confirm the Neolithic-Sardinian affinity, which was detected only on the NRY genetic distance map.
Discussion
This study provides the first in-depth population survey of early farming cultures from the Carpathian Basin and south-eastern Europe and demonstrates their essential role in the genesis of the first farming communities of Central Europe. Our population genetic analyses (Fisher’s exact test, PCA, MDS, AMOVA, TPC) reveal a similar haplogroup composition and comparable haplotype diversity between the Carpathian Basin cultures and the LBK from Central Europe (Table 1), indicating a homogeneous and shared population structure of early farming communities from both regions (Figure 2, S1).
The ASHA shows that about 55% of the LBK lineages ascribed to characteristic ‘Neolithic package’ haplogroups could be traced back to the STA and LBK in Transdanubia (Figure 3, Dataset S10). It is therefore likely that this mtDNA signature was also present in ancient populations preceding the STA (7th/6th millennium BC farming groups from the Aegean and the southern Balkans), in accordance with the archaeological record, which suggests cultural links to regions further southeast [5]. Interestingly, the STA mtDNA signature was still preserved in Neolithic cultures of the 5th/4th millennium BC in Central Europe (Figure 3, Dataset S10), attesting a direct and enduring genetic legacy of the STA and LBKT in the Central European Neolithic, with minimal or no additional genetic influence from outside for the subsequent 2,500 years.
Importantly, our comparative analyses (PCA and genetic distance maps with modern population data) point out that both the mtDNA and NRY variability observed in the Carpathian Basin samples most likely originated in the Near East with connections to the Caucasus (Figure 4, S2-4), which is in accordance with previous mtDNA studies of the Central European LBK [36,37] and subsequent farming cultures of the 5th/4th millennium BC [37]. The continuation of lineages through space and time suggests a scenario in which the genetic makeup of early farmers originated in the Near Eastern Fertile Crescent, from where it spread to Central Europe via the western Carpathian Basin, a region which acted as a natural corridor and an adaptation zone during the Neolithic expansion. The shared Near Eastern affinities of the STA, LBKT and LBK, and the genetic continuity in the maternal and paternal gene pools, are consistent with the archaeological record, which describes the genesis of the early LBK (LBKT) from STA communities, followed by a rapid dispersal of the early LBK culture from Transdanubia towards the north-western part of Central Europe [3,9,13]. A recent aDNA study of Near Eastern farmers from 8000 BC raises the question whether modern Near Eastern mtDNA can be used as a proxy for the Near Eastern Neolithic variability [44]. In our opinion, these newly described seven different incomplete HVS-I haplotypes (np 16095-16369) only provide a limited basis for comparative aDNA analyses, and we thus still consider modern-day Near Eastern genetic data sufficient proxies when tracing the origin of the first European farmers. A recent study using ancient genomic data of the ‘Stuttgart’ LBK individual, the Tyrolean Iceman, and a Scandinavian farmer (Gök4) has shown a south European rather than Near Eastern affinity of these ‘early farmers’, and has estimated a western hunter-gatherer ancestry of 0-45% in the early farmers’ gene pool [33]. These results do not contradict ours, since uniparental markers behave more conservatively. They could preserve a Near Eastern signature more consistently, even if admixture with foragers occurred on the way to Central Europe. Furthermore, the results are not directly comparable with ours, since we used earlier Neolithic specimens from a region that was nearer to the source region than was the case in the study by Lazaridis et al. [33].
The very low frequencies of hunter-gatherer lineages (0-2.27%) in the STA, LBKT and LBK sample sets (Figure 3) indicate that the arrival of agriculture in the Carpathian Basin and Central Europe was accompanied by a strong reduction of the currently known Mesolithic mtDNA substratum, resulting in a distinct and contrasting mtDNA haplogroup composition and significant differences between European hunter-gatherers and the Early Neolithic cultures (Figure 2-3, S1, Table 1, Dataset S7-8, S10-11). This scenario is consistent with coalescent-based simulations that have revealed genetic discontinuity between Central European hunter-gatherers and LBK communities [28,36]. The detection of haplogroup U5b in the investigated Mesolithic skeleton from Croatia matches previous observations, which describe sub-haplogroups of U as most frequent in forager populations across Europe, forming a characteristic Mesolithic mtDNA genetic substratum [28,37]. Residual Neolithic hunter-gatherer isolates, as reported from Central Europe by Bollongino et al. [30], have not yet been observed in our study region. Given the low proportion of hunter-gatherer mtDNA lineages in the LBK gene pool, we assume that admixture between hunter-gatherers and colonizing LBK farmers was negligible in Central Europe. Considering the relative size and speed of the LBK expansion, we have to assume a substantial population growth during the earliest LBKT, which might have resulted in population pressure and led to emigration from Transdanubia [55]. While such a radical population increase was not palpable from the Early Neolithic archaeological records [7], recent extensive archaeological excavations have provided new insights into large-scale early LBKT settlements in western Hungary [9,56,57], which suggest larger source communities for a possible colonization than previously assumed.
Y chromosomal population genetic studies of modern-day Europeans have proposed that I1 and I2a1 NRY haplogroups were present in Europe since the Late Upper Palaeolithic. This was based on consistently high divergence time estimates [51,58], suggesting an expansion from Franco-Cantabrian (I1) and southeast European glacial refugia (I2a1) after the Last Glacial Maximum [51]. I2a1 has been recently described in Mesolithic specimens from Loschbour (Luxembourg) and Motala (Sweden) [33], in a Scandinavian Neolithic hunter-gatherer from Ajvide (Sweden, 2,900-2,600 BC) [38], as well as in Neolithic remains of southern France and northern Spain [40,41]. From the Mesolithic Motala site a further three men could be assigned to haplogroup I [33]. The fact that almost all Mesolithic males belong to haplogroup I suggests that this haplogroup might represent a pre-farming legacy of the NRY variation in Europe.
Y chromosome haplogroups from STA and LBKT samples, such as haplogroups G2a2b and F*, have also been reported from the Central European LBK [36], and support a close genetic relationship of the paternal lineages. Genetic studies on modern-day populations have discussed haplogroup G [25,59] and its subgroup G2a as potential representatives of the spread of farming from the Near East to Europe [26]. This scenario has recently been supported by Neolithic data from northern Spain [40] and southern France [41], which attested to a pivotal role of G2a in the Neolithic expansion on the Mediterranean route. Furthermore, G2a has also been reported from the Tyrolean Iceman (G2a2a1b (L91)) [39]. Taken together, these findings suggest that sub-haplogroups of G2a were frequent in Neolithic populations of the 6th-4th millennia BC across Europe. Thus, if we take Y chromosomal haplogroup I2a (and possibly I1) as a proxy for a Mesolithic paternal genetic substratum in Europe, we observe a similar pattern to the changeover in the mitochondrial DNA variability, in which NRY G lineages dominate Neolithic populations across Europe and I lineages become rare [36,39–43].
The most characteristic mtDNA haplogroup of early farmers from the Carpathian Basin and Central Europe is N1a. N1a has previously been discussed as a potential marker of the spread of farming [34]. The presence of N1a in early farmers from the Carpathian Basin (6.82-10.26%) and Central Europe (12.04%, Table 1) lends further support to its pivotal role as a marker for the Continental route of the Neolithic expansion. On the other hand, mtDNA N1a and NRY G2a haplogroups are rare in present-day European populations, which is also reflected in the separation of the 6th millennium BC cultures from all present-day populations along the third principal component on the PCA plots (Figure S2, S4). These findings indicate further demographic events after the Early/Middle Neolithic period that shaped modern-day mtDNA and NRY variability. Recent evidence from ancient mtDNA has described the formation of modern-day variability by several successive migration events in Central Europe during the 3rd/2nd millennium BC [37]. It is highly likely that these events have also affected the NRY diversity. Surprisingly, Y chromosome haplogroups, such as E1b1b1 (M35), E1b1b1a1 (M78), E1b1b1b2a (M123), J2 (M172), J1 (M267), and R1b1a2 (M269), which were claimed to be associated with the Neolithic expansion [23–25], have not been found so far in the 6th millennium BC of the Carpathian Basin and Central Europe. Intriguingly, R1a and R1b, which represent the most frequent European Y chromosome haplogroups today, have been reported from cultures that emerged in Central Europe during the 3rd/2nd millennium BC, while a basal R type has been reported from a Palaeolithic sample in Siberia [60] in agreement with a proposed Central Asian/Siberian origin of this lineage. In contrast, G2a has not been detected yet in late Neolithic cultures [42,43]. This suggests further demographic events in later Neolithic or post-Neolithic periods. 
However, we caution that the NRY record is still very small, especially in more recent periods, and further ancient Y data are required to shed light on the formation of the modern-day paternal diversity.
Interestingly, recent model-based statistical analyses of contemporary NRY and mtDNA data, testing a series of population scenarios for the Neolithic transition, have revealed a shared admixture history for men and women, but not the same demographic history [61]. This study has shown that females had a larger effective population size, likely based on differential effects of social and cultural practices including increasing sedentism alongside a shift to monogamy and patrilocality in early farmers. It is therefore important to interpret our new genetic data in the light of those findings. Considering the entire set of 32 published NRY records available for Neolithic Europe thus far, the low paternal diversity is indeed quite remarkable: G2a is the prevailing haplogroup in the Central European and Carpathian Basin Neolithic, and in French and Iberian Neolithic datasets [36,40,41]. There are only two exceptions, namely one E1b1b (V13) [40] individual from the Avellaner cave in Spain (∼5,000-4,500 BC), and two I2a [41] individuals from Treilles, France (∼3,000 BC). This very limited variation in NRY haplogroups, in contrast to the high mtDNA haplogroup diversity, suggests a larger effective population size for females than males. One plausible explanation for this phenomenon is patrilocality (where women move to their husband’s birthplace after marriage). Other possibilities that could lead to similar observations include polygyny or male-biased adult mortality. A patrilocal residential rule was possibly linked to a system of descent along the father’s line (patrilineality) in early farming communities. Ethnographic studies have suggested a change of residential rules at the advent of Neolithisation, showing different trends in residential rules among modern foragers and nonforagers [62].
Increasing sedentism promotes territorial defence and control of resources, favouring men in the inheritance of land and property, which consequently leads to patrilocal residence [62]. At the same time, such residence patterns have to be temporarily flexible in expanding populations, allowing some of the sons to settle in new territories following population pressure and natural limitation of resources, e.g. after the carrying capacity of a particular region has been reached [61]. Patrilocality has also been raised in recent bioarchaeological studies. It has been suggested by aDNA evidence for the Treilles Neolithic community [41], and by stable isotope studies for the LBK in Central Europe [63].
It is important to note that patrilocality does not contradict the demic diffusion model, and it appears that both phenomena have left a discernible mark on the European Neolithic genetic diversity. While patrilocality and patrilineality might have caused high mtDNA and low NRY within-population diversity, the demic diffusion model best explains the mtDNA and NRY affinity of the early farmers to the modern Near East and Caucasus, and the observed overall genetic homogeneity across a vast territory of south-eastern and Central Europe. Importantly, local processes of sex-biased migration are unlikely to have an effect on genetic variation at broader spatial scales. Our observations from many sites in Europe therefore argue for a common set of cultural and social practices across larger distances for early farming cultures in Europe. However, we caution that the observed differences in genetic diversity between males and females could also be influenced by resolution biases, resulting from the different sets of studied mtDNA and NRY markers. Examining the sex-specific dynamics of early farmers is an important area that warrants further detailed research in order to address underlying parameters such as migration rate, level of exogamy and distances of marriage-related dispersals, among others.
The 83 novel mtDNA and nine NRY profiles from early farming Neolithic populations of the Carpathian Basin, together with one Mesolithic mtDNA profile, help to fill the geographic gap on the Continental route of the Neolithic expansion from the Near Eastern Fertile Crescent to Central Europe. The joint analyses of mitochondrial and Y chromosomal DNA data support the demic diffusion of the early farmer men and women through western Hungary, and demonstrate the paramount importance of this region as a prehistoric corridor of the migration. We point out that archaeological cultures of the Carpathian Basin provided the genetic basis of the first Central European farmers, which affected subsequent prehistoric cultures for a long period of time. Additionally, the new NRY data complement the sparse European Y chromosomal dataset, and lend further support to patrilocal residential rules and a patrilineal social system among the first farmers, underlining the role of demographic factors, which, depending strongly on cultural practices, notably shaped prehistoric and extant genetic diversity.
Material and methods
We sampled one Mesolithic, 47 Starčevo and 61 LBKT skeletons, excavated in Croatia and western Hungary. The ancient DNA work was carried out at the Institute of Anthropology at the Johannes Gutenberg University of Mainz, following well-established protocols [34,36,37,42]. Minor modifications to the typing of HVS-I, HVS-II and coding-region SNPs of the mitochondrial genome, and to SNP typing of the Y chromosome, are described in the Supplementary Information and Dataset S2-4.
The genetic results were evaluated by population genetic analyses, using comparative ancient and modern DNA datasets (see Dataset S6, S12-15). We performed Fst and AMOVA analyses in Arlequin 3.5.1 [48]. Furthermore, we conducted Fisher’s exact test, PCA and MDS in the R software environment. PCAs were based on mtDNA and Y chromosome haplogroup frequencies (Dataset S7, S12, S14). MDS was based on Slatkin linearized Fst values, calculated from mitochondrial HVS-I sequences (Dataset S8). Haplotype diversity was computed in DnaSP version 5.10.01 [64].
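As an illustration of two of the summary statistics named above, the following minimal Python sketch computes haplotype diversity and the Slatkin linearization of a pairwise Fst value. It is not the Arlequin or DnaSP implementation used in this study; it assumes the commonly cited textbook forms, namely Nei's unbiased estimator H = n/(n-1) · (1 - Σ p_i²) and the transformation Fst/(1 - Fst). The function names and the example haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of p_i^2).

    `haplotypes` is a list with one haplotype label per individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies in the sample.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def slatkin_linearized(fst):
    """Slatkin's linearized distance Fst / (1 - Fst) for a pairwise Fst."""
    if fst >= 1.0:
        raise ValueError("Fst must be smaller than 1")
    return fst / (1.0 - fst)


# Hypothetical sample: placeholder labels, not data from this study.
sample = ["hapA", "hapA", "hapB", "hapB"]
print(haplotype_diversity(sample))
print(slatkin_linearized(0.2))
```

A sample in which every individual carries a different haplotype yields a diversity of 1, whereas a sample dominated by a single haplotype approaches 0, which is the contrast discussed above for mtDNA versus NRY lineages.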
MtDNA HVS-I sequence data and Y chromosomal haplogroup frequencies were used for the genetic distance calculation, comparing the STA and LBKT mitochondrial DNA datasets and a combined STA-LBK Y chromosomal dataset with 130 and 100 modern populations, respectively (Dataset S13, S15). Genetic distance maps from the Fst values were generated in ArcGIS version 10.0. In the ASHA, a modified version of the shared haplotype analysis [48], each HVS-I lineage within a given cultural dataset was traced back to its earliest appearance in a defined chronological order of the studied cultures. Each lineage was regarded either as ancestral or as new, and was named after the culture in which it was detected earliest in time (Dataset S10, Figure 3).
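The lineage-tracing step of the ASHA described above can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes that each culture is represented only by the set of HVS-I haplotypes observed in it (frequencies are ignored), and that the cultures are supplied from oldest to youngest. The function name, the culture ordering and the haplotype labels are hypothetical.

```python
def classify_lineages(cultures):
    """Assign every lineage to the culture where it first appears.

    `cultures` is a list of (culture_name, set_of_haplotypes) tuples,
    ordered from oldest to youngest. Returns a dict that maps each
    culture to {haplotype: name of the culture of earliest appearance}.
    A lineage is "new" when that name equals the culture itself and
    "ancestral" otherwise.
    """
    first_seen = {}  # haplotype -> earliest culture in the given order
    result = {}
    for name, haplotypes in cultures:
        labels = {}
        for hap in sorted(haplotypes):
            # setdefault records the culture only on first appearance.
            labels[hap] = first_seen.setdefault(hap, name)
        result[name] = labels
    return result


# Hypothetical input: placeholder haplotypes, illustrative ordering.
ordered = [
    ("STA", {"h1", "h2"}),
    ("LBKT", {"h2", "h3"}),
    ("LBK", {"h1", "h3", "h4"}),
]
for culture, labels in classify_lineages(ordered).items():
    for hap, origin in labels.items():
        status = "new" if origin == culture else "ancestral"
        print(culture, hap, origin, status)
```

Because a lineage is labelled by its earliest appearance, the outcome depends on the chosen chronological order of the cultures, so that order has to be fixed before the analysis is run.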
The adjusted parameters of the TPC [37] (Dataset S11) as well as further points of each analysis are detailed in the Supplementary Information.
Author contributions
K.W.A., E.B., G.B., A.S-N. and J.J. designed the study; A.S-N. performed the palaeogenetic analyses; A.S-N., J.J., V.Ke., M.BG. and M.F. collected the samples; G.B. and A.S-N. collected reference data for the population genetic analyses; S.M-R. and A.S-N. cloned the PCR products; A.S-N., G.B. and W.H. performed the biostatistical analyses; K.K., M.BG., B.Ő., G.T., E.M., G.P., M.Š., M.N. and N.P-Š. carried out the anthropological analyses including individualization; K.O., T.M., V.Ki., A.O., K.S., A.C., V.V., K.So. and T.P. excavated the sites and provided the samples, archaeological information and support; K.O., T.M., J.J. and A.S-N. summarized and reevaluated the archaeological data; B.K. performed the radiocarbon dating; A.S-N., G.B., V.Ke., W.H., E.B. and K.W.A. wrote the paper; all authors discussed the results and commented on the manuscript.
Financial Disclosure
This study is part of a 3-year project funded by the German Research Foundation (DFG) aimed at investigating the population dynamics of the Neolithic Carpathian Basin (AL 287-10-1). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The project is a cooperation between the Bioarchaeometry Group of the Institute of Anthropology at the Johannes Gutenberg University of Mainz and the Institute of Archaeology, Research Centre for the Humanities, Hungarian Academy of Sciences in Budapest.
Acknowledgement
We thank Rozália Kustár and Olga Vajda-Kiss for archaeological information, Zsuzsanna K. Zoffmann for providing anthropological information and material, and Gabor Krizsma for informatics support.
Footnotes
Glossary
- aDNA
- Ancient DNA
- AMOVA
- Analysis of molecular variance
- ASHA
- Ancestral shared haplotype analysis
- HVS I/II
- Hyper Variable Segment I or II of the mitochondrial genome
- LBK
- Linearbandkeramik or Linear Pottery culture in Central Europe (refer to published LBK data from the Czech Republic, Lower Austria, and Germany)
- LBKT
- Linearbandkeramik or Linear Pottery culture in western Hungary/Transdanubia
- MDS
- Multidimensional scaling
- mtDNA
- Mitochondrial DNA
- np
- Nucleotide position
- NRY
- Non-recombining part of the Y chromosome
- PCA
- Principal component analysis
- SNP
- Single nucleotide polymorphism
- STA
- Starčevo culture
- TPC
- Test of population continuity