The Siberian Brotherhood of Mankind
Once upon a time there were two brothers. Well, actually there were a lot more of them, but these two were very special: they were destined to become the most prolific progenitors of the world. Their offspring divided in two. Those that went east to cross the Bering Strait, to become the most prominent inhabitants of the New World as far as Tierra del Fuego, were the descendants of one brother, while the numerous descendants of the other brother, who remained on the hither side, caused a shock wave in the Old World that could be felt as far as West Africa, the Atlantic and India. As in a fantasy tale, both peoples founded incredible civilizations, each a highlight of human talent and splendor. Their feats were almost unequaled and rivaled each other, even though they were separated by two mighty oceans. Each completely forgot about the existence of the other, until the people on the hither side learned how to tame the waves and crossed the oceans. Their final reunion should have been a reason for joy and celebration, but when they met again, 500 years ago, oh pitiful wretches! Moans of distress filled the air and blood soaked the earth. The once great peoples of the New World never recovered from the blow of what could probably qualify as the most tragic example of fratricide in history.
In the genetic tree these two peoples are nowadays known as members of the R-clade and the Q-clade. The offspring of the multitudes of other brothers and close patrilineal relatives that couldn’t match their success dwindled, but some of them survived in fringe areas on both sides of the oceans. For convenience’s sake these people are considered members of the same ancestral clade as their fathers, the P-clade. Geneticists would easily recognize the Y-DNA markers of each clade, and say that their respective haplogroups were R, Q or ancestral P. Together they form a superclade on a single branch of the genetic tree. They make up a substantial part of the human population, but still they represent nothing but a single branch that on a Paleolithic scale can’t be considered very ancient.
Where did those very successful P-derived haplogroups originate? The story of the two brothers may be an undue simplification, since the full development of each clade involved nomads that probably roamed wide swaths of territory over many generations. Every single corner of the Old World has accumulated a prehistory of up to a million years, which tends to render the information we need for reconstructing the relevant demographic events almost intangible. Their legacy doesn’t include a unified language, nor even overall genetic uniformity. Their genetic “kind” remains essentially restricted to the male lineage, which never became numerous in most of Africa and remained virtually absent in Oceania and the Far East. The Near East traditionally harbored a rich variety of their male genes in all stages of development, but so far nothing points to a historically predominant position in the region in comparison to other clades that include, e.g., notorious “stayers” defined by haplogroup markers like Hg J and Hg E. The current distribution hardly gives any clue about the origin, especially since more often than not the trail leads to desperately isolated and remote areas, or otherwise to historically cosmopolitan areas that received too many foreign visitors to reveal unambiguous answers. The only certainty we have is that on a genetic scale those three clades were exceptionally closely related along the male lineage, and thus that the incredible success of two P-derived branches may point to something systematic underneath that also applies to the virgin soils of the Americas.
In stark contrast to the abysmal disagreements that surround our knowledge about the peopling of the Old World, the disagreements that concern the Americas are almost negligible. The Q-clade entered through the Bering Strait in one or two waves, and its members were the first human beings to do so. They are theorized to have originated in southern/central Siberia:
Globally, Y-chromosome data therefore emphasize the critical role of southern/central Siberia in the peopling of the Americas, since this region appears to be at the origin of two major male migratory waves of colonization. The data presented here are also consistent with the intriguing possibility of ancient links between proto-Europeans and proto–Native Americans, an idea that has been put forward in previous Y-chromosome studies (Karafet et al. 1999; Santos et al. 1999; Wells et al. 2001; Lell et al. 2002). An ancestral connection between these groups has also been suggested on the basis of morphological (Brace et al. 2001) and mtDNA (Brown et al. 1998) data and could ultimately trace back to ancient east/west human dispersals from a common source in central Asia. In agreement with this scenario, the P-M45 lineage has been found to be oldest in central Asia (Wells et al. 2001; Zerjal et al. 2002), where the Tuvan population includes haplogroups M242, M45, M173, and Tat, which are now dispersed in Europe and/or America. (Bortolini et al., 2003)
More recent migrational events, involving new people that never were equally successful in entering America, caused their kind to become rare in East Siberia, thus cutting the umbilical cord that once must have interconnected the great P-derived clades on both sides of the Bering Strait. We can’t easily derive from the facts the precise events that should have linked the clades mentioned above with proto-Europeans, but the link to proto-Native Americans is virtually straightforward and unequivocal. This certainty may serve as a sound base to analyze and verify the processes involved in deeper detail. Let us depart from the general agreement, which concerns an origin in Central Asia and widespread habitation of the Americas during the end of the last glacial period, known as the late glacial maximum, around 16,000 to 13,000 years before present. How should this affect our assumptions that link medium-term migrational processes to the origin of languages and genetic variability, and how should the corresponding observations contribute to our understanding of what happened elsewhere, i.e. in the Old World, including Europe?
First of all, let us consider the pre-Columbian era. Native Americans feature a seemingly disproportionate number of linguistic groups and isolates, which at first sight hardly agrees with the homogeneous origin implied by one or two migrational events from the same direction. Only the Na-Dené languages of North America, which include Athabaskan languages like Apache, seem to find a distant relative in the virtually extinct Ket language of Central Asia, whose speakers indeed feature the Q-haplogroup in remarkably high proportions. The languages are tentatively grouped together in the Dené-Yeniseian language family. Possibly the proto-Native Americans migrated wholesale to the Americas and didn’t leave much of a trace behind in Asia, or the descendants of their relatives that once spoke related languages in Asia became overwhelmed by groups that entered Siberia later from the southeast, from the south or from the west across the Ural Mountains. Both scenarios may apply, especially considering the advance, as far as the Pacific coast of East Siberia, of additional haplogroups that remain strikingly rare or absent in the Americas, like Hg N, several ancestral Hg C varieties and even Hg D. We don’t know about American Hg R, since up to now the ample presence of this haplogroup among virtually all Amerindian groups has been interpreted as recent European admixture. As a side note, claims of Amerindian Hg R are indeed made, and they may find some confirmation in STR mutations on the Y-chromosome whose statistical properties deviate significantly from European Hg R (e.g. less than 10% DYS390=23 among Native American Hg R individuals, which plainly contradicts a West European, and especially a Northwest European, origin). However, the general picture is that the migrational events responsible for the Amerindian presence were dominated by a small group of Siberian people whose males belonged predominantly to the Q-clade.
Whatever the genetic and linguistic variety of Amerindian people nowadays, somehow this must relate to this potentially quite homogeneous people that arrived from the Arctic.
This purported ancestral homogeneity, barely 15,000 years ago, is confirmed by low intra-population variety in South America, but how could this be reconciled with the maximum genetic distance with respect to the rest of the world?
(2) Consistently with previous studies, South Amerindians and Oceanian populations show the lowest intra-population diversity when compared with autochthonous populations from other continents. (3) Within South America, Western South Amerindian populations show the highest intra-population diversity, consistently with an evolutionary model previously proposed by us, which suggest a higher long term effective population size and levels of gene flow among them. (4) Comparing the different continents, the between-population diversity is clearly the highest for South Amerindians (Fst = 0.19). This value is almost twice the level of differentiation observed for the worldwide population (Fst = 0.11). (Tarazona-Santos et al., 2009)
Fst (the fixation index) measures how strongly populations are differentiated from one another, i.e. how far they depart from random mixing, but whatever this may imply, this extreme South American value can’t be due to mere ancientness of the population. On a worldwide scale, the roughly linear increase of Fst with geographic distance doesn’t confirm Heyerdahl-style immigration from weird directions and rather indicates that other, more sophisticated processes are at work. The geographic position at the end of all migration lines, all the more so if the Bering Strait was indeed traversed, and a corresponding distance-driven isolation against the effects of gene flow may offer a more plausible explanation.
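To make the Fst figures quoted above a little more tangible, here is a minimal sketch of the textbook definition Fst = (H_T - H_S) / H_T for a single biallelic marker. It is not the estimator used by Tarazona-Santos et al.; the function names, the equal weighting of subpopulations and the example allele frequencies are all assumptions chosen for illustration.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(allele_freqs):
    """Textbook Fst = (H_T - H_S) / H_T for equally weighted subpopulations.

    allele_freqs: one allele frequency (between 0 and 1) per subpopulation.
    The equal weighting is a simplifying assumption of this sketch.
    """
    if not allele_freqs:
        raise ValueError("need at least one subpopulation")
    n = len(allele_freqs)
    # H_S: average heterozygosity within the subpopulations
    h_s = sum(heterozygosity(p) for p in allele_freqs) / n
    # H_T: heterozygosity of the pooled population (mean allele frequency)
    p_bar = sum(allele_freqs) / n
    h_t = heterozygosity(p_bar)
    if h_t == 0.0:
        # The marker is fixed everywhere, so there is no differentiation to measure
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst([0.5, 0.5]))  # identical frequencies: no differentiation
    print(fst([0.2, 0.8]))  # diverged frequencies: roughly 0.36
    print(fst([0.0, 1.0]))  # different alleles fixed: complete differentiation
```

The hypothetical frequencies show the pattern that matters here: isolated groups that drift apart raise Fst even when each group is internally homogeneous, which is the combination reported for South America.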
Could we hypothesize similar causes behind the incredible variety of Amerindian languages? Simple language drift may be responsible. This mechanism defines language as vocabulary cast into the mold of a particular syntax, subject to change. Over time the basic structure of the sentence, held together by functional items (~grammar), with the lexical items (~words) filling in the blanks, will thus be deeply modified, together with the “outer appearance” of a particular language, and this affects grammar in all its (morphological and syntactic) aspects. The process may be gradual, the product of chain reactions or subject to cyclic drift, and it causes originally related languages to develop completely different linguistic characteristics that can’t be explained by the mere development of vocabulary in isolation.
In the Americas, including the southernmost parts, we are thus confronted with a native population that exceeds the arctic origin location in linguistic and genetic diversity. This completely contradicts the common wisdom that homogeneous populations are younger and derived from populations that feature more diversity! So let’s return to the Old World and investigate how this insight might shed new light on the origin of the closely related R-clade.
The descendants of that other brother, the progenitor of the R-clade, are difficult to define. If we choose to ignore the possibility that some might have followed their kin of the Q-clade to the Americas through Siberia, we can observe that the R-clade essentially has a more western and southern distribution. The clade is subdivided into several groups that rarely escaped the attention of those that sought an Indo-European association. An unambiguous exception to the Indo-European bias was provided by the West African subclade defined by mutation R1b-V88, which was instead related to the Afroasiatic linguistic group, and more specifically to its Chadic subgroup:
The analysis of the distribution of the R-V88 haplogroup in >1800 males from 69 African populations revealed a striking genetic contiguity between the Chadic-speaking peoples from the central Sahel and several other Afroasiatic-speaking groups from North Africa. The R-V88 coalescence time was estimated at 9200–5600 kya, in the early mid Holocene. We suggest that R-V88 is a paternal genetic record of the proposed mid-Holocene migration of proto-Chadic Afroasiatic speakers through the Central Sahara into the Lake Chad Basin, and geomorphological evidence is consistent with this view. (Cruciani et al., 2010)
Since this subgroup of the R-clade doesn’t derive from the subclades that are most frequently indicated as “Indo-European”, the simple conclusion should be that the R-clade is much older than the Indo-European language family, and hence that the success of the R-clade can’t be explained by the Indo-European advance, which indeed has been dated a lot later.
The origin of another important subgroup was recently associated with the Neolithic wave of advance, that is theorized to have started in Turkey:
Does the time to the most recent common ancestor (TMRCA) of the hgR1b1b2 chromosomes support a Paleolithic origin? Mean estimates for individual populations vary (Table 2), but the oldest value is in Central Turkey (7,989 y [95% confidence interval (CI): 5,661–11,014]), and the youngest in Cornwall (5,460 y [3,764–7,777]). The mean estimate for the entire dataset is 6,512 y (95% CI: 4,577–9,063 years), with a growth rate of 1.95% (1.02%–3.30%). Thus, we see clear evidence of rapid expansion, which cannot have begun before the Neolithic period. (Balaresque et al., 2010)
The debate on the origin of virtually all subclades of haplogroup R was repeatedly derailed by the discovery of ancestral samples in disparate locations. The Middle East especially proved to be a valuable source, to the effect that previous assertions that pointed to Central Asia lost much of their attraction. However, drawing our lessons from the Amerindian situation, we could wonder if this change of mindset is justified. A Russian publication (excerpts translated by Google) on the genetic variety of the Bashkir people, just south of the Ural Mountains, will possibly make the difference. This study confirms that Bashkir R1b1b2 (also written as R1b-M269), otherwise especially associated with the Neolithic and Europe, might be very ancient in the region, notwithstanding the exceptionally low diversity.
Some researchers believe that haplogroup R-M269 was spread throughout Western Eurasia in the Upper Paleolithic [Semino et al 2000; Al-Zahery et al. 2003]. The high frequency of the paternal line in populations of the Southern Urals is an unexpected finding, as this haplogroup is not typical for either allied areas (Central Asia, Eastern Europe and Siberia). In order to clarify the origin of this haplogroups in on the southern Urals, we analyzed the phylogenetic relationships between microsatellite haplotypes in populations of Bashkir and West Asia, South Asia and Balkans, Europe (Fig. 4). As a result, phylogenetic analysis identified three clusters microsatellite haplotypes, designated as α, β and γ (Table 6, Fig. 4). It was evident that the bulk of the β cluster haplotypes corresponds to the populations of Europe (50 out of 70 haplotypes cluster β), while the majority of haplotypes in the cluster α (65 of 79 haplotypes) are from South and West Asia.
It should be noted that the largest cluster of the phylogenetic tree, cluster γ haplotypes, occurs equally in all regions (Europe, Balkans, Asia and South Urals). Over 70% of the haplotypes among the Bashkir belong to this cluster, which apparently is the result of the earliest stage (Upper Paleolithic) resettlements of mutation M269 carriers, that cover a larger area.
The inevitable conclusion:
Relatively low population density, and hence the effective population of this region compared to the densely populated regions West Asia, Europe, the Balkans and South Asia prevented the accumulation of high haplotype diversity.
This approach is essentially different from deducing the origin of a subclade, or clade, by only taking into consideration the diversity of surviving haplotypes. Naturally, under the benign living conditions of old cultural areas a high variance can be “accumulated” when “fossil” haplotypes are “collected” over a long period of time. In the harsh climate of Siberia, however, the living conditions are completely different, and so are the survival conditions and the subsequent “pruning” of haplotypes. The alternative approach for retrieving age should thus consider diversity in the widest sense, i.e. of remnant clusters that can be compared with extant clusters on a worldwide scale. I gather the “collector” function of, e.g., the Middle East cultural area for “old” R-clade haplotypes has a biological analogy in the jungle, where you can find an enormous variety of species, including “living fossils”. I don’t think this should be explained by a theory saying that all living creatures evolved in the tropics. Variance and diversity of Y-DNA genes thus shouldn’t be taken as an unequivocal indication of origin.
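The argument that a small effective population keeps haplotype diversity low regardless of age can be illustrated with a standard population-genetics recursion. The sketch below is not the method of the Bashkir study: it assumes a haploid Wright-Fisher population of constant size under the infinite-alleles model, and the function name, the mutation rate and the population sizes are hypothetical values picked only for illustration.

```python
def expected_diversity(n_males, mu, generations):
    """Expected haplotype diversity after a number of generations.

    Assumptions of this sketch: constant haploid population of n_males,
    every mutation (probability mu per haplotype per generation) creates
    a new haplotype, and the population starts from one founding haplotype.
    """
    if n_males < 1:
        raise ValueError("n_males must be at least 1")
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie between 0 and 1")
    if generations < 0:
        raise ValueError("generations must not be negative")
    f = 1.0  # probability that two random haplotypes are identical
    for _ in range(generations):
        # Two haplotypes stay identical only if neither mutates and they
        # either share a father (1/N) or their fathers were already identical.
        f = (1.0 - mu) ** 2 * (1.0 / n_males + (1.0 - 1.0 / n_males) * f)
    return 1.0 - f


if __name__ == "__main__":
    age = 400   # same age in generations for both regions (hypothetical)
    mu = 0.002  # hypothetical mutation rate per haplotype per generation
    sparse = expected_diversity(200, mu, age)    # thinly populated region
    dense = expected_diversity(20000, mu, age)   # densely populated region
    print(f"sparse region: {sparse:.3f}, dense region: {dense:.3f}")
```

Under these assumptions two equally old populations end up with clearly different diversity, the sparse one lower, so low diversity by itself does not prove a young age.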
Bashkir R also includes the R1b1b1 subclade, whose Central Asiatic antecedents are rarely disputed, and R1a (R – SRY10831.2). Accordingly, the study concludes:
Since there is no trace of mass migrations from Europe and western Asia, the reasons for the predominance of the main haplogroups (R – SRY10831.2 and R-M269) in the southern Urals are to be found in the processes of early settlement this region.
The question of how the R-clade became involved in the Neolithic expansion and how it also reached Africa will be the subject of a sequel. For the moment it will suffice to point to a possible relation of the Bashkir homelands with the archeological Botai culture, the most likely place where the horse was domesticated and where the first consumption of (mare’s) milk was attested. I already mentioned the potential association between the R-clade and the T-13910 allele for lactase persistence (including in West Africa), which might be especially interesting because Enattah et al. (2007) already located the origin of this allele in the neighborhood of the Ural Mountains. Consequently, mitochondrial haplogroup mtDNA-U5b and the autosomal gene associated with blood type B may serve as valuable markers to locate Holocene migrations to West Africa, while the Neolithic wave of advance as a medium for the expansion of the R1b-clade into Europe could thus be confirmed as essentially characterized by gene flow, given the low occurrence of such migrational markers on the Neolithic route.
How these processes might also have affected the spread of R1a and R2, the other members of the R-clade, is yet another chapter, most probably intertwined with the Indo-European question. The issue of the Two Brothers, however, should be considered without linguistic or cultural implications. The issue of this Siberian Brotherhood is far more important, since the success of the arctic brothers can’t be a coincidence. Surely this issue touches on the evolutionary pressures that are especially in force under the conditions of a harsh environment. Hence this issue directly touches our existence. We would never have existed without it. Everybody has already received all the benefits, since powerful gene flow around the globe continuously forges our species into a single, indivisible brotherhood of mankind.
- Lobov Artem – Structure of the Gene Pool of Bashkir Subpopulations, 2009 (in Russian; translated excerpt)
- Bolnick et al. – Asymmetric Male and Female Genetic Histories among Native Americans from Eastern North America, 2006
- Lell et al. – The Dual Origin and Siberian Affinities of Native American Y Chromosomes, 2002
- Zegura et al. – High-Resolution SNPs and Microsatellite Haplotypes Point to a Single, Recent Entry of Native American Y Chromosomes into the Americas, 2004
- Bortolini et al. – Y-Chromosome Evidence for Differing Ancient Demographic Histories in the Americas, 2003
- Tarazona-Santos et al. – The genetic structure of Native Americans: inferences from SNPs in genes involved in carcinogenesis, immunity and pharmacogenetics, 2009
- Balaresque et al. – A Predominantly Neolithic Origin for European Paternal Lineages, 2010
- Cruciani et al. – Human Y chromosome haplogroup R-V88: a paternal genetic record of early mid Holocene trans-Saharan connections and the spread of Chadic languages, 2010
- Enattah et al. – Evidence of Still-Ongoing Convergence Evolution of the Lactase Persistence T-13910 Alleles in Humans, 2007
- Goebel et al. – The Late Pleistocene Dispersal of Modern Humans in the Americas, 2008