Archive for December 6, 2009

Y-DNA: How valid is the evidence for recent Out of Africa replacements?

December 6, 2009

At first sight, the global distribution of Y-DNA haplogroups corresponds fairly well to an Out of Africa scenario of human evolution. The older haplogroups, labelled A and B, are virtually limited to Africa, and apart from a deficiency or absence of African DE forms in comparison to Eurasia, there is a reasonable match with the rather extreme Recent Out of Africa scenario, which asserts that modern man originated in Africa quite recently and drove all other contemporary human species into extinction.

A closer look at the distribution of the Y-DNA haplogroups downstream of F, and even of all of E in comparison to DE, would suggest massive back migrations to Africa early in human prehistory. This would at least still leave haplogroups A and B as unequivocally African. B in particular is associated with primitive African tribes, while only minor amounts of A1 have been found outside Africa, so far only sporadically in NW Europe.

Microcephalin-1 distribution

The distribution of autosomal microcephalin-1, also referred to as haplogroup D (not to be confused with Y-DNA haplogroup D), seems to support the concept of moderate back migrations rather than an exclusively African development that accounts for all derived modern human genetic variation. Microcephalin-1 may have arrived as an archaic introgression into the Homo sapiens gene pool, with a possible genetic distance to non-microcephalin-1 haplotypes of 1.7 million years (Evans et al. 2006). Its evolutionary advantages over other types are disputed and probably non-existent, which would make the gene essentially a neutral marker. According to autosomal calculations, the gene originated about 37,000 years ago, around the time by which the most prolific Y-DNA haplogroups are calculated to have branched off. The moderate microcephalin-1 values in Africa (~30%) could only be reconciled with Hg E back migrations if microcephalin-1 was by then still absent from the gene pool of Hg E populations. This is plausible, since according to Karafet et al. (2008) haplogroup E had already branched off from DE about 52,500 years ago.

Considering the character of microcephalin-1 as a fairly neutral autosomal marker, the introgression should have happened in a time of expansion and accelerated gene flow, and thus in a direction generally outward from Africa. However, the reconstructed succession of events would rather put the center of gravity of modern human origin and expansion in Asia, well after the impetus of the African expansion. A tendency towards lower microcephalin-1 values in some SE Asian areas, which include tribal communities with remnant Hg D, would in the Recent Single Origin view reflect the residue of the earliest African immigrants, who expanded before microcephalin-1 was incorporated into their ancestral group. The hypothesized introgression thus seems to have happened elsewhere and long after these first expansions, most probably among Y-DNA CF groups. These in turn must have branched off from a common stock with DE as early as 65,000 years ago, about 30,000 years before the introgression. However, this would turn the premises of the Recent Single Origin hypothesis upside down, since it would imply that African back migrations had already started before the European Neanderthals were driven to extinction by modern human immigrants. The arrival of new groups west of the Iron Gates of the Danube is not confirmed to have happened before 32,000 years ago.

These dates, and the wide range of CF haplogroups that show a maximum association with microcephalin-1, would contradict the neutrality of this gene. The ancestral group in which the introgression took place can hardly be reduced to a bottlenecked population where microcephalin-1 gained prevalence, unless this population already comprised a wide range of CF haplogroups on the eve of its expansion. The neutrality of microcephalin-1 implies a migrational expansion, yet even in Africa we can’t find enough sub-Saharan Y-DNA deriving from CF (C is virtually absent) to propose a tight link to F-related back migrations. Something appears to be wrong.

Part of the anthropological and autosomal genetic data seems to favour multiregionalist scenarios, though Y-DNA is often cited as the decisive counterargument. Attempts have been made to find old Hg A and Hg B (Y-DNA) elsewhere in the world as an indication of continuity with early humans, but the occurrences found are still virtually restricted to Africa. A complicated model of Y-DNA related gene flow could be devised to account for the current Y-DNA haplogroup distribution, though even a fairly strong association of CF with microcephalin-1 as an autosomal marker fails to support the hypothesis that CF had already branched off in Africa. Like DE, CF didn’t leave unequivocal traces in Africa, and their common ancestor could have left Africa long before, in the main Out of Africa event that the Recent Single Origin hypothesis links to modern humans and that should thus be dated, following Karafet et al., at about 70 kya. This implies that the expansion of modern humans out of Africa could at most have been a one-wave event.

Assuming a correlation between microcephalin-1 and the distribution of modern Y-DNA, I am inclined to locate the introgression geographically in the neighbourhood of East Turkestan. Neutrality would imply a pocket of related CF Y-DNA groups that passed through a bottleneck. I gather that an association between Australian C4 populations and microcephalin-1 still awaits verification. Otherwise, a non-neutral behavior of microcephalin-1 should be assumed, affecting the distribution of Y-DNA haplogroups D, C, E and F in different ways.

In a non-neutral scenario for the dispersion of microcephalin-1, the main agent of the expansion might have been a still undifferentiated Hg F. Hg C is probably slightly too old to have been much more than one of the intermediaries in the spreading event, with Hg C3 the most successful agent among the Hg C subclades. The main agents, however, remain the Hg F subclade family. Unless Hg F derives from a parent Hg CF that is especially close to Hg C3, the picture that emerges from this hypothesis is of a tiny region, probably in the neighbourhood of Turkestan, where Hg C and Hg F can be inferred to have occurred close together and well outside the range of Hg D and “maybe” Hg DE. An important expansion of Y-DNA by gene flow from this confined region could at least explain why most current Y-DNA groups are young and genetically related. However, it is still difficult to link a late CF expansion, everywhere and to the same degree, to the swamping of pre-CF groups other than DE or D. The CF expansion thus appears congruent with more recent events that seem to be related to the settlement of new geographical niches by people who carried microcephalin-1, initially centered in cold, northern climates.

Any close tie between microcephalin-1 and contemporary Y-DNA haplogroups would have carried Hg F+ to the west (including ancestral Hg K2) and Hg C3 to the north, by migration or gene flow. For Hg D, closeness to the center might have safeguarded its survival, since it stayed well behind the full impact of the wave of advance, which according to Fisher (1937) follows a strict mathematical pattern of outwardly increasing frequencies and leaves only the center virtually unaltered. Lower microcephalin-1 frequencies in most of the regions where Y-DNA Hg D has a patchy distribution give no evidence that Y-DNA Hg D actively participated in spreading the microcephalin-1 gene, or that it spread by any “recent” gene flow whatsoever. In contrast, J. R. Luis et al. (2004) note that the patchy distribution of Y-DNA K2-M70 in Africa “may be a remnant of a more widespread occupation. Subsequent demic events introducing chromosomes carrying the E3b-M35, E3a-M2, G-M201, and J-12f2 haplogroups may have overwhelmed the K2-M70 representatives in some areas.” K2 back migrations could thus potentially account for microcephalin-1 in Africa.
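As a rough aid to the wave-of-advance argument, the sketch below evaluates the classical result attributed to Fisher (1937): an advantageous gene spreads at an asymptotic speed of sigma * sqrt(2 s) per generation, where s is the selective advantage and sigma the standard deviation of the parent-offspring dispersal distance. This only applies to the non-neutral scenario. The function names and all numbers used (s = 0.02, sigma = 25 km, a distance of 5,000 km) are illustrative assumptions, not values taken from this post or from any study cited here.

```python
import math


def fisher_wave_speed(selection_coefficient, dispersal_sd_km):
    """Asymptotic speed (km per generation) of Fisher's wave of advance.

    Uses speed = sigma * sqrt(2 * s). Both inputs are hypothetical
    parameters chosen by the caller, not measured values.
    """
    if selection_coefficient <= 0 or dispersal_sd_km <= 0:
        raise ValueError("both parameters must be positive")
    return dispersal_sd_km * math.sqrt(2.0 * selection_coefficient)


def generations_to_cover(distance_km, selection_coefficient, dispersal_sd_km):
    """Number of generations the wave front needs to travel distance_km."""
    speed = fisher_wave_speed(selection_coefficient, dispersal_sd_km)
    return distance_km / speed


if __name__ == "__main__":
    # Illustrative assumption: 2% advantage, 25 km dispersal per generation.
    print(fisher_wave_speed(0.02, 25.0))
    print(generations_to_cover(5000.0, 0.02, 25.0))
```

With these assumed values the front moves about 5 km per generation, so 5,000 km would take on the order of 1,000 generations; other choices of s and sigma shift this considerably.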

The picture that emerges is not fully consistent with the assumptions of the Recent Out of Africa model. The implied dominant position of African populations didn’t last long and was soon superseded by back migrations that even brought exogenous DNA from Asia. Moreover, the dating results for Y-DNA can’t easily be synchronized with autosomal DNA, and only very roughly with the dating results of mtDNA.

Y-DNA evidence against multiregionalism starts from the accepted age of the non-recombining Y-DNA mutations involved, and subsequently tries to verify the results of other disciplines, in this case paleoanthropology. Nobody ever bothered to reverse the verification. We could just as well start from paleoanthropology and verify our current level of understanding of the human genome.

So far, dating has broadly been based on three assumptions:

– Y-DNA is susceptible to random change since it consists of virtually 100% junk.
– Mutation rates can be derived from the number of substitutions found between a chimpanzee sequence and a human sequence.
– The Out of Africa scenario can be used as a circular argument to calibrate the age.

Later I will go into more detail to show the folly of each of these assumptions. The first efforts to verify mutation rates were indeed designed to show that the mutation rate applies as an average to the whole of the Y-DNA (Xue et al., 2009). Unfortunately no verification on the full chromosome has been carried out yet; the latter study tested only about 30% of the Y-DNA (~10 Mb). Rates lower by up to a factor of 3 are still possible, though this would hardly be enough to make Y-DNA mutation rates fully consistent with alternative evolutionary scenarios such as the Early Out of Africa or Back and Forth theories, which start from a common ancestor about 2.7 million years ago in Africa. Multiregional populations that were not fully replaced much later by African arrivals of modern humans could show similar mutation counts if at most 3% of the Y-DNA were junk that can mutate freely, against 97% “evolutionary” Y-DNA that remains fixed by selective forces. Could our ignorance of Y-DNA approximate 97%, or does our failure to produce credible Y-DNA dates that approximate tangible Out of Africa results have to do with our ignorance of how Y-DNA evolved?
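The scaling argument in the preceding paragraph can be checked with a few lines of arithmetic. This is a minimal sketch under two stated assumptions: that a molecular date is inversely proportional to the mutation rate used, and that if only a fraction of sites is free to mutate, the effective rate is that fraction of the nominal one. Taking the roughly 70 kya Out of Africa date quoted earlier as the date to be stretched is my choice for illustration; the function names are hypothetical.

```python
def rescaled_age(age_years, rate_factor):
    """Age implied when the true rate is rate_factor times the assumed rate.

    Assumes age is inversely proportional to the mutation rate.
    """
    if rate_factor <= 0:
        raise ValueError("rate_factor must be positive")
    return age_years / rate_factor


def free_fraction_needed(age_years, target_age_years):
    """Fraction of freely mutating sites that stretches age to target age."""
    if age_years <= 0 or target_age_years <= 0:
        raise ValueError("ages must be positive")
    return age_years / target_age_years


OUT_OF_AFRICA_YEARS = 70_000      # date quoted in the text (Karafet et al.)
DEEP_ANCESTOR_YEARS = 2_700_000   # common ancestor of the alternative scenarios

if __name__ == "__main__":
    # A rate three times lower than assumed triples the date.
    print(rescaled_age(OUT_OF_AFRICA_YEARS, 1 / 3))
    # Fraction of sites that would have to be free to mutate.
    print(free_fraction_needed(OUT_OF_AFRICA_YEARS, DEEP_ANCESTOR_YEARS))
```

Under these assumptions a threefold lower rate moves 70 kya to about 210 kya, far short of 2.7 million years, while reaching 2.7 million years needs roughly 2.6% of sites free to mutate, in line with the "at most 3%" figure above.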

Just imagine that Y-DNA Hg A and Hg B were already present in Africa, while Homo erectus developed in Asia and returned to Africa about 1.8 million years ago to meet and breed there with older human lineages. This would correspond to a scenario where DE and CF developed from M168 emigrants that had previously coexisted with A and B populations in Africa. Since M168 has indeed been attested in Africa, the main back migration in this scenario would have been contemporaneous with Homo erectus/ergaster and Y-DNA Hg E. In Africa they would have mixed with Homo habilis, while the returning Homo erectus represented only a subset of the Eurasian Homo erectus: Hg D and Hg CF had already branched off when they returned.

Unfortunately the Recent Single Origin scenario doesn’t give such a perfect fit to the genetic record. The flip-side of the apparent match between the current Y-DNA distribution and the paleo-anthropological evidence, however, is the recent dating of the genetic record.


  • Patrick D. Evans et al. – Microcephalin, a Gene Regulating Brain Size, Continues to Evolve Adaptively in Humans, 2005
  • Patrick D. Evans et al. – Evidence that the adaptive allele of the brain size gene microcephalin introgressed into Homo sapiens from an archaic Homo lineage, 2006
  • Tatiana M. Karafet et al. – New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree, 2008
  • J. R. Luis et al. – The Levant versus the Horn of Africa: Evidence for Bidirectional Corridors of Human Migrations, 2004
  • Y-Chromosome Phylogenetic Tree (FamilyTreeDNA), 2008
  • Yali Xue et al. – Human Y Chromosome Base-Substitution Mutation Rate Measured by Direct Sequencing in a Deep-Rooting Pedigree, 2009
Categories: DNA, Paleoanthropology

This is my (evolving) view on the would-be “neutrality” of Y-DNA marker inheritance

December 6, 2009

It has been demonstrated that the T-13910 gene for lactase persistence in particular (having an age of 5000-1000 years) spread rapidly. In my view, this event could explain the current distribution of Y-DNA R haplogroups in particular by a genetic sweep. This is not to say that LP has a gender-related impact. However, if the advantage was more pronounced for men, the selective forces that caused the gene to expand may have triggered a sex-biased gene flow that boosted the frequencies of these particular Y-DNA haplogroups.

Mapping sex-biased gene flow is important because migrational patterns can only be visualized once the gene flow is accounted for.

Sex-biased gene flow by lactase persistence
* European admixture in Native Americans and African Americans shows higher Y-DNA and lactase persistence (LP) levels than autosomal results warrant. European admixture in African Americans averages 20%, while 23% R1b among African Americans against 47% R1b among Europeans would yield 49% European admixture in the male line, or a minimum of about 25% overall assuming 0% female European contribution. This approximates the possible selective advantage of the European LP gene T-13910: assuming root origins of on average 33% West Africans, with current LP levels ranging from 10% to 25%, and 66% Central Africans, with current LP levels between 3% and 16%, it would take 27% to 40% European admixture to reach the current 35% LP among African Americans, compared to 78% LP among European Americans. The higher value is probably more correct, since higher African LP levels are largely confined to cattle herders, who for various reasons hardly contributed to African American roots. The same bias in Hg R frequency levels can be suspected among Native Americans. These results strongly suggest that the natural behaviour of genes is in some cases more telling than the historic behaviour of people, especially when first contact between disparate peoples is followed by increased selection pressures caused by harsh conditions, diseases and changed nutrition patterns (e.g. milk).

* Gene flow due to the spread of advantageous genes is essential for keeping low-migratory species together. At least from the Upper Paleolithic onwards, humans qualify as such a low-migratory species, featuring both extremely low diversity and a vast range that extends to remote and geographically isolated regions like Australia.

* Hg N1c (N3) and Hg R1a have distribution patterns that are virtually complementary to one another, suggesting a close correlation, possibly because both spread contemporaneously and before R1b?

* Concerning the correlation between R1a and R2 in India, Manoukian (2006) wrote: “The frequencies of R2 seem to mirror the frequencies of R1a (i.e. both lineages are strong and weak in the same social and linguistic subgroups). This may indicate that both R1a and R2 moved into India at roughly the same time or co-habited, although more research is needed.” Here, the complementary distribution is related to a demic expansion that involved large population movements, where high variance became indicative of high arrival rates. Hg R2 evidently spread ahead of Hg R1a in one single event, owing to an initial location closer to the Indian subcontinent than that of Hg R1a. Probably both were introduced into the main Indian subcontinent by invading Indo-Aryans and only show vestiges of the original and essentially genetic wave front, which until then had reached only as far as Pakistan. However, the migrational spread of Hg R1a and Hg R2 into India was also definitely enhanced, since it does not reflect real admixture rates: most probably by the same sex-biased advantageous gene, whether or not it locally developed the behaviour of a mutational wave front.

* The Hg R (R1b+R1a+R2) and Hg N1c Y-DNA haplogroups show distribution patterns that resemble the Fisher (1937) wave front for advantageous autosomal genes: they feature peak values far removed from their most likely location of origin. This suggests a sex-biased correlation of these specific haplogroups with one specific advantageous autosomal gene.

* A sex-biased expansion of an advantageous autosomal mutation would severely compromise the neutrality of non-autosomal marker genes. Along the sex-biased wave front, ALL genes of the first individual that initiated the gene flow or mutation will be diluted, EXCEPT for the advantageous autosomal gene, a sweep area located close to the mutation, and the (biased) Y-DNA marker.

* Hg R (R1b+R1a+R2) peak concentrations form a global ringed pattern on the Eurasian map that is centered on the surroundings of Northern Pakistan. Each ring has a high internal correlation that is difficult to explain by demic diffusion and is better explained by genetic diffusion.

* I propose that a sex-biased, two-rippled wave front correlated with the lactase persistence gene T-13910 is responsible for the current global distribution pattern of Hg R (R1b+R1a+R2) and Hg N1c Y-DNA. The wave front was omnidirectional and had its epicenter close to northern Pakistan.
** The first ripple was composed of Hg R2 in a wave front going south (peak values in Sri Lanka) and south-west (peak values in Kurdistan), and of R1b-M269 in a wave front going east (local R1b peak values in East Turkestan) and west. To the west, one R1b-M269 branch deviated SW towards the Caucasus (peak values in Ossetia) and another branch propagated to western Europe. Possibly halfway along, the western branch experienced the U106 mutation, which propagated further on the wave front with increasing momentum to northwestern Europe, and the S116 mutation, which crossed the Alps to the south and southwest of Europe. The wave encountered a natural barrier in Mesopotamia and northeastern Africa due to older lactase persistence genes that have been attested in this specific region. Another barrier was encountered to the east, probably due to reduced cultural receptivity (China, “pre-Vedic” southern India).
** The second ripple followed suit and was composed of Hg R1a and Hg N1c (which have complementary distributions). This ripple was also omnidirectional, with concentrations increasing towards the edges, but shows a sharp cline where the first ripple had already propagated the advantageous gene to near saturation levels. A western wave front reached Hg R1a peak values in Poland and Ukraine and Hg N1c peak values in Lapland. A northern wave front reached Hg N1c peak values in the Arctic. An eastern wave front reached peak values in the Altai mountains. A southern wave front reached peak values in northern India.

* The virtual absence of LP T-13910 in the Near East and northeastern Africa might be due to other lactase persistence genes already present there. High values of LP T-13910 in western Africa, however, might point to additional physical migration of cattle-rearing people who crossed areas that LP T-13910 could not pass by gene flow alone. R1b-M335 peak values in Cameroon, which show virtually no cultural or linguistic correlation, might (equally) be due to a mutational wave front surfing on T-13910 going SW. The reason why the apparently superfluous East African lactase persistence mutations did not propagate as successfully (Africa is still largely lactose intolerant) must have been the existence of sharp cultural boundaries around cattle people and their nutrition patterns. Correspondingly, the reason why LP T-13910 was so successful in Eurasia must have been cultural receptivity and the widespread availability of mixed economies (common from the ceramic Mesolithic and Neolithic periods onwards), and possibly also the large geographic extent of only a few cultural units (Uralic, Indo-European?).

* Low estimates of Y-DNA haplogroup ages by definition involve an evident deviation from the one father-one son model. They would suggest that the average genetic drift in human populations is so large that the process resembles a perpetual bottleneck combined with perpetually huge growth for the most prolific lineages. Equally, the most prolific lineages are the most likely to be sampled and are by this logic the most prone to highly biased age estimates.

* Recognizing the bias of Hg N1c and Hg R1a distribution patterns towards Uralic people would weaken the argument for an eastern origin and be consistent with archeological studies pointing to a European origin. In particular, Uralic people could be derived from Swiderian post-LGM expansions that started in western Poland and continued into post-Swiderian cultures, reaching Russia and the Arctic region. Such an eastward impulse of the Uralic cultures would make a Uralic identity of the Andronovo steppe cultures (close to Samoyed and Ugrian territories) most probable. Ugrian steppe migrations from this region traditionally account for the Magyar migration to Hungary.

* Recognizing the bias of Hg N1c distribution patterns towards Sami people would increase the odds of an ancient presence of the Sami in northern Europe. Uralic roots of presumed Sami borrowings into Scandinavian languages, together with archaic language features, would indicate an ancient Uralic identity of the Sami people. Close contact with Finnish Uralians could have prevented isolation and might have drawn the Sami language back to the Uralic family. Thus, considering the absence of post-Swiderian horizons in western Scandinavia, the Sami presence may extend back to Ahrensburg origins. The original Uralic homeland could thus be located between the North Sea and western Poland.

* The African-like scenario of a long initial isolation of cattle-rearing populations could have caused Hg R to flourish among Central Asian cattle rearers within small related communities, subject to genetic drift. Thus, Asian LP became tightly coupled to a broader Hg R spectrum, since at the time of the LP T-13910 expansion Hg R had already diversified into R1b, R2 and R1a and was available in high, homogeneous concentrations among different geographically connected communities. In this way, the expansion of the gene to the south may have been triggered by a community on the southern fringes of the theorized Central Asian cattle rearers’ territory, while the expansion to the west, north and east was triggered by communities carrying Hg R1b. Small numbers of R1b-carrying cattle rearers would have migrated physically in the direction of Anatolia and Africa, where older LP mutations would otherwise have inhibited a typical wave front spread of the gene. A community carrying R1a in the center expanded first and subsequently triggered a second wave front that roamed Central Asia, where the LP T-13910 concentration was still low in the wake of the first wave front, which only gained momentum, and even LP T-13910 saturation levels, much further away, in accordance with the mathematics of a wave front. The advance of R1a straight to the north apparently became intertwined on the wave front with Hg N, possibly due to an additional adaptation, necessary to survive the cold, that became incorporated in the northern wave front.

* The cline of LP and Hg R suggests that (sex-biased) evolutionary pressures might have mattered considerably less than the mere physical nature of the initial wave front. The local diversity of R1b at the Pyrenees, including rare subclades, resembles a terminal moraine that temporarily froze a mutational frontline on the edge of wide-area incremental population pressures and shifts in glacial motion, rather than the direct result of local selective advantages. LP can only be of selective advantage in cultural areas that switched to dairy production, and cultural areas can’t switch to dairy production at will without a substantial part of the population having the LP gene. Thus, the advance of a wave front tied to an advantageous gene like LP would have stopped at barriers where local culture hampered or contradicted a straight advance of the gene. Also, the higher survival rates on the other side of the cultural barrier would have caused population pressures that propelled LP-carrying people into neighbouring areas by normal marriage patterns or diffusion. The Pyrenees might have been such an area, where LP and Hg R accumulated like drift ice, dropping mutations at increasing subclade diversity like boulders at a terminal moraine, until the wave front could move on, along with the mutational frontline, across the Pyrenees to Iberia and the Atlantic Ocean. The Caucasus might have represented a similar barrier to LP T-13910 related population pressures from the north, though one sustained by a combination of survival rates to the north and other LP genes already in place to the south. Thus, contrary to the Pyrenees, the mutational frontline never crossed the geographic barrier, and an even higher Hg R diversity should be expected in the Caucasus.

* The advance of sex-biased Y-DNA like Hg R and Hg N in combination with LP T-13910 would have progressed into new areas as long as the genetic survival advantage in neighbouring areas was higher and incremental demic pressures on the new areas were maintained. This process could have been accompanied by a moderate diffusion of eastern influences, with a clear survival of local culture. Only a cultural change to dairy production, to gain a competitive advantage from LP, could have turned the tide. A reverse demic pressure could have been the result, for instance U106 moving east “against the current”.
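The admixture arithmetic in the first item of the list above can be reproduced with a simple two-source mixing model. This is a minimal sketch, assuming that R1b was absent from the African ancestral populations (needed to read 23% against 47% as an admixture share) and that the admixed LP level is a weighted mean of the European and African levels. The function names are hypothetical; the percentages are the ones quoted in that item.

```python
def y_line_admixture(freq_admixed, freq_source):
    """Male-line share from the source population.

    Assumes the marker is absent in the other parental population.
    """
    return freq_admixed / freq_source


def weighted_baseline(groups):
    """Weighted mean level over (weight, level) pairs; weights are normalized."""
    total_weight = sum(weight for weight, _ in groups)
    return sum(weight * level for weight, level in groups) / total_weight


def admixture_for_trait(target, source_level, baseline):
    """Solve m * source_level + (1 - m) * baseline = target for m."""
    return (target - baseline) / (source_level - baseline)


if __name__ == "__main__":
    male_line = y_line_admixture(0.23, 0.47)
    print(male_line)        # share of European male lines
    print(male_line / 2)    # overall share if no female contribution

    # African LP baseline from 33% West and 66% Central African roots.
    high_baseline = weighted_baseline([(0.33, 0.25), (0.66, 0.16)])
    low_baseline = weighted_baseline([(0.33, 0.10), (0.66, 0.03)])
    print(admixture_for_trait(0.35, 0.78, high_baseline))
    print(admixture_for_trait(0.35, 0.78, low_baseline))
```

Under these assumptions the male-line figure comes out at about 49% (about 24% to 25% overall without female contribution), and the LP-based admixture at roughly 27% to 41%, close to the 27% to 40% range quoted above. The lower African baseline gives the higher admixture estimate.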

Summary of what may have happened in Europe:

* A first (thin) wave of R1b by gene flow along a mutational frontline on a Neolithic substratum, arriving as homogeneous M269 in Europe (probably together with the first cattle about 5000 BC, late Neolithic, through Anatolia, and probably not related to ethnic changes).
* A second wave of (homogeneous) R1a and R2 by gene flow, which didn’t reach as far as the first wave because of an LP saturation effect between 2750 and 2500 BC.
* In the meantime, incremental SNP changes of R1b-M269 between 5000 and 2500 BC along the route, and the development of U106/P312, probably within an isolated pocket in NW Europe.
* Expansion to the east of (migrational) U106/P312 once the saturation process of LP genes was complete.
* A continuation of the gene flow wave on a Neolithic substratum, carrying homogeneous subclades of P312 further south on a mutational wave front.
Later expansion may have been partially congruent to Bell Beaker and Bronze Age culture.

The early availability of R1a in eastern Germany (Eulau) illustrates how much of R1a was later pushed back by migrational L11+. The striking similarity of Eulau R1a (2600 BC) to the current STR averages for most Hg R1a between Germany and Central Asia, and the geographic STR cline in an easterly direction towards ancient Andronovo Hg R1a and beyond, as described by Anatole Klyosov (2008), may still reflect vestiges of an original mutational wave front, where the effects of sex-biased selective forces must have preceded migrational expansions.


  • Vallone et al. – Y-SNP Typing of U.S. African American and Caucasian Samples Using Allele-Specific Hybridization and Primer Extension, 2004
Categories: DNA