Archive for March, 2010

Denisova Cave and the Mystery of the mtDNA Phylogenetic Tree

March 27, 2010

Nobody expected a great surprise. Genetic testing of the little finger of an early hominin child found in the Siberian Denisova Cave, Altai, in the middle of archeological remains pertaining to Upper Paleolithic culture, would almost certainly confirm DNA similar to ours. There was a slim chance that the pinky belonged to a Neanderthal from the neighborhood that got lost, but everything pointed at an unequivocal member of the advanced group of hominins responsible for introducing symbolic art all over the world, the so-called anatomically modern humans (AMH).

The collection of personal adornments and artifacts suggestive of symbolic behavior from the Early Upper Paleolithic deposits of Denisova Cave, Altai, is one of the earliest and the most representative of the Upper Paleolithic assemblages from Northern and Central Asia. Especially important is a fragment of a bracelet of dark-green chloritolite, found near the entrance to the eastern gallery of the cave in the upper part of stratum 11. The estimated age of the associated deposits is ca 30 thousand years. According to use-wear and technological analysis, techniques applied for manufacturing the specimen included grinding on various abrasives, polishing with skin, and technologies that are unique for the Paleolithic – high-speed drilling and rasping. The high technological level evidences developed manual skills and advanced practices of the Upper Paleolithic cave dwellers. (Derevianko et al., 2008)

Humans spend more time gathering around the campfire to celebrate their victory over Nature, only to challenge evolution in an entirely new way.

Neanderthals were readily dismissed as potential authors of local Upper Paleolithic art, due to what boils down to a deep distrust of anything that would deem them capable of such a feat, and they were the only other early hominins around that we knew of – at least culturally speaking, since we don’t have much more than a little pinky after all. And indeed, the first genetic results showed the world was right about one thing: the little finger did not belong to a Neanderthal child. But nobody could have guessed how wrong the usual lot of junk scientists were about almost everything else. This was not a child of the same flesh and blood as modern humans, but a member of a previously unknown ancestral human subgroup.

Dr. Johannes Krause, of the Max Planck Institute in Germany, sequenced the entire mitochondrial DNA (mtDNA) genome and showed that it differs from modern human mtDNA at almost twice as many positions as Neanderthal mtDNA does. You can find the genome at GenBank or EMBL using record ID FN673705 and check it out for yourself: even Neanderthal was a close relative of modern humans compared to this hominin!

A phylogenetic analysis similarly shows that the Denisova hominin mtDNA lineage branches off well before the modern human and Neanderthal lineages (Fig. 3). Assuming an average divergence of human and chimpanzee mtDNAs of 6 million years ago, the date of the most recent common mtDNA ancestor shared by the Denisova hominin, Neanderthals and modern humans is approximately one million years ago (mean = 1,040,900 years ago; 779,300–1,313,500 years ago, 95% highest posterior density (HPD)), or twice as deep as the most recent common mtDNA ancestor of modern humans and Neanderthals (Krause, 2010)

Established paleoanthropology is now faced with the challenge of rewriting the book of human evolution. And of course, first things first, the dates were adjusted to make a better fit with pre-AMH cultures:

We note that the stratigraphy and indirect dates indicate that this individual lived between 30,000 and 50,000 years ago. At a similar time individuals carrying Neanderthal mtDNA were present less than 100 km away from Denisova Cave in the Altai Mountains, whereas the presence of an Upper Palaeolithic industry at some sites, such as Kara-Bom and Denisova, has been taken as evidence for the appearance of anatomically modern humans in the Altai before 40,000 years ago. (Krause et al., 2010)

Nobody has ever heard of pre-AMH bracelets, so let’s conveniently forget for a while about that fragment of a polished bracelet with a drilled hole that was found earlier in the same layer that yielded the bone. Is it possible that here we have evidence that points to a third species, next to Neanderthal and AMH? A species that might have been as civilized as an AMH, or a beast our ancestors didn’t breed with, or anything else that didn’t involve “us” so we can understand? Krause’s publication carefully omitted this pressing question, and word went out that Krause surely already had access to autosomal data that could explain why. That Denisova child might have been anything but a Yeti.

Sure, mtDNA doesn’t make a species, no matter how different it may be from modern humans. There was no need for Krause to mention this. But divergence of mtDNA lineages has been taken as an indication of divergent hominin developments before, explicitly with respect to Neanderthal, whose attested and validated mtDNA lineage was deemed sufficiently homogeneous and different from ours to provoke a definite verdict. However, now we have the Denisova mtDNA sample to teach us modesty. After all, there are lots of things about mtDNA that need better understanding before we can even attempt to solve the question of how the modern forms spread, and how they evolved.

Everything conspires against the notion that paleogenetic mtDNA of Neanderthal, and now even more so the mtDNA from Denisova Cave, might be the precursor of modern mtDNA. It couldn’t have evolved so rapidly into modern mtDNA. A study on 44,000-year-old remains of Adélie penguins in Antarctica even confirmed the potential overestimation of the mutational change that is used for dating mtDNA of paleogenetic samples. This stems from a bias caused by nonsynonymous mutations, which involve notable coding changes that are potentially deleterious and most likely won’t persist very long due to natural selection. Accordingly, only a portion of the mutational changes can be observed over a longer period of time:

Rates of evolution of the mitochondrial genome are two to six times greater than those estimated from phylogenetic comparisons. (Subramanian et al., 2009)

The investigation showed that only synonymous mutations (“silent mutations”) in the mtDNA genome, which swap one codon for another that encodes the same amino acid, accumulate at a stable rate. To retrieve the phylogenetic dates only these “silent mutations” should be measured, i.e. changes in coding genes that won’t affect the function of the gene. Mutations that effectively change the functionality of a gene, and thus are most likely to be (slightly) harmful, get lost over time, since such mutations would finally bring about the extinction of a lineage and thus shouldn’t count for calculating the age of surviving lineages. The mtDNA “molecular clock” thus should only involve properly identified “silent mutations”.
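
To make the distinction between silent and coding changes concrete, here is a minimal Python sketch that classifies a codon change under the vertebrate mitochondrial genetic code (NCBI translation table 2). The function names and the toy sequences are my own illustration and are not taken from any of the cited studies, and the per-codon count is a deliberate simplification of what a real analysis would do.

```python
# Vertebrate mitochondrial genetic code (NCBI translation table 2).
# Amino acids are listed for codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "nonsynonymous"


def count_changes(seq_a, seq_b):
    """Count differing codons between two aligned, in-frame coding sequences.

    Simplification (assumption): a codon that differs at more than one
    position is still counted as a single change.
    Returns a tuple (synonymous, nonsynonymous).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = nonsynonymous = 0
    for pos in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[pos:pos + 3], seq_b[pos:pos + 3]
        if codon_a.upper() == codon_b.upper():
            continue
        if classify_substitution(codon_a, codon_b) == "synonymous":
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


if __name__ == "__main__":
    # Hypothetical toy sequences, not real mtDNA.
    print(count_changes("ATGCTAATA", "ATGCTGATT"))
```

Note that ATA codes for methionine in the mitochondrial code but for isoleucine in the standard nuclear code, so a change from ATA to ATT is silent in a nuclear gene and coding in a mitochondrial one. Using the wrong table would misclassify such changes.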
These results were also important for interpreting the paleogenetic mtDNA samples of hominins.

Mildly deleterious mutations initially contribute to the diversity of a population, but later they are selected against at high frequency and are eliminated eventually. Using over 1,500 complete human mitochondrial genomes along with those of Neanderthal and Chimpanzee, I provide empirical evidence for this prediction by tracing the footprints of natural selection over time. The results show a highly significant inverse relationship between the ratio of nonsynonymous-to-synonymous divergence (dN/dS) and the age of human haplogroups. Furthermore, this study suggests that slightly deleterious mutations constitute up to 80% of the mitochondrial amino acid replacement mutations detected in human populations and that over the last 500,000 years these mutations have been gradually removed. (Subramanian, 2009)

Interestingly, this dN/dS ratio among Neanderthals was initially reported to be strikingly high.

These results suggest that slightly deleterious amino acid variants segregate within populations, and that differences in the intensity of purifying selection may affect mtDNA dN/dS ratios. Previous estimates based on mean pairwise differences (MPD) within the mtDNA HVRI suggested that Neandertals (MPD = 5.5) had an effective population size similar to that of modern Europeans (MPD = 4.0) or Asians (MPD = 6.3), but lower than that of modern Africans (MPD = 8.1) (Krause et al., 2007b). Recent population genetic analyses have revealed a higher mtDNA amino acid substitution rate (Elson et al., 2004) and relatively more deleterious autosomal nuclear variants (Lohmueller et al., 2008) in Europeans than in Africans, presumably due to the smaller effective population size of Europeans. Thus, it seems plausible that Neandertals had a long-term effective population size smaller than that of modern humans. (Green et al., 2008)
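
The mean pairwise difference (MPD) mentioned in this quote is simply the average number of positions at which two sequences from the same sample differ. A minimal Python sketch, assuming aligned sequences of equal length; the function name and the toy sequences are hypothetical and are not HVRI data:

```python
from itertools import combinations


def mean_pairwise_differences(sequences):
    """Average number of differing positions over all pairs of sequences.

    Assumes the sequences are already aligned and of equal length.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must have equal length")
    pairs = list(combinations(sequences, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    print(mean_pairwise_differences(["ACGT", "ACGA", "TCGA"]))
```

A higher MPD means the sequences in a sample are on average more different from each other, which is why it is read as a rough proxy for effective population size in the passage above.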

However, the new information supplied by the Denisova hominin reveals this assumed feature of Neanderthal mtDNA was actually a mistake:

The 12 proteins encoded by the Denisova hominin mtDNA (excluding ND6, Supplementary Information) show low per-site rates of amino acid replacements (dN) when compared to the per-site rates of silent substitutions (dS), consistent with strong purifying selection influencing the evolution of the mitochondrial proteins (dN/dS=0.056). Notably, when the evolution of mitochondrial protein-coding genes in modern humans, Neanderthals, chimpanzees and bonobos is gauged in conjunction with the Denisova hominin mtDNA, a previously described reduction of silent substitutions causing an increased dN/dS in Neanderthals is not observed. This is probably due to a more accurate reconstruction of substitutional events when the long evolutionary lineage leading to modern humans and Neanderthals is subdivided by the Denisova hominin mtDNA (see Supplementary Information) (Krause et al., 2010)
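
For readers who want to see what a dN/dS figure like 0.056 is made of, here is a rough Python sketch of the ratio in the style of the Nei-Gojobori method with a Jukes-Cantor correction. This is my own illustration under stated assumptions, not the procedure Krause et al. used, and the counts in the example are invented, not taken from the Denisova genome.

```python
import math


def jukes_cantor_distance(proportion):
    """Correct an observed proportion of differences for multiple hits."""
    if not 0 <= proportion < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * proportion / 3)


def dn_ds_ratio(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """dN/dS from counts of differences and of available sites.

    dN = corrected nonsynonymous differences per nonsynonymous site,
    dS = corrected synonymous differences per synonymous site.
    """
    dn = jukes_cantor_distance(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor_distance(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("no synonymous divergence, ratio undefined")
    return dn / ds


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the order of magnitude.
    print(round(dn_ds_ratio(10, 1000, 50, 300), 3))
```

A ratio far below 1 means amino acid replacements are much rarer per site than silent changes, which is what “strong purifying selection” refers to in the quote.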

The immediate result of these new finds is that an earlier proposed reduction in length of the Neanderthal mtDNA lineage “about three times as large as would be expected if it was entirely due to the age of the fossil” (Green, 2008), resulting in an earlier common ancestor with modern humans, is wrong. The shrunken phylogenetic tree was accordingly corrected by Krause: the mean age of the most recent common mtDNA ancestor of modern humans and Neanderthals went down from 660,000 to 465,700 years ago.

(mean = 465,700 years ago; 321,200–618,000 years ago, 95% HPD) (Krause et al., 2010)
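
As a quick sanity check on the figures quoted so far, the small Python sketch below compares the mean dates. The numbers are the ones cited in the text; the variable names are mine, and the calculation ignores the wide 95% HPD intervals, so it only says something about the means.

```python
# Mean ages in years, as quoted from Krause et al. (2010) and the text above.
DENISOVA_SPLIT = 1_040_900        # Denisova vs. modern humans + Neanderthals
NEANDERTHAL_SPLIT = 465_700       # modern humans vs. Neanderthals (revised)
NEANDERTHAL_SPLIT_OLD = 660_000   # earlier estimate for the same split


def ratio(older, younger):
    """How many times deeper the older date is than the younger one."""
    return older / younger


if __name__ == "__main__":
    print(round(ratio(DENISOVA_SPLIT, NEANDERTHAL_SPLIT), 2))
    print(round(ratio(NEANDERTHAL_SPLIT_OLD, NEANDERTHAL_SPLIT), 2))
```

The first ratio comes out a little above two, in line with the “twice as deep” wording of the paper, and the second shows that the revision shortened the human and Neanderthal split by roughly 30 percent.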

The fact that contemporary dN/dS values of modern humans are high, especially among Europeans, is likewise explained by current assumptions of a younger age or (in the case of Europeans) a smaller effective population size. Might this be another lousy interpretation of results that are barely understood? It could be another example of a solution that supplies an easy way out of a complex issue.

There might be more. COX2 is a coding gene located on mtDNA. According to Green et al. (2008):

COX2 has experienced four amino acid substitutions on the human mtDNA lineage after its divergence from the Neandertal lineage […]

Fixed mutations indeed tend to define both human lineages as monophyletic blocks. But the paper only mentions COX2 as a potential indication of divergent evolution, and given the new information revealed by the Denisova hominin nothing remains of Green’s assertion that Neanderthal coding mtDNA is strikingly different from modern human mtDNA. The main argument why this would be irreconcilable with a continuous development can now be rejected:

The observation of four nonsynonymous substitutions on the modern human lineage, and no amino acid changes on the Neandertal lineage, stands in contrast to the overall trend of more nonsynonymous evolution in Neandertal protein-coding genes (Table 1), and deserves consideration. (Green, 2008)

There is NO overall trend among Neanderthals towards more nonsynonymous evolution, hence the four amino acid changes that correspond to four nonsynonymous substitutions on the modern human lineage do not indicate a striking new tendency, since this kind of mutation happened all the time, also among Neanderthals.
The age calculations gain in reliability once the synonymous mutations involved are better identified and anchored on the phylogenetic tree, by comparing more hominins and branches. Quite considerable purifying selection has now been identified as applicable to both Denisova and Neanderthal mtDNA. However, the mtDNA of an old skeleton in Australia already showed us that none of this brings us closer to the mtDNA of modern humans.

Whatever the nuances of detail that scream variety and continuity in human evolutionary development, we can’t deny a striking, almost exclusive unity of AMH mtDNA compared to the different forms that have been recovered from Neanderthal and – even more – Denisova:

The genealogies of mtDNA sequences in most human populations, including Aboriginal Australians, characteristically have very little hierarchical branching structure. This pattern of sequence variation is consistent with a population expansion following a population bottleneck and is generally taken as supporting the recent out of Africa model. Under this model, all contemporary sequences spread globally with an expanding population that replaced all other people and all other lineages. Africa has been postulated as the source of the expansion because some populations in Africa have more sequence diversity than populations anywhere else. (Adcock et al., 2001)

Almost, since the discovery of ancestral mtDNA of the gracile early human found at Lake Mungo, Australia (code named LM3, dated to 62 kya at the time and later revised to 40 kya by Bowler et al., 2003), who is unmistakably an AMH, also attests to the extinction of quite distinct outliers. There must have been a huge and progressive selective thrust towards modern mtDNA. The mtDNA of LM3 was kind of “modern” alright, but its genetic distance definitely fell outside the range of modern humans. The investigators observed that this find poses a serious challenge to the “interpretation of contemporary human mtDNA variation as supporting the recent out of Africa model” (Adcock et al., 2001), effectively reducing Africa to a refuge for outgroups that have accumulated change and drifted apart rather than being a true indication of the source of all AMH related mtDNA. But even more so, the find strongly indicates that the current lack of hierarchical branching structure among humans can’t be understood as the direct result of a succession of AMH migration waves alone. Some waves phased out and lost their origin from the record. Could it be possible that something about mtDNA triggered the worldwide substitution of extremely divergent older forms by the reduced array of current forms? Then how did this happen?

The mtDNA genome, modelled as a circle.

Let us regard the issue in a wider genetic perspective and forget about cheap scenarios of cannibal hominins exterminating each other, a view that conveniently ignores autosomal evidence of inter-hominin gene flow. One little segment of non-coding mtDNA can be found on the Displacement (D-) loop or control region, which is involved in repair activities. It has an analogy in the telomeres of nuclear DNA, which are highly prone to insertion and deletion processes. This little region may be subject to the random change and stochastic speed-density that are necessary to infer a neutral “molecular clock”, but the location of this region on the mitochondria introduces a substantial bias in the basic assumption of overall neutrality. I will return to this issue.
Several studies have demonstrated the ongoing transfer and integration of mitochondrial DNA sequences into nuclear chromosomes. The evolutionary inclination of mtDNA sequences to move from the D-loop control region to the nuclear autosomal part of the DNA could be studied in more detail on the paleogenetic sample of an AMH fossil found near Lake Mungo, Australia, dated 40 kya (Bowler et al., 2003):
“His mtDNA belonged to a lineage that only survives as a segment inserted into chromosome 11 of the nuclear genome, which is now widespread among human populations.” (Adcock et al., 2001)

This particular strand of early human (AMH) mtDNA has since vanished from the mitochondrial record all over the world, but the insertion in chromosome 11 flourished, especially outside Africa:

Overall, 39% of chromosomes tested carried the insertion. In four African populations, the frequency of chromosomes carrying the insertion ranges between 10 and 25%, whereas it varies between 38% and 78% in populations tested in Europe, Asia, Oceania, and South America. (Zischler et al., 1995)

Assuming a lower evolutionary rate in nuclear DNA, “these mitochondrial integrations might preserve the ancestral state of the mitochondrial sequence that existed at the time of transposition and could therefore be regarded as ‘molecular fossils.’” (Zischler et al., 1998). A previous investigation of a similar, albeit much older, Insert on chromosome 9 had already established the value of nuclear insertions for reconstructing ancestral mitochondrial sequences of the Most Recent Common Ancestor (MRCA). That insertion “took place on the lineage leading to Hominoidea (gibbon, orangutan, gorilla, chimpanzee, and human) after the Old World monkey–Hominoidea split” (Zischler et al., 1998), in the range of 17–30 MYA in a common ancestor of all hominoids:

Thus, the MRCA sequence deduced from homologous integrations in different species will represent the ancestral mtDNA sequence more reliably and with less sequence ambiguities than an ancestral sequence deduced from the fast-evolving mtDNA sequences. (Zischler et al., 1998)

The remarkable affiliation of the autosomal Insert with both LM3 and hominoid mtDNA. The newly discovered, potentially ancestral affiliation with the Denisova hominin is not drawn.

The Insert on chromosome 11 definitely suggests fossil information of some early AMH individual, or at least of a hominin that interbred with early AMH. The closest match to the mtDNA of this particular individual was indeed an AMH, the gracile LM3 dated 40 kya (Bowler et al., 2003) found in Australia at Lake Mungo. However, a simple comparison of the Insert to the current genome of modern human mtDNA reveals that this individual can’t possibly be the direct ancestor of modern human mtDNA. No close mtDNA matches of either LM3 or the Insert survived, and the mtDNA of LM3 doesn’t indicate direct matrilineal inheritance from the original mtDNA source of this autosomal Insert either.

The LM3 Sequence Belongs to an Early Diverging mtDNA Lineage. The divergence of the LM3 sequence before the MRCA of contemporary human sequences is indicated by its grouping with the Insert sequence, which other reports have suggested diverged before the MRCA of sequences in living humans.
Although this analysis did not reliably establish an early divergence of the LM3/Insert lineage, it demonstrated that the lineage is unusually long. (Adcock et al., 2001)

This presentation of the Insert as a member of a single branch together with LM3 may be an oversimplification. The position of the Insert on the mtDNA phylogenetic tree of humans suggests an even more pronounced outlier:

Upon comparing 243 bp of a human-specific integration (Zischler et al. 1995) that corresponds to the conserved part of the mitochondrial D-loop of all available hominoid (n=14) and human (n=261) mtDNA sequences, only two insert-specific substitutions were traced, with both the ape mtDNA sequences and all human mtDNA sequences being identical at these positions. (Zischler et al. 1998)

A salient detail is that the two Insert-specific substitutions (A at 16259 and C at 16288) are now matched by the mtDNA of the Denisova hominin. Even though the other differences with Denisova are big enough to exclude a close affiliation, this remarkable detail invites the tentative proposal that the divergence of the Insert sequence could have happened long before the MRCA of human sequences that also include LM3.
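
The kind of comparison behind this observation can be sketched in a few lines of Python: first find the positions where one sequence carries a base that no other sequence in the alignment has, then check whether a newly added sequence shares those bases. The function names, the `offset` parameter and the toy sequences are hypothetical; the real comparison involves the 243 bp D-loop alignment described by Zischler et al.

```python
def private_substitutions(target, others, offset=0):
    """Positions where `target` differs from every other sequence
    while all the other sequences agree with each other.

    Returns a list of (position, base) tuples; `offset` shifts the
    0-based index to the numbering used for the alignment.
    """
    found = []
    for i, base in enumerate(target):
        column = {seq[i] for seq in others}
        if len(column) == 1 and base not in column:
            found.append((offset + i, base))
    return found


def shared_by(candidate, privates, offset=0):
    """Subset of `privates` at which `candidate` carries the same base."""
    return [(pos, base) for pos, base in privates
            if candidate[pos - offset] == base]


if __name__ == "__main__":
    # Hypothetical toy alignment; position numbers are illustrative only.
    insert = "ACGTA"
    known = ["ACCTA", "ACCTA", "ACCTG"]
    privates = private_substitutions(insert, known, offset=100)
    print(privates)
    print(shared_by("TCGTG", privates, offset=100))
```

Sharing a handful of such positions is of course weak evidence on its own, since the same base can arise independently at a fast-evolving site, which is why the proposal above stays tentative.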

Between positions 16259 and 16381 the mtDNA variation of the Insert nucleotides is matched by the corresponding nucleotides of apes, Lake Mungo 3, current aboriginal polymorphisms (not drawn) and the Denisova hominin.

This rare view of a deep Eurasian affiliation, combined with extant aboriginal polymorphisms that echo the survival of Insert and LM3 features in the haplogroup N and M branches of modern mtDNA, suggests a much more complicated phylogenetic tree than the one currently in use. Aboriginal mtDNA polymorphisms drawn in the figure of Adcock et al., 2001 (above) are part of a mixture of the closely related haplogroups N (~P?) and M (Q?) that up to now define the earliest Out of Africa scenario. Together they could be closer to an extinct group of Eurasian outliers than to African branches separately. Typical East Asian loci of mtDNA also show a remarkable similarity, making the case for African branches being ancestral to haplogroups N and M less straightforward. The establishment of any “reversed tree”, however, is hampered by the apparent extinction or extreme “pruning” of what might have been an enormous Eurasian mtDNA variability. Any scenario that reverses the tree should account for this low extant Eurasian variability in comparison with Africa.

Let’s return to the assumed “neutrality” of mitochondrial DNA inheritance. High variability of the control region might suggest otherwise. One of the prerequisites of fast evolution is a fast mechanism underneath genetic change, and the purpose of fast mtDNA mutations could be just that: to set the precondition for rapid evolutionary change. Anyhow, a similar observation was made concerning the massive STRs of chimps on the Y-chromosome, which seem to be secondary to the incredible evolutionary changes on the Y-chromosome observed in the recent study of Hughes et al. (2010) that I already wrote about.

A set of interesting differences in mtDNA between humans is located in the Hypervariable Region (HVR). Most strikingly, the HVR is not highly variable by definition. For instance, investigations on the Ayu fish (Takeshima et al., 2005) revealed that the Hypervariable region may also turn into a Hypovariable region, which suggests a special functionality of the property defining HVR (or general D-loop) variability, and a substantial susceptibility to damage.

The mitochondria continuously reproduce themselves at intervals averaging about 2 weeks, like bacteria, by a process of binary fission. They generate most of the cell’s (chemical) energy supply and because mitochondria use oxygen as an electron acceptor, they produce harmful free radicals that may cause genetic damage, often deletion mutations. This free radical damage to mtDNA is repaired poorly at best, basically because much of the regular repair machinery of the cell doesn’t reach the mitochondria and the mitochondrion’s own repair capacity is limited. Therefore, mitochondria accumulate damage at each mitochondrial generation, which gradually leads to malfunction and ultimately affects the health of the organism as a whole.

However, this dreary scenario must have some constraints, or else all life on earth would already have ended millions of years ago. Somehow the reproductive system must have been exempted from this process, at least under certain circumstances, and it also seems the genetic damage to mitochondria can be slowed down by exercise, both physical and mental, but especially by consuming antioxidants like vitamin C or omega-3 fatty acids. These are abundant in fresh fruit, raw meat and fish, indispensable supplements for the species that lost the functionality of the L-gulonolactone oxidase (GULO) gene – among them one of the two major primate suborders, the Anthropoidea (Haplorrhini), which happens to include human beings, together with tarsiers, monkeys and apes. Originally meant as a genetic “improvement” for getting rid of the old and weak when food shortages occurred, i.e. those most badly in need of antioxidants to remain healthy, the loss of this gene also effectively confined this suborder of primates to subsistence in the tropics. Only humans succeeded in finding new habitats in colder climates. They left the hot places where fruits were available all year round and traditionally made up an important addition to the menu, because they could. Only humans evolved into great hunters, and developed the necessary skills to catch fish, in order to compensate for the irregular availability of fresh fruits. Notwithstanding unfavorable climates, they managed to keep their necessary supply of antioxidants at a safe level. And they did, for hundreds of thousands of years. Until everything changed on the eve of the Upper Paleolithic – when human cultural advance reached a critical level.

What went amiss when humans reached their first cultural highlights? Their success triggered important improvements in their living standards that moved their prime focus away from the concerns of harsh survival and towards the community around the fire. They spent more time preparing their meals, started to cook their meat and fish and thus destroyed their main antioxidant food supplies. Degenerative diseases made their appearance and invoked new selective pressures that caused a steady gene flow from the south to rejuvenate the slowly degenerating mitochondrial lineages in the north. In the meantime females ceased to worry about the survival of the fittest and developed a preference for “feminine looking men over their more rugged counterparts” (DeBruine et al., 2010), triggering the most notorious changes in the human anatomy that resulted in Anatomically Modern Humans as a progressive tendency all over the world.

However, this does not fully explain the current low overall variability of mtDNA, even in fruit-rich tropical territories, in comparison to the attested mtDNA of early AMH such as Lake Mungo 3. Still, natural selection related to cultural level might be a good trail to follow.

Booming AMH culture most probably also entailed closer contact between different groups within wider economic areas. These new behavioral patterns would surely have initiated a catastrophic increase in contagious diseases as soon as viruses and bacteria could circulate freely among newly interconnected communities. However, this also implies a strong relation between resistance against (new) infections and mtDNA that vastly exceeds the benign effects of Vitamin C. The relation between mtDNA, antioxidants and the development of new “genetic” cures may reach a lot further. At this point it is tempting to return to the behaviour of mtDNA and its facility to travel to nuclear DNA, and evaluate the genetic potential of mtDNA as a genetic laboratory against new diseases. Indeed, the immune system is where human DNA might have evolved most and where most human variability occurs.

Despite the high homology between chimpanzee and human genes at the level of amino acid sequences, human genome contains 1418 genes that do not have direct orthologues in chimpanzee, many of which are related to immune defence.
A number of genome-wide scans for positive selection have recently been performed (Wagner, 2007). They confirm that many immune genes and their regulatory sequences have been the subjects of positive selection in humans.
Population genomics is still in its infancy and the specific predictions may vary among studies but this is where future discoveries are anticipated. (Danilova, 2008)

Then, survival of just one little branch of early human mtDNA must point directly to the main focus of Upper Paleolithic development. Of the early mtDNA strands only those that accumulated in Africa were safeguarded against the effects of progressive damage, due to the continuous availability of antioxidants. But in the center of change the preconditions for rapid change were set, including the extinction of mtDNA that did not meet the new standards of natural selection against the inevitable pandemics of cultural cohabitation and coexistence. Relatively low population density prevented the accumulation of high haplotype diversity, and the surviving mtDNA haplogroups in Eurasia obliterated all traces of a long, rich and diverse hominin history. The effect was that the false positives of mtDNA lured public opinion into thinking that a long list of pre-AMH hominins, including famous names like Neanderthal, Peking Man, Rhodesian Man, Denisova hominin etc., became extinct.

We can’t solve the origin question with a narrow scope, since the only truth is that we still don’t know. However, the Denisova hominin shows us one important clue: the more we know, the more complicated the solution. And most probably, the more hominins involved.


  • Krause et al. – The complete mitochondrial DNA genome of an unknown hominin from southern Siberia, 2010
  • Krause et al. – A complete mtDNA genome of an early modern human from Kostenki, Russia, 2010
  • Derevianko et al. – A Paleolithic Bracelet from Denisova Cave, 2008
  • Howell et al. – Molecular clock debate: Time dependency of molecular rate estimates for mtDNA: this is not the time for wishful thinking, 2008
  • Adcock et al. – Mitochondrial DNA sequences in ancient Australians: Implications for modern human origins, 2001
  • Ovchinnikov et al. – Molecular analysis of Neanderthal DNA from the northern Caucasus, 2000
  • Orlando et al. – Revisiting Neandertal diversity with a 100,000 year old mtDNA sequence, 2006
  • Green et al. – A Complete Neandertal Mitochondrial Genome Sequence Determined by High-Throughput Sequencing, 2008
  • Takeshima et al. – Unexpected Ceiling of Genetic Differentiation in the Control Region of the Mitochondrial DNA between Different Subspecies of the Ayu Plecoglossus altivelis, 2005
  • Sankar Subramanian – Temporal Trails of Natural Selection in Human Mitogenomes, 2009
  • Subramanian et al. – High mitogenomic evolutionary rates and time dependency, 2009
  • Zischler et al. – A nuclear ‘fossil’ of the mitochondrial D-loop and the origin of modern humans, 1995
  • Zischler et al. – A Hominoid-Specific Nuclear Insertion of the Mitochondrial D-Loop: Implications for Reconstructing Ancestral Mitochondrial Sequences, 1998
  • DeBruine et al. – The health of a nation predicts their mate preferences: cross-cultural variation in women’s preferences for masculinized male faces, 2010
  • Nadia Danilova – Evolution of the Human Immune System, 2008
  • Allard et al. – Control region sequences for East Asian individuals in the Scientific Working Group on DNA Analysis Methods forensic mtDNA data set, 2004
  • Bowler et al. – New ages for human occupation and climatic change at Lake Mungo, Australia, 2003
  • – Global human mtDNA phylogenetic tree, 2010

The Siberian Brotherhood of Mankind

March 18, 2010

Buffalo Bill meets his Indian brothers.

Once upon a time there were two brothers. Well, actually there were a lot more of them, but these two were very special: they were destined to become the most prolific progenitors of the world. Their offspring divided in two. Those that went east to cross the Bering Strait, to become the most prominent inhabitants of the New World as far as Tierra del Fuego, were the descendants of one brother, while the numerous descendants of the other brother that remained on the hither side caused a shock-wave in the Old World that could be felt as far as West Africa, the Atlantic and India. Like in a fantasy tale, both peoples founded incredible civilizations, each a highlight of human talent and splendor. Their feats were almost unequaled and rivaled each other, even though they were separated by two mighty oceans. Each completely forgot about the existence of the other, until the people on the hither side learned how to tame the waves and crossed the oceans. Their final reunion should have been a reason for joy and celebration but when they met again, 500 years ago, oh pitiful wretches! Moans of distress filled the air and blood soaked the earth. The once great people of the New World never recovered from the blow of what could probably qualify as the most tragic example of fratricide in history.

Human Y-DNA genetic tree. The yellow square indicates the relationship between the Eurasian R-clade (the world's most prolific clade), the Q-clade (that dominates the Americas), and the ancestral P-clade.

In the genetic tree these two peoples are nowadays known as members of the R-clade and the Q-clade. The offspring of the multitudes of other brothers and close patrilineal relatives that couldn’t match their success dwindled, but some of them survived in fringe areas on both sides of the oceans. For convenience’s sake these people are considered members of the same ancestral clade as their fathers, the P-clade. Geneticists would easily recognize the Y-DNA markers of each clade, and say that their respective haplogroups were R, Q or ancestral P. Together they form a superclade on a single branch of the genetic tree. They make up a substantial part of the human population, but still they represent nothing but a single branch that on a Paleolithic scale can’t be considered very ancient.

Global approximation of the pre-colonial distribution of the P-clade, which includes the R-clade (red, west) and the Q-clade (yellow, east and Americas). Note that there is a substantial overlap in Central Asia.

Where did those very successful P-derived haplogroups originate? The story of the two brothers may be an undue simplification, since the full development of each clade involved nomads that probably roamed wide swaths of territory over many generations. Every single corner of the Old World carries an accumulated prehistory of up to a million years, which tends to render the information needed to reconstruct the relevant demographic events almost intangible. Their legacy doesn’t include a unified language, nor even overall genetic uniformity. Their genetic “kind” remains essentially restricted to the male lineage, which never became numerous in most of Africa and remained virtually absent in Oceania and the Far East. The Near East traditionally harbored a rich variety of their male genes in all stages of development, but so far nothing points to a historically predominant position in the region in comparison to other clades that include notorious “stayers”, e.g. those defined by haplogroup markers like Hg J and Hg E. The current distribution hardly gives any clue about the origin, especially since more often than not the trail leads to desperately isolated and remote areas, or otherwise to historically cosmopolitan areas that received too many foreign visitors to reveal unambiguous answers. The only certainty we have is that on a genetic scale those three clades were exceptionally closely related along the male lineage, and thus that the incredible success of two P-derived branches may point to something systematic underneath that also applies to the virgin soils of the Americas.

In stark contrast to the abysmal disagreements that surround our knowledge about the peopling of the Old World, the disagreements that concern the Americas are almost negligible. The Q-clade entered through the Bering Strait in one or two waves, and its members were the first human beings to do so. They are theorized to originate in southern/central Siberia:

Globally, Y-chromosome data therefore emphasize the critical role of southern/central Siberia in the peopling of the Americas, since this region appears to be at the origin of two major male migratory waves of colonization. The data presented here are also consistent with the intriguing possibility of ancient links between proto-Europeans and proto–Native Americans, an idea that has been put forward in previous Y-chromosome studies (Karafet et al. 1999; Santos et al. 1999; Wells et al. 2001; Lell et al. 2002). An ancestral connection between these groups has also been suggested on the basis of morphological (Brace et al. 2001) and mtDNA (Brown et al. 1998) data and could ultimately trace back to ancient east/west human dispersals from a common source in central Asia. In agreement with this scenario, the P-M45 lineage has been found to be oldest in central Asia (Wells et al. 2001; Zerjal et al. 2002), where the Tuvan population includes haplogroups M242, M45, M173, and Tat, which are now dispersed in Europe and/or America. (Bortolini et al., 2003)

More recent migrational events, involving new people that never became equally successful in entering America, caused their kind to become rare in East Siberia, thus cutting the umbilical cord that once must have interconnected the great P-derived clades on both sides of the Bering Strait. We can’t easily derive from the facts the precise events that would have linked the clades mentioned above with proto-Europeans, but the link to proto-Native Americans is virtually straightforward and unequivocal. This certainty may serve as a sound base to analyze and verify the processes involved in deeper detail. Let us depart from the general agreement, which concerns an origin in Central Asia and widespread habitation of the Americas during the end of the last glacial period, known as the late glacial maximum, around 16,000 to 13,000 years before present. How should this affect our assumptions that link medium-term migrational processes to the origin of languages and genetic variability, and how should the corresponding observations contribute to our understanding of what happened elsewhere, i.e. in the Old World, including Europe?

First of all, let us consider the pre-Columbian era. Native Americans feature a seemingly disproportionate number of linguistic groups and isolates, which at first sight hardly agrees with the homogeneous origin implied by one or two migrational events from the same direction. Only the Na-Dené languages of North America, which include Athabaskan languages like Apache, seem to find a distant relative in the virtually extinct Ket language of Central Asia – whose speakers indeed feature the Q-haplogroup in remarkably high proportions. The languages are tentatively grouped together in the Dené-Yeniseian language family. Possibly the proto-Native Americans migrated wholesale to the Americas and didn’t leave much of a trace behind in Asia, or the descendants of their relatives that once spoke related languages in Asia became overwhelmed by groups that entered Siberia later from the southeast, from the south or from the west across the Ural Mountains. Both scenarios may apply, especially considering the advance as far as the Pacific coast of East Siberia of additional haplogroups that remain strikingly rare or absent in the Americas, like Hg N, several ancestral Hg C varieties and even Hg D. We don’t know about American Hg R, since up to now the ample presence of this haplogroup among virtually all Amerindian groups has been interpreted as recent European admixture. As a side note, claims of Amerindian Hg R are indeed made and may find some confirmation in STR mutations on the Y-chromosome whose statistical properties deviate significantly from European Hg R (e.g. less than 10% DYS390=23 among Native American Hg R individuals, which plainly contradicts a West European, and especially a Northwest European, origin). However, the general picture is that the migrational events that are responsible for the Amerindian presence were dominated by a small group of Siberian people whose males belonged especially to the Q-clade. Whatever the genetic and linguistic variety of Amerindian people nowadays, somehow it must relate to this potentially quite homogeneous people that arrived from the Arctic.

This purported ancestral homogeneity, barely 15,000 years ago, is confirmed by low intra-population variety in South America, but how can this be reconciled with the maximum genetic distance with respect to the rest of the world?

(2) Consistently with previous studies, South Amerindians and Oceanian populations show the lowest intra-population diversity when compared with autochthonous populations from other continents. (3) Within South America, Western South Amerindian populations show the highest intra-population diversity, consistently with an evolutionary model previously proposed by us, which suggest a higher long term effective population size and levels of gene flow among them. (4) Comparing the different continents, the between-population diversity is clearly the highest for South Amerindians (Fst = 0.19). This value is almost twice the level of differentiation observed for the worldwide population (Fst = 0.11). (E. Tarazona-Santos, 2009)

The human world population attests to the overwhelming importance of geneflow: Fst increases proportionally with distance rather than anything else. South America is at the end of the line.

Fst (the fixation index) measures how strongly populations are genetically differentiated from one another, but whatever this may imply, this extreme South American value can’t be due to mere ancientness of the population. On a worldwide scale, the globally linear increase of Fst with geographic distance doesn’t confirm Heyerdahl-style immigration from weird directions and rather indicates that other, more sophisticated processes are at work. The geographic position at the end of all migration lines, even more so if the Bering Strait was indeed traversed, and a corresponding distance-driven isolation against the effects of geneflow may offer a more plausible explanation.
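To make the Fst values quoted above a little more tangible, here is a minimal sketch of Wright's classic definition, Fst = (H_T - H_S) / H_T, for a single locus with two alleles. It is not the estimator used by Tarazona-Santos et al., whose method isn't described here; the function names, the equal weighting of subpopulations and the allele frequencies in the example are all assumptions made for illustration.

```python
from typing import Sequence


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs: Sequence[float]) -> float:
    """Wright's Fst = (H_T - H_S) / H_T.

    subpop_freqs holds the frequency of one allele in each subpopulation.
    Assumption: all subpopulations are weighted equally.
    """
    if not subpop_freqs:
        raise ValueError("need at least one subpopulation")
    # H_T: heterozygosity expected if all subpopulations were one mixed pool.
    p_bar = sum(subpop_freqs) / len(subpop_freqs)
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # allele fixed or absent everywhere: nothing to partition
    # H_S: mean heterozygosity expected within the subpopulations.
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / len(subpop_freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, not taken from any study:
well_mixed = fst([0.48, 0.50, 0.52])   # neighbours that exchange many migrants
isolated = fst([0.10, 0.50, 0.90])     # populations at the end of the line
```

In this toy setting the isolated populations give a far higher Fst than the well-mixed ones although no population is "older" than another, which is the sense in which a high value points to isolation from geneflow rather than to age.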

Could we hypothesize similar causes behind the incredible variety of Amerindian languages? Simple language drift may be responsible. This mechanism defines language as vocabulary cast into the mold of a particular syntax, subject to change. Over time the basic structure of the sentence, held together by functional items (~grammar), with the lexical items (~words) filling in the blanks, will thus be deeply modified, together with the “outer appearance” of a particular language, and affect grammar in all its (morphological and syntactic) aspects. The process may be gradual, the product of chain reactions or subject to cyclic drift, and causes the development of completely different linguistic characteristics of originally related languages that can’t be explained by the mere development in isolation of vocabulary.

In the Americas – including the southernmost parts – we are thus confronted with a native population that exceeds the arctic origin location in linguistic and genetic diversity. This is completely contradictory to the common wisdom that homogeneous populations are younger and derived from populations that feature more diversity! So let’s return to the Old World and investigate how this insight might shed new light on the origin of the closely related R-clade.

The descendants of that other brother, the progenitor of the R-clade, are difficult to define. If we choose to ignore the possibility that some might have followed their kin of the Q-clade to the Americas through Siberia, we can observe that the R-clade essentially has a more western and southern distribution. The clade is subdivided into several groups that rarely escaped the attention of those that sought an Indo-European association. An unambiguous exception to the Indo-European bias was provided by the West African subclade defined by mutation R1b-V88, which instead was related to the Afroasiatic linguistic group. More specifically, the Chadic subgroup:

The analysis of the distribution of the R-V88 haplogroup in >1800 males from 69 African populations revealed a striking genetic contiguity between the Chadic-speaking peoples from the central Sahel and several other Afroasiatic-speaking groups from North Africa. The R-V88 coalescence time was estimated at 9200–5600 kya, in the early mid Holocene. We suggest that R-V88 is a paternal genetic record of the proposed mid-Holocene migration of proto-Chadic Afroasiatic speakers through the Central Sahara into the Lake Chad Basin, and geomorphological evidence is consistent with this view. (Cruciani et al., 2010)

Since this subgroup of the R-clade doesn’t derive from the subclades that are most frequently indicated as “Indo-European”, the simple conclusion should be that the R-clade is much older than the Indo-European language family and hence that the success of the Hg R-clade can’t be explained by the Indo-European advance, that indeed has been dated a lot later.

The origin of another important subgroup was recently associated with the Neolithic wave of advance, that is theorized to have started in Turkey:

Does the time to the most recent common ancestor (TMRCA) of the hgR1b1b2 chromosomes support a Paleolithic origin? Mean estimates for individual populations vary (Table 2), but the oldest value is in Central Turkey (7,989 y [95% confidence interval (CI): 5,661–11,014]), and the youngest in Cornwall (5,460 y [3,764–7,777]). The mean estimate for the entire dataset is 6,512 y (95% CI: 4,577–9,063 years), with a growth rate of 1.95% (1.02%–3.30%). Thus, we see clear evidence of rapid expansion, which cannot have begun before the Neolithic period. (Balaresque et al., 2010)

The debate on the origin of virtually all subclades of haplogroup R was repeatedly derailed by the discovery of ancestral samples at disparate locations. Especially the Middle East proved itself a valuable source, to the effect that previous assertions that pointed to Central Asia lost much of their attraction. However, drawing our lessons from the Amerindian situation we could wonder if this change of mindset is justified. A Russian publication (excerpts translated by Google) on the genetic variety of the Bashkir people, just south of the Ural Mountains, will possibly make the difference. This study confirms that Bashkir R1b1b2 (also written as R1b-M269), otherwise especially associated with the Neolithic and Europe, might be very ancient in the region, notwithstanding the exceptionally low diversity.

Bashkir R1b1b2 has an intermediate position between European and Asiatic populations. The largest cluster (γ) occurs equally in all referenced regions (Europe, Balkans, Asia and South Urals), while the β cluster haplotypes correspond predominantly to the populations of Europe, and the majority of haplotypes in cluster α are from South and West Asia.

Some researchers believe that haplogroup R-M269 was spread throughout Western Eurasia in the Upper Paleolithic [Semino et al 2000; Al-Zahery et al. 2003]. The high frequency of the paternal line in populations of the Southern Urals is an unexpected finding, as this haplogroup is not typical for any of the adjacent areas (Central Asia, Eastern Europe and Siberia). In order to clarify the origin of this haplogroup in the southern Urals, we analyzed the phylogenetic relationships between microsatellite haplotypes in populations of Bashkir and West Asia, South Asia and Balkans, Europe (Fig. 4). As a result, phylogenetic analysis identified three clusters of microsatellite haplotypes, designated as α, β and γ (Table 6, Fig. 4). It was evident that the bulk of the β cluster haplotypes corresponds to the populations of Europe (50 out of 70 haplotypes in cluster β), while the majority of haplotypes in the cluster α (65 of 79 haplotypes) are from South and West Asia.

It should be noted that the largest cluster of the phylogenetic tree, cluster γ haplotypes, occurs equally in all regions (Europe, Balkans, Asia and South Urals). Over 70% of the haplotypes among the Bashkir belong to this cluster, which apparently is the result of the earliest stage (Upper Paleolithic) resettlements of mutation M269 carriers, that cover a larger area.

The inevitable conclusion:

Relatively low population density, and hence the low effective population size of this region compared to the densely populated regions of West Asia, Europe, the Balkans and South Asia, prevented the accumulation of high haplotype diversity.

This approach is essentially different from deducing the origin of a subclade – or clade – by only taking into consideration the diversity of surviving haplotypes. Naturally, under the benign living conditions of old cultural areas a high variance can be “accumulated”, as “fossil” haplotypes are “collected” over a long period of time. In the harsh climate of Siberia, however, the living conditions are completely different, and so are the survival conditions and the subsequent “pruning” of haplotypes. The alternative approach for retrieving age should thus consider diversity in the widest sense, i.e. that of remnant clusters that can be compared with extant clusters on a worldwide scale. I gather the “collector” function of e.g. the Middle East cultural area for “old” R-clade haplotypes has a biological analogy in the jungle, where you can find an enormous variety of species, including “living fossils”. I don’t think this should be explained by a theory saying that all living creatures evolved in the tropics. Variance and diversity of Y-DNA genes thus shouldn’t be taken as an unequivocal indication of origin.
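The “diversity of surviving haplotypes” that this argument warns against is usually summarized by a single number. As a rough sketch, the following computes Nei's haplotype diversity, H = n/(n-1) · (1 - Σ p_i²), from a list of sampled haplotypes. The Bashkir study may well have used a different measure, and the two samples in the example are invented purely for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def haplotype_diversity(haplotypes: Sequence[Hashable]) -> float:
    """Nei's haplotype diversity H = n/(n-1) * (1 - sum(p_i^2)).

    Each item is one sampled haplotype (e.g. a tuple of STR repeat counts);
    p_i is the sample frequency of the i-th distinct haplotype.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical STR haplotypes (repeat counts at three markers), not real data.
# A sparsely populated region where "pruning" left one dominant haplotype:
pruned = [(24, 11, 13)] * 8 + [(23, 11, 13)] * 2
# A densely populated "collector" region that retained many old haplotypes:
collector = [(24, 11, 13), (23, 11, 13), (24, 10, 13), (24, 11, 14),
             (25, 11, 13), (23, 10, 13), (24, 12, 13), (22, 11, 13),
             (24, 11, 12), (23, 11, 14)]

h_pruned = haplotype_diversity(pruned)
h_collector = haplotype_diversity(collector)
```

The statistic only describes what survives in the sample: the “collector” sample scores much higher, but nothing in the formula says where either set of haplotypes first arose.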

Bashkir R also includes the R1b1b1 subclade, whose Central Asiatic antecedents are rarely disputed, and R1a (R – SRY10831.2). Accordingly, the study concludes:

Since there is no trace of mass migrations from Europe and western Asia, the reasons for the predominance of the main haplogroups (R – SRY10831.2 and R-M269) in the southern Urals are to be found in the processes of early settlement of this region.

Possible Holocene migrational nodes of the R1b-clade, assuming a correlation with the southern Urals and mtDNA U5b.

The question of how the R-clade became involved in the Neolithic expansion and how it also reached Africa will be the sequel of this issue. At the moment it will suffice to point at a possible relation of the Bashkir homelands with the archeological Botai culture, the most likely place where the horse was domesticated and where the first consumption of (mare’s) milk was attested. I already mentioned the potential association between the R-clade and the T-13910 allele for lactase persistence (including West Africa), which might be especially interesting because Enattah et al. (2007) already located the origin of this allele in the neighborhood of the Ural Mountains. Consequently, mitochondrial haplogroup mtDNA-U5b and the autosomal gene associated with blood type B may serve as valuable markers to locate Holocene migrations to West Africa, while the Neolithic wave of advance as a medium for the expansion of the R1b-clade into Europe – due to the low occurrence of such migrational markers on the Neolithic route – could thus be confirmed as essentially characterized by geneflow.

How these processes might also have affected the spread of R1a and R2, the other members of the R-clade, is yet another chapter, being most probably intertwined with that Indo-European question. The issue of the Two Brothers, however, should be considered without linguistic or cultural implications. The issue of this Siberian Brotherhood is far more important, since the success of the arctic brothers can’t be a coincidence. For sure this issue touches on the evolutionary pressures that are especially in force under the conditions of a harsh environment. Hence this issue directly touches our existence. We would never have existed without it. Everybody already received all the benefits, since powerful geneflow around the globe continuously forges our species into a single, indivisible brotherhood of mankind.


  • Lobov Artem – Structure of the Gene Pool of Bashkir Subpopulations, 2009, Russian link, Translated excerpt
  • Bolnick et al. – Asymmetric Male and Female Genetic Histories among Native Americans from Eastern North America, 2006, link
  • Lell et al. – The Dual Origin and Siberian Affinities of Native American Y Chromosomes, 2002, link
  • Zegura et al. – High-Resolution SNPs and Microsatellite Haplotypes Point to a Single, Recent Entry of Native American Y Chromosomes into the Americas, 2004, link
  • Bortolini et al. – Y-Chromosome Evidence for Differing Ancient Demographic Histories in the Americas, 2003, link
  • E. Tarazona-Santos et al. – The genetic structure of Native Americans: inferences from SNPs in genes involved in carcinogenesis, immunity and pharmacogenetics, 2009, link
  • Balaresque et al. – A Predominantly Neolithic Origin for European Paternal Lineages, 2010, link
  • Cruciani et al. – Human Y chromosome haplogroup R-V88: a paternal genetic record of early mid Holocene trans-Saharan connections and the spread of Chadic languages, 2010, link (paysite)
  • Enattah et al. – Evidence of Still-Ongoing Convergence Evolution of the Lactase Persistence T-13910 Alleles in Humans, 2007, link

Further reading:

  • Ted Goebel et al. – The Late Pleistocene Dispersal of Modern Humans in the Americas, 2008, link